Performing ETL using Kettle with GPFDIST and GPLOAD

Scenario: we have a remote data source served by a gpfdist server. We need to import the data into a Greenplum database and apply some ETL manipulation during the import. Kettle can accomplish this with a simple transformation in a few steps.
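
To give an idea of what a gpfdist-served source looks like from the Greenplum side, here is a minimal sketch that builds a `CREATE EXTERNAL TABLE` statement pointing at a gpfdist URL. It is not the Kettle transformation itself. The table name, columns, host, and file name are hypothetical placeholders, and 8080 is assumed as the gpfdist port.

```python
def external_table_ddl(table, columns, host, port, path, delimiter=","):
    """Build a CREATE EXTERNAL TABLE statement that reads a CSV file via gpfdist.

    All argument values are supplied by the caller; nothing here talks to a database.
    """
    cols = ",\n    ".join(f"{name} {sqltype}" for name, sqltype in columns)
    location = f"gpfdist://{host}:{port}/{path}"
    return (
        f"CREATE EXTERNAL TABLE {table} (\n    {cols}\n)\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT 'CSV' (HEADER DELIMITER '{delimiter}');"
    )


# Hypothetical example values, for illustration only.
ddl = external_table_ddl(
    "ext_sales",
    [("id", "int"), ("amount", "numeric")],
    "etl-host",
    8080,
    "sales.csv",
)
print(ddl)
```

The function only produces the SQL text, so the statement can be reviewed before it is run against a real database.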

ETL with Kettle and Greenplum – Part Two: importing data

In the first part of this article we created a job and a database connection, and defined the flow in Kettle. In this second part we’ll see how Kettle manages the data import from the CSV files.

ETL with Kettle and Greenplum – Part One: setting up your job

Recently I showed you how to import data from a CSV file into a Greenplum database using Talend Community Edition. In this article I’m going to perform the same task with another ETL tool, Kettle.