Scenario: We have a remote data source served by a gpfdist server. We need to import the data into a Greenplum database while applying some ETL manipulation during the import. With Kettle, this can be done in a few steps using a simple transformation.
One of the coolest features that Greenplum offers data warehousing and business intelligence operators, as far as ETL is concerned, is the combination of read-only external tables with gpfdist, Greenplum’s parallel file distribution server. The typical use case for this solution is parallel data loading of text files (coming from heterogeneous sources – […]