ETL with Talend and Greenplum – Part two: data import

September 19, 2011 / in Greenplum / by Giulio Calacoci

In the first part of this tutorial, we set up all the connections required for creating the job. Now we can proceed with the data import.

Let’s drag an object named tMap and drop it inside the visual editor. You can find it on the left, in the instruments palette, inside the “elaboration” folder.

Now we need to connect the “states” CSV object to the tMap element (right-click on the CSV element -> rows -> main), then connect the tMap element to the destination table (right-click -> rows -> new output). Once the three elements are connected, we need to open the tMap object in order to edit the field associations.
By dragging fields from the table on the left to the one on the right, it is possible to associate every field of the CSV with its destination column in the target database table. Clicking “Ok” saves the field associations, and the data from the CSV file is ready to be imported.
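Conceptually, the association defined in tMap is a rename from source fields to destination columns. The following is a minimal Python sketch of that idea, not what Talend itself generates. The field and column names (`id`, `name`, `idState`, `stateName`) and the sample data are hypothetical, since the real ones come from the files set up in part one.

```python
import csv
import io

# Hypothetical association: CSV field name -> destination column name.
# This plays the role of the links drawn between the two tables in tMap.
FIELD_MAP = {"id": "idState", "name": "stateName"}

def map_rows(csv_text, field_map):
    """Yield one destination row (dict) per CSV row, renaming fields per field_map."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield {dest: row[src] for src, dest in field_map.items()}

# Hypothetical sample input standing in for the "states" CSV file.
sample = "id,name\n1,Italy\n2,France\n"
mapped = list(map_rows(sample, FIELD_MAP))
```

Each element of `mapped` is a dictionary keyed by destination column, which is roughly the shape of the rows that the output table component receives.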
Now it is time to add the import for the users table to the job.
It’s important to remember that the states table has a one-to-many relationship with the users table (one state can have many users, one user can have only one state), so during the import we need to perform a lookup on the states table to maintain the relationship between the two tables. By “lookup” we mean searching for a value inside a dictionary, using a key. The goal is to retrieve an ID (usually the primary key of an object in the database) to be stored on the “many” side of the relationship, in order to maintain referential integrity.
Add a tMap object between the CSV file containing the users and the destination table. Also, from the list of the tables in the database, drag a “states” table object into the visual editor, using the tGreenplumInput type.
Connect the three elements to the tMap element, as before.
Now it’s time to map the elements. This time the idState field from the CSV will be mapped to the idState column of the “states” table (called row3 in the example image), and that column will in turn be mapped to the “users” table. As a result, every row of the CSV file triggers a check against the states table, and the correct ID is assigned to the destination row in the “users” table.
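The lookup described above can be sketched in plain Python as a dictionary indexed by the state key. This is only an illustration under stated assumptions: the column names and sample rows are hypothetical, and the way unmatched rows are collected in a `rejected` list is a choice made for this sketch. In Talend, what happens to rows without a match depends on how the join is configured in tMap.

```python
def build_lookup(states, key):
    """Index the states rows by the given key column."""
    return {row[key]: row for row in states}

def resolve_users(users, states_by_id):
    """Attach the looked-up state ID to each user row.

    Rows whose idState has no match in the lookup are collected separately
    (an assumption of this sketch, not necessarily Talend's behaviour).
    """
    resolved, rejected = [], []
    for user in users:
        state = states_by_id.get(user["idState"])
        if state is None:
            rejected.append(user)
        else:
            resolved.append({"name": user["name"], "idState": state["idState"]})
    return resolved, rejected

# Hypothetical sample data standing in for the states table and the users CSV.
states = [{"idState": 1, "stateName": "Italy"}, {"idState": 2, "stateName": "France"}]
users = [{"name": "anna", "idState": 1}, {"name": "bob", "idState": 9}]

resolved, rejected = resolve_users(users, build_lookup(states, "idState"))
```

Building the dictionary once and probing it per row keeps the check cheap for each user, which mirrors the role of the states input as a lookup flow rather than a main flow.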
As a final step, we need to add a conditional link between the two subjobs (the states import and the users import, lookup included). This is because the second subjob can succeed only if the first one has completely imported the data into the states table.
To achieve this, right-click on the output table “states”, select the “Trigger” option, and then “onComponentOk”. Link the “states” component to the users input CSV file.
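The effect of the onComponentOk trigger can also be sketched outside Talend, in plain Python: the second step runs only when the first one completes without error. The function names below are hypothetical, and the sketch models the ordering only, not Talend's actual execution engine.

```python
def run_job(import_states, import_users):
    """Run the users subjob only if the states subjob finishes without raising."""
    try:
        import_states()
    except Exception as exc:
        # The states import failed, so the users import is never started.
        return f"states import failed: {exc}"
    import_users()
    return "ok"

# Successful case: both steps run, in order.
log = []
result_ok = run_job(lambda: log.append("states"), lambda: log.append("users"))

# Failing case: the states step raises, so the users step is skipped.
def failing_states():
    raise RuntimeError("load error")

skipped = []
result_fail = run_job(failing_states, lambda: skipped.append("users"))
```

In the failing case `skipped` stays empty, which is the guarantee the conditional link is meant to provide: no user row is loaded against a states table that was not fully imported.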
The job is ready to be executed. Simply find the “Run” tab in the lower part of the screen and click on the run button. The import operation will be executed and the data will be imported.
Even though the operations shown in this two-part article are quite simple, you can repeat them for all the tables (dimensions and facts) of your data warehouse.
For more information, do not hesitate to contact us. In the coming weeks you will see more articles about Greenplum and ETL tools on our blog.
Tags: etl, greenplum, talend open studio