Mapreduce in Greenplum 4.1
/0 Comments/in Greenplum /by Carlo Ascani
Mapreduce is a very trendy software framework. It was introduced by Google (TM) in 2004. It is a large topic, and it is not possible to cover all of its aspects in a single blog article. This is a simple introduction to _mapreduce_ usage in Greenplum 4.1.
ETL with Kettle and Greenplum – Part Two: importing data
/0 Comments/in Greenplum /by Giulio Calacoci
In the first part of this article we created a job and a database connection, and defined the flow in Kettle. In the second part we’ll see how Kettle manages the data import from the CSV files.
Using gpmigrator in Greenplum 4.1.1
/0 Comments/in Greenplum /by Carlo Ascani
In this article, I am going to upgrade a Greenplum cluster from version 4.0 to 4.1 using `gpmigrator`. `gpmigrator` is a utility shipped with Greenplum Community Edition whose purpose is to perform a live upgrade of an existing database.
Call for papers for PGDay.IT 2011 has been extended
/0 Comments/in Gabriele's PlanetPostgreSQL, PostgreSQL /by Gabriele Bartolini
The Call for Papers for the Italian PGDay has been extended by a week. The new deadline for submitting a paper is October 23.
ETL with Kettle and Greenplum – Part one: setting up your job.
/0 Comments/in Greenplum /by Giulio Calacoci
I recently showed you how to perform a data import from a CSV file into a Greenplum database, using Talend Community Edition. In this article I’m going to perform the same task using another ETL tool, Kettle.
Using dblink in Greenplum
/1 Comment/in Greenplum /by Carlo Ascani
I’m going to demonstrate how to use dblink in Greenplum 4.0.4.0.
Early bird registrations open for PGDay.IT 2011
/0 Comments/in Gabriele's PlanetPostgreSQL, PostgreSQL /by Gabriele Bartolini
The Italian PGDay 2011 will take place in Prato, on Friday November 25th, at the Monash University Prato Centre. Exactly where it all started. The event, organised by the Italian PostgreSQL Users Group, will be a great chance for both Italian and European members of the PostgreSQL community to gather together and to promote PostgreSQL.
ETL with Talend and Greenplum – Part two: data import
/0 Comments/in Greenplum /by Giulio Calacoci
In the first part of this tutorial, we set up all the connections required for creating the job; now we can proceed with the data import. Let’s drag and drop an object named tMap into the visual editor. You can find it on the left, in the instruments palette, inside the “elaboration” folder.
Using PL/Java in Greenplum
/1 Comment/in Greenplum /by Carlo Ascani
In this article we are going to show you how to write PL/Java functions in Greenplum. I assume that you have a working Greenplum (or Greenplum Community Edition) at your disposal. In this example we will use version **4.0.4**, installed in /usr/local/greenplum-db-4.0.4.0 (which is the default location).
ETL with Talend and Greenplum – Part one: connections
/2 Comments/in Greenplum /by Giulio Calacoci
When working with databases, one of the most common tasks is to load data from one or more CSV files. Several tools are available to achieve this task. Some are executed via command line, like COPY (using psql), some are more complex, like ETL systems. We will start today with Talend but, in the next weeks, […]