Performing parallel ETL with Greenplum’s gpfdist and external tables

One of the coolest features that Greenplum offers to data warehousing and business intelligence operators, as far as ETL is concerned, is the combination of read-only external tables with gpfdist, Greenplum’s parallel file distribution server. The typical use case for this solution is parallel data loading of text files (coming from heterogeneous sources – […]

How to install fuzzystrmatch on Greenplum Community Edition

Some members of the Greenplum Community Forum have been asking how to install fuzzystrmatch, a very useful “contrib” module available for PostgreSQL, on Greenplum CE.

Installing Greenplum Single Node Edition on Ubuntu 10.04 (Lucid)

Officially, Greenplum Database Single Node Edition (SNE) can only be installed on Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES), but while surfing the web I have seen many requests for instructions on installing it on Debian/Ubuntu. Here I try to give you some advice.

Installing PostGIS on Greenplum Single Node Edition

One of the main reasons users switch from other relational databases to PostgreSQL is the advanced support for geographic objects included in the PostGIS extension. As PostgreSQL specialists at 2ndQuadrant, we investigated whether (and how) PostGIS can be installed on Greenplum Single Node Edition. Let’s see how Marco Nenciarini, […]

Installing Greenplum Single Node Edition on Amazon’s EC2

I have been thinking for a while now about adding Greenplum support to htMiner, an open-source web analytics application that I wrote a few years ago and that uses PostgreSQL. To do this, I need a multi-CPU environment. While still waiting to get our new servers installed here in our […]