I’m going to demonstrate how it is possible to use dblink in Greenplum 18.104.22.168
dblink is a PostgreSQL contrib module that allows to execute queries on another database.
Current PostgreSQL’s architeture requires users to connect to a specific database on a server. Therefore, it is not possible to natively perform an SQL query on a different database.
Normally we discourage using
dblink to query remote databases.
Databases are self-contained objects in PostgreSQL, and they should be designed with that in mind.
We always advice users to work with schemas. But sometimes, this might not be enough.
Anyway, I’m writing this article to show you how PostgreSQL and Greenplum are related.
You need a working copy of Greenplum 4.0.4 and a directory containing the PostgreSQL 8.2 source code, I’ve tested it on a Linux operating system (CentOS 5 to be exact).
For this example, I assume that Greenplum is installed in
/usr/local/greenplum-db-22.214.171.124 and PostgreSQL source directory is
I’m using PostgreSQL 8.2 because Greenplum is based on fork on that version.
You can obtain PostgreSQL 8.2 source from
http://www.postgresql.org/ftp/source/v8.2.22/ or directly from git, please refer to
http://wiki.postgresql.org/wiki/Working\_with\_git for instruction on how to use git with PostgreSQL.
Installing dblink is as easy as installing a contrib module in PostgreSQL.
gpadmin user, source the
$ source /usr/local/greenplum-db-126.96.36.199/greenplum_path.sh
You should have this line in your .bashrc file after the installation process.
* Check that
pg_config --pgxs is the Greenplum one:
$ pg_config --pgxs /usr/local/greenplum-db-188.8.131.52/lib/postgresql/pgxs/src/makefiles/pgxs.mk
* Enter the dblink directory:
$ cd /home/gpadmin/postgresql-8.2.22/contrib/dblink/
* Edit the
Makefile and add
PG_CPPFLAGS, so that line number four looks like this:
PG_CPPFLAGS = -I$(libpq_srcdir) -w
This option tells
gcc to ignore all warnings (that’s a _dirty_ hack but mandatory).
* Install the module directly in a database:
$ createdb my_db $ make USE_PGXS=1 install $ psql -f dblink.sql my_database
At this time,
dblink, so it is able to execute queries in remote databases.
It is now time to test it, executing some queries.
* Create a database to query, with an example table:
$ createdb my_other_db $ psql -c "CREATE TABLE t AS SELECT generate_series(1,1000) AS v" my_other_db
* Create the dblink connection to that database:
$ psql my_db my_db=# SELECT dblink_connect('myconn', 'dbname=my_other_db'); dblink_connect ------------------- OK (1 row)
* Execute queries using that connection:
$ psql my_db my_db=# SELECT * FROM dblink( 'myconn', 'SELECT v FROM t') AS t1 ( v INTEGER ) LIMIT 5;
Please note that you must specify the expected set of columns in the calling query, because
dblink returns a set of
record (to be generic).
dblink would not be my preferred choice (also because it is not directly supported by Greenplum – or for Greenplum), this article shows you a simple method to use it in those cases where you can’t really do without it.