2ndQuadrant | PostgreSQL
Mission Critical Databases
Carlo Ascani

External web tables in Greenplum

April 10, 2012 / in Greenplum / by Carlo Ascani

**External web tables** are one of the most useful features when you have to load
data into a Greenplum database from different sources.


## What is an external web table?
External web tables are a special type of external table.
The keyword *web* means that they can access dynamic data and present it to you as if it were a regular database table.
Since the data may change during a single query execution involving the web table, the Greenplum planner must avoid choosing plans that require a complete re-scan of the whole table.
## How to use them
There are two different types of web tables: URL-based and OS command-based.
With the URL-based kind you can create an external web table that loads data from
files on a web server; the data is transferred over the HTTP protocol.
With the OS command mode you specify a shell script (or a command) to execute on any number of your cluster's segments. The output of the script becomes the content of the web table at the time of access. By default the command is executed in parallel on all segments, but you can restrict it, for instance to the master node only.
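As a sketch of the URL-based kind (the server URL and column name here are hypothetical), such a table could be defined like this:

```sql
-- Hypothetical example: read a CSV file published on a web server.
CREATE EXTERNAL WEB TABLE ext_http_table ( field TEXT )
LOCATION ('http://example.com/data/file.csv')
FORMAT 'CSV';
```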
In this article I have used the OS command mode.
## A practical scenario
Say we want to load PostgreSQL data into Greenplum using web tables.
First of all, we need to write a script that outputs the desired data.
To achieve that, we will use the power of the COPY command.
An example script would be:

```bash
#!/bin/bash
# Dump the "field" column of table_name as plain CSV on standard output.
# Note: no HEADER option and the default comma delimiter, so the output
# matches the FORMAT 'CSV' definition of the external table below.
COMMAND="COPY table_name(field) TO STDOUT WITH (FORMAT CSV)"
psql -U user -d database -c "$COMMAND"
```

This prints the content of the *field* column of the *table_name* table on *standard output* in CSV format. In this example we run the script on the master only. In principle, you could copy the script to all segments, each collecting data from partitioned tables in parallel.
**IMPORTANT:** the script must be executable by the gpadmin user.
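As a quick local check before wiring the script into Greenplum (the path and sample rows here are made up), you can write a stub, make it executable, and run it once to see the output the web table would serve:

```shell
#!/bin/sh
# Write a stand-in for the real script; in production its body would be
# the psql/COPY command shown above (the sample rows here are invented).
cat > /tmp/ext_script.sh <<'EOF'
#!/bin/bash
echo "alpha"
echo "beta"
EOF

# The external table's EXECUTE clause only works if the file is
# executable (by the gpadmin user on a real cluster).
chmod 755 /tmp/ext_script.sh

# Run it once to eyeball the CSV the web table would expose.
/tmp/ext_script.sh
```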
Now it is possible to create the corresponding external web table in Greenplum:

```sql
CREATE EXTERNAL WEB TABLE ext_table ( field TEXT )
EXECUTE '/path/to/script.sh'
ON MASTER
FORMAT 'CSV';
```

Now you are ready to materialise the external data in a regular database table.
For that, you can use either an `INSERT INTO table SELECT * FROM ext_table`
or a `CREATE TABLE table AS SELECT * FROM ext_table`.
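Spelled out as full statements (with a hypothetical target table name), the two options are:

```sql
-- Load the web table's current output into an existing table...
INSERT INTO local_table SELECT * FROM ext_table;

-- ...or create the target table and fill it in one step.
CREATE TABLE local_table AS SELECT * FROM ext_table;
```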
## Conclusions
As you can imagine, this is an extremely powerful feature.
You can write a script that connects to any other DBMS and dumps its data, or
that produces live data with any programming language (for instance from RSS, Atom, or XML feeds).
Feeding tables with OS scripts is one of my favourite features,
and I think the only limitation here is your imagination.

Tags: external tables, greenplum, postgres, web tables