George McGeachie

Data Modelling – It’s a lot more than just a diagram

June 22, 2018 / by George McGeachie

If the title of this blog post rings a bell with you, perhaps you were at PG Day in Horwood House in 2014, when I stood up for 5 minutes to make the case for data modelling: a data model is much more than just a diagram. I shouldn’t be, but I am often amazed by the way data models (and the tools we use to manage them) are derided as ‘just pretty pictures’ or ‘documentation’. I’m not going to repeat my lightning talk here (watch it yourself if you want to); instead, I’m going to talk about Data Vault.

Data Vault (DV) is a technique for building scalable data warehouses. Dan Linstedt describes DV as “a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses.”

The key difference between DV and other data warehouse techniques is that the persistent warehouse tables are organised into three types – Hub, Satellite and Link tables. Here’s part of a sample DV model in SAP PowerDesigner.
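For readers who have not seen the three table types before, here is a minimal sketch of how they fit together. It is not the PowerDesigner model mentioned above: every table and column name is hypothetical, and SQLite (via Python's standard library) stands in for PostgreSQL so the example runs anywhere. The usual convention is assumed: a Hub holds a business key, a Link records a relationship between Hubs, and a Satellite holds descriptive attributes versioned by load date.

```python
import sqlite3

# Hypothetical names throughout; SQLite stands in for PostgreSQL here.
DDL = """
-- Hub: one row per business key.
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,      -- surrogate (hash) key
    customer_id   TEXT NOT NULL UNIQUE,  -- business key from the source system
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_order (
    order_hk      TEXT PRIMARY KEY,
    order_id      TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
-- Link: records a relationship between hubs and nothing else.
CREATE TABLE link_customer_order (
    customer_order_hk TEXT PRIMARY KEY,
    customer_hk   TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    order_hk      TEXT NOT NULL REFERENCES hub_order (order_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
-- Satellite: descriptive attributes of a hub, one row per version.
CREATE TABLE sat_customer_details (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_date     TEXT NOT NULL,
    name          TEXT,
    city          TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_date)
);
"""


def build_vault() -> sqlite3.Connection:
    """Create the sample hub, link and satellite tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def table_names(conn: sqlite3.Connection) -> list:
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Because the satellite's key includes the load date, a change to a customer's details becomes a new row rather than an update, which is where the "historical tracking" in Dan Linstedt's description comes from.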

Key to the success of DV in any organisation will be the ability to convert a normalised relational model (or existing star schemas) into a DV model. That’s where a data modelling tool comes in very handy – manually converting a normalised relational model into a Data Vault is straightforward, but it can be time-consuming, tedious, and therefore prone to error. In addition, you need to be able to manage changes effectively, and to migrate data from its original form into the DV tables – data modelling tools allow you to do both effectively and quickly.
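To give a feel for why the conversion is mechanical enough to automate, here is a deliberately simplified sketch. It assumes a naive rule set (one hub per source table, one satellite for its descriptive columns, one link per foreign-key reference); the `Table` class, the naming scheme and the audit columns are all hypothetical, and real tools apply far richer rules than this.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Audit columns added to every generated table (an assumed convention).
AUDIT_COLUMNS = ["load_date", "record_source"]


@dataclass
class Table:
    """A hypothetical, much-simplified description of a relational table."""
    name: str
    business_key: str
    attributes: list[str] = field(default_factory=list)
    references: list[str] = field(default_factory=list)  # referenced table names


def to_data_vault(tables: list[Table]) -> dict[str, list[str]]:
    """Map each source table to hub, satellite and link column lists."""
    vault: dict[str, list[str]] = {}
    for table in tables:
        hub_key = f"{table.name}_hk"
        # Hub: surrogate key plus the business key.
        vault[f"hub_{table.name}"] = [hub_key, table.business_key] + AUDIT_COLUMNS
        # Satellite: descriptive attributes, only if there are any.
        if table.attributes:
            vault[f"sat_{table.name}"] = [hub_key] + AUDIT_COLUMNS + table.attributes
        # Link: one per foreign-key reference to another table.
        for target in table.references:
            vault[f"link_{table.name}_{target}"] = [
                f"{table.name}_{target}_hk",
                hub_key,
                f"{target}_hk",
            ] + AUDIT_COLUMNS
    return vault
```

Given a `customer` table and an `order` table that references it, this yields two hubs, two satellites and one link. Doing the same by hand for a few hundred tables is where the tedium, and the errors, creep in.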

Several organisations now provide products to automate the creation and management of data vaults, usually by cooperating with standard data modelling tools. Some organisations have taken advantage of the power and flexibility of tools like SAP PowerDesigner and built their own DV-generation capabilities. Watch this video on YouTube, in which Thierry de Spirlet demonstrates the automated conversion of a relational model into a multi-layer DV warehouse architecture, complete with models of the resulting data movements. There’s a fair amount of work needed to set up the relational model before generating the DV architecture, but that’s worth it in the end, as creating the DV architecture is so much easier afterwards. Thierry also has a White Paper on the topic, and a blog post. Here’s a snippet of the SQL he generated to load one of the DV tables:

Zooming out a level, we can see a “Data Movement” model, showing the transformation tasks needed to load data:

All of this was generated in a data modelling tool – the tool provides much more than just the ability to draw pictures. I rest my case.

OK, not quite finished: I’d like to reiterate that a data model is not just a picture. Here’s another sample PostgreSQL Physical Data Model, also from PowerDesigner. This one is reverse-engineered from a real database.

Behind the scenes, the tool is keeping track of all the ways that these things are connected. For example, here’s the Impact and Lineage Analysis for the column Employees.EmployeeID, showing the other things that would be subject to change if we decided to change, for example, the length of the column. In the Data Vault models generated by Thierry, such an analysis would obviously stretch right across several models.
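Conceptually, an impact analysis like this is a walk over a dependency graph. The sketch below is not how PowerDesigner implements it; it is a plain breadth-first traversal, and every object name other than Employees.EmployeeID is a hypothetical stand-in for whatever the real model contains.

```python
from collections import deque

# Hypothetical dependency edges: object -> objects that depend on it.
DEPENDENTS = {
    "Employees.EmployeeID": [
        "PK_Employees",
        "Orders.EmployeeID",
        "EmployeeTerritories.EmployeeID",
    ],
    "Orders.EmployeeID": ["FK_Orders_Employees", "vw_OrdersByEmployee"],
    "EmployeeTerritories.EmployeeID": ["FK_EmployeeTerritories_Employees"],
}


def impact(start: str, dependents: dict) -> list:
    """Return every object reachable from start, nearest dependents first."""
    seen = {start}          # guards against cycles leading back to start
    affected = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                affected.append(dependent)
                queue.append(dependent)
    return affected
```

Calling `impact("Employees.EmployeeID", DEPENDENTS)` lists the direct dependents first, then the objects that depend on those. A modelling tool maintains these edges for you as the model changes, which is exactly the part a drawing tool cannot do.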

To find out more about Data Vault, take a look at http://danlinstedt.com, and this book on Amazon – Building a Scalable Data Warehouse with Data Vault 2.0. Training courses are available via Genesee Academy, amongst others.

 

* This is a Guest post, the opinions expressed by the guest writer are theirs alone, and do not necessarily reflect the opinions of 2ndQuadrant or any employee thereof. 2ndQuadrant is not responsible for the accuracy of any of the information supplied by the Guest writer.
