2ndQuadrant is now part of EDB

Bringing together some of the world's top PostgreSQL experts.

2ndQuadrant | PostgreSQL
Mission Critical Databases
  • Contact us
  • EN
    • FR
    • IT
    • ES
    • DE
    • PT
  • Support & Services
  • Products
  • Downloads
    • Installers
      • Postgres Installer
      • 2UDA – Unified Data Analytics
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
  • Postgres Learning Center
    • Webinars
      • Upcoming Webinars
      • Webinar Library
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Blog
    • Training
      • Course Catalogue
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
    • Books
      • PostgreSQL 11 Administration Cookbook
      • PostgreSQL 10 Administration Cookbook
      • PostgreSQL High Availability Cookbook – 2nd Edition
      • PostgreSQL 9 Administration Cookbook – 3rd Edition
      • PostgreSQL Server Programming Cookbook – 2nd Edition
      • PostgreSQL 9 Cookbook – Chinese Edition
    • Videos
    • Events
    • PostgreSQL
      • PostgreSQL – History
      • Who uses PostgreSQL?
      • PostgreSQL FAQ
      • PostgreSQL vs MySQL
      • The Business Case for PostgreSQL
      • Security Information
      • Documentation
  • About Us
    • About 2ndQuadrant
    • 2ndQuadrant’s Passion for PostgreSQL
    • News
    • Careers
    • Team Profile
  • Blog
  • Menu Menu
You are here: Home1 / Products2 / pgpredict – Predictive Analytics for PostgreSQL3 / pgpredict – Tutorial

pgpredict – Tutorial

The tutorial below assumes you have already downloaded the pgpredict package from its homepage.

The directory ‘pgpredict/examples’ contains small code snippets that can be used to test pgpredict along with the short tutorial below.

For more information about how to use Orange and PostgreSQL together, please consult the 2UDA homepage, where you can also find a more general tutorial for predictive analytics.

Generating Data

To generate a table for testing purposes use the ‘generate_customers.sql’ script.

The number of rows generated can be increased for a Big Data test if enough disk space is available.

Predictive Analytics in Orange

1) Open Orange and load the ‘customers’ data table with the SQL Table widget.

2) Explore the data (e.g. Box Plot, Distributions, Scatter Map, Scatter Plot, …).

3) To build predictive models, Orange relies heavily on scikit-learn, which works with in-memory data. So if your data table has many rows, sample and download the data first using the Data Sampler widget.

4) Set the input (‘age’, ‘wage’, ‘visits’) and target (‘spent’) variables in the Select Columns widget.

The ‘id’ variable should be moved to Meta Attributes so it is not used in modeling.

Remove the ‘premium’ variable for now. It can be chosen as a categorical target variable later, for a classification instead of a regression demo.

5) Build a predictive model with the Linear Regression widget.

6) Optionally, the model can be evaluated with the Test & Score widget. Compare the error of the Linear Regression model to that of Mean learner, which predicts the average value for all customers.

7) Export the predictive model to a file with the Export Model widget.

Deploying the Predictive Model in PostgreSQL

Make sure to execute ‘pgpredict/predictive_analytics.sql’ before first use, to create the schema “orange” and the necessary functions.

The script ‘demo_customers.sql’ shows how to load the exported model in a PostgreSQL database and make real-time or batch predictions.

The main steps are:

1) Loading the model (provide the model name, version, and filename):

SELECT orange.load_model('customers_spent', '1', 'customers_spent.json');

Because the postgres server process needs access to the model file, make sure the file has appropriate read permissions.

2) Make a prediction based on the input variables:

SELECT spent, 
       orange.predict_numeric(ARRAY[age, wage, visits], 'customers_spent')
  FROM customers 
 WHERE id = 42;

      spent       | predict_numeric 
------------------+-----------------
 141.406325643697 |         127.707

While the actual numbers obtained will vary because of random data generation, the actual value of ‘spent’ and the prediction should be relatively close.

3) Make batch predictions for a set of customers, and compare the predicted and actual values to compute the mean absolute error (MAE):

SELECT avg(abs(spent - orange.predict_numeric(ARRAY[age, wage, visits], 'customers_spent')))
  FROM customers 
 WHERE id < 100000;

       avg        
------------------
 16.7719005959281

Recent Posts

  • Random Data December 3, 2020
  • Webinar: COMMIT Without Fear – The Beauty of CAMO [Follow Up] November 13, 2020
  • Full-text search since PostgreSQL 8.3 November 5, 2020
  • Random numbers November 3, 2020
  • Webinar: Best Practices for Bulk Data Loading in PostgreSQL [Follow Up] November 2, 2020

Support & Services

24/7 Production Support

Developer Support

Remote DBA for PostgreSQL

PostgreSQL Database Monitoring

PostgreSQL Health Check

PostgreSQL Performance Tuning

Database Security Audit

Upgrade PostgreSQL

PostgreSQL Migration Assessment

Migrate from Oracle to PostgreSQL

Products

HA Postgres Clusters

Postgres-BDR®

2ndQPostgres

pglogical

repmgr

Barman

Postgres Cloud Manager

SQL Firewall

Postgres-XL

OmniDB

Postgres Installer

2UDA

Postgres Learning Center

Introducing Postgres

Blog

Webinars

Books

Videos

Training

Case Studies

Events

About Us

About 2ndQuadrant

What does 2ndQuadrant Mean?

News

Careers 

Team Profile

© 2ndQuadrant Ltd. All rights reserved. | Privacy Policy
  • Twitter
  • LinkedIn
  • Facebook
  • Youtube
  • Mail
Scroll to top
×