2ndQuadrant is now part of EDB

Bringing together some of the world's top PostgreSQL experts.

2ndQuadrant | PostgreSQL
Mission Critical Databases
  • Contact us
  • EN
    • FR
    • IT
    • ES
    • DE
    • PT
  • Support & Services
  • Products
  • Downloads
    • Installers
      • Postgres Installer
      • 2UDA – Unified Data Analytics
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
  • Postgres Learning Center
    • Webinars
      • Upcoming Webinars
      • Webinar Library
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Blog
    • Training
      • Course Catalogue
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
    • Books
      • PostgreSQL 11 Administration Cookbook
      • PostgreSQL 10 Administration Cookbook
      • PostgreSQL High Availability Cookbook – 2nd Edition
      • PostgreSQL 9 Administration Cookbook – 3rd Edition
      • PostgreSQL Server Programming Cookbook – 2nd Edition
      • PostgreSQL 9 Cookbook – Chinese Edition
    • Videos
    • Events
    • PostgreSQL
      • PostgreSQL – History
      • Who uses PostgreSQL?
      • PostgreSQL FAQ
      • PostgreSQL vs MySQL
      • The Business Case for PostgreSQL
      • Security Information
      • Documentation
  • About Us
    • About 2ndQuadrant
    • 2ndQuadrant’s Passion for PostgreSQL
    • News
    • Careers
    • Team Profile
  • Blog
  • Menu Menu
You are here: Home1 / Blog2 / Elein’s PlanetPostgreSQL3 / Having Group By Clauses — elein’s GeneralBits
Elein

Having Group By Clauses — elein’s GeneralBits

March 18, 2019/0 Comments/in Elein’s PlanetPostgreSQL, PostgreSQL /by Elein

Some people go to great lengths to avoid GROUP BY and HAVING clauses in their queries. The error messages are fussy but they are usually right. GROUP BY and HAVING key words are essential for good SQL reporting.

The primary reason for GROUP BY is to reduce the number of rows, usually by aggregation. It produces only one row for each matching grouping from the input. This allows you to make sophisticated calculations via ordinary SQL.

Fruit Example:

We have some fruit:

 item    | source  | amt | fresh_until
---------+---------+-----+------------
 bananas | Chile   | 50  | 2019-05-01
 bananas | Bolivia | 25  | 2019-04-15
 bananas | Chile   | 150 | 2019-07-10
 apples  | USA-WA  | 75  | 2019-07-01
 apples  | USA-CA  | 75  | 2019-08-15
 apples  | Canada  | 80  | 2019-08-01
 grapes | USA-CA   | 120 | 2019-07-15
(7 rows)

This next case allows us to look forward.  Mid-year, what fruits will be available? We do this with the same query as above, however, after the query runs we check the values of min(fresh_until) by using a having clause. HAVING is how you qualify an aggregate.

select item, count(source) as srcs, sum(amt) as tot_amt,
 min(fresh_until) as min_fresh_until
 from fruit
 group by item
 having min(fresh_until) > '30-jun-2019';

   item  | srcs | tot_amt | min_fresh_until
 --------+------+---------+----------------
  grapes | 1    | 120     | 2019-07-15
  apples | 3    | 230     | 2019-07-01
 (2 rows)

All of the apples and grapes will be available mid-year.

The list of items between SELECT and FROM, the target list. may contain non-aggregates and aggregates. Those non-aggregate columns in the target list
should be in the group by clause. The error message says so. The order of the columns in the group by clause matters. It determines how the aggregates are grouped. The order is often hierarchical. What that means to your columns is your focus. It could be fruit, or sources and/or fresh_until date.

Playing Cards Examples

Let’s look at another set of examples that illustrate extracting information on playing cards. You can learn about cards on Wikipedia Standard Cards.

Suppose you deal out programmatically six 5-card hands, like six people playing poker.  A total of 30 cards are used in this deal. They are in a hand table like the following where the names of cards and suits are joined in by lookup tables. We store ranks so we can sort properly. We use names for display. The names and ranks have a one to one relationship for each of cards and suits.

create or replace view hands_v as
 select handid, su_name, ca_name
 from hands
 join suits using (su_rank)
 join cards using (ca_rank);

  handid | su_name | ca_name
 --------+---------+---------
       1 | Diamond | 2
       1 | Club    | 8
       1 | Spade   | J
       1 | Diamond | 5
       1 | Heart   | Q
       2 | Heart   | 9
       2 | Spade   | 4
       2 | Heart   | 7
       ...

What is the suit count for each hand? We really only care about any hands that have 3 or more cards of the same suit.  That will tell us who has better chances for a poker flush.  Note that although GROUP BY would seem to imply ORDER BY, it does not.  ORDER BY must be explicit.

select handid, su_name, count(ca_name)
 from hands_v
 group by handid, su_name
 having count(ca_name) >= 3
 order by handid, su_name;

  handid | su_name | count
 --------+---------+-------
       3 | Diamond | 3
       4 | Spade   | 3
       6 | Spade   | 3
 (3 rows)

So what if you mis-grouped your query? If this hand table is not grouped by handid, then you will get 30 records of 6 hands of 5-cards. If you had aggregates, they would be grouped by row. Not very helpful.

If you aggregate the card name and do not include
the card name solo on the target list and try to order by card name,
you will receive the error message that it should not be in
the order by clause. The order by clause should contain
elements of the group by clause.
However, if the card name is explicitly in the target list,

 select handid, ca_name, count(ca_name)...

then the card name must be in the group by clause and
therefore allowable on the order by clause.

If the query is by suit, there will be a minimum of 1 or maximum of 4 records per suit for each of the six hands. Notice that we are sorting by suit rank which
also must be in the group by clause. su_name and su_rank have a one to one relationship.

select handid, su_name, count(su_name)
from hands
join suits using (su_rank)
group by su_rank, su_name, handid
order by handid, su_rank;

 handid | su_name | count
--------+---------+-------
      1 | Spade   | 1
      1 | Heart   | 1
      1 | Diamond | 2
      1 | Club    | 1
      2 | Spade   | 1
      2 | Heart   | 2
      2 | Club    | 2
      3 | Spade   | 1
      3 | Diamond | 3
      3 | Club    | 1
      4 | Spade   | 3
      4 | Heart   | 1
      4 | Diamond | 1
      5 | Heart   | 2
      5 | Diamond | 1
      5 | Club    | 2
      6 | Spade   | 3
      6 | Diamond | 2
(18 rows)

To see the distribution of cards into hands, We must group by the card rank column.  Of course there are 4 suits of each card, so you won’t see a card in more than four hands.

select ca_name, count(handid) as num_hands from hands
join suits using (su_rank)
join cards using (ca_rank)
group by ca_rank, ca_name order by ca_rank;

 ca_name | num_hands
---------+-----------
       2 | 2
       3 | 2
       4 | 4
       5 | 3
       6 | 2
       7 | 3   
       8 | 2
       9 | 3
       J | 1
       Q | 2
       K | 4
       A | 2
(12 rows)

To peek and see who is holding aces, we can use the following short query.  Note that there is a WHERE clause which is executed while collecting the rows.  HAVING is executed after the rows are collected.

select handid, count(ca_name)
from hands_v
where ca_name = 'A'
group by handid;
 handid | count
--------+-------
      5 | 1
      6 | 1
(2 rows)
Summary

These examples are simple ways to evaluate known entities. Experiment and use these simple rules.

  • If a column is on the target list and not an aggregate, it must be in a GROUP BY clause.
  • WHERE clauses occur during the selection process.
  • HAVING clauses occur after the aggregates are completed.
  • Only non-aggregates can be in the ORDER BY clause.
  • Order of the GROUP BY clause matters.
Share this entry
  • Share on Facebook
  • Share on Twitter
  • Share on WhatsApp
  • Share on LinkedIn
0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search

Get in touch with us!

Recent Posts

  • Random Data December 3, 2020
  • Webinar: COMMIT Without Fear – The Beauty of CAMO [Follow Up] November 13, 2020
  • Full-text search since PostgreSQL 8.3 November 5, 2020
  • Random numbers November 3, 2020
  • Webinar: Best Practices for Bulk Data Loading in PostgreSQL [Follow Up] November 2, 2020

Featured External Blogs

Tomas Vondra's Blog

Our Bloggers

  • Simon Riggs
  • Alvaro Herrera
  • Andrew Dunstan
  • Craig Ringer
  • Francesco Canovai
  • Gabriele Bartolini
  • Giulio Calacoci
  • Ian Barwick
  • Marco Nenciarini
  • Mark Wong
  • Pavan Deolasee
  • Petr Jelinek
  • Shaun Thomas
  • Tomas Vondra
  • Umair Shahid

PostgreSQL Cloud

2QLovesPG 2UDA 9.6 backup Barman BDR Business Continuity community conference database DBA development devops disaster recovery greenplum Hot Standby JSON JSONB logical replication monitoring OmniDB open source Orange performance PG12 pgbarman pglogical PG Phriday postgres Postgres-BDR postgres-xl PostgreSQL PostgreSQL 9.6 PostgreSQL10 PostgreSQL11 PostgreSQL 11 PostgreSQL 11 New Features postgresql repmgr Recovery replication security sql wal webinar webinars

Support & Services

24/7 Production Support

Developer Support

Remote DBA for PostgreSQL

PostgreSQL Database Monitoring

PostgreSQL Health Check

PostgreSQL Performance Tuning

Database Security Audit

Upgrade PostgreSQL

PostgreSQL Migration Assessment

Migrate from Oracle to PostgreSQL

Products

HA Postgres Clusters

Postgres-BDR®

2ndQPostgres

pglogical

repmgr

Barman

Postgres Cloud Manager

SQL Firewall

Postgres-XL

OmniDB

Postgres Installer

2UDA

Postgres Learning Center

Introducing Postgres

Blog

Webinars

Books

Videos

Training

Case Studies

Events

About Us

About 2ndQuadrant

What does 2ndQuadrant Mean?

News

Careers 

Team Profile

© 2ndQuadrant Ltd. All rights reserved. | Privacy Policy
  • Twitter
  • LinkedIn
  • Facebook
  • Youtube
  • Mail
Sequential UUID Generators on SSD Where and when you need a root.crt file
Scroll to top
×