Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Fun With SQL: generate_series in Postgres

DZone's Guide to

Fun With SQL: generate_series in Postgres

Postgres is has a wealth of hidden gems within it. generate_series is one just one of the handy built-in features of Postgres.

· Database Zone ·
Free Resource

Discover Tarantool's unique features which include powerful stored procedures, SQL support, smart cache, and the speed of 1 million ACID transactions on a single CPU core!

There are times within Postgres where you may want to generate sample data or some consistent series of records to join in order for reporting. Enter: the simple but handy set returning function of Postgres: generate_seriesgenerate_series, as the name implies, allows you to generate a set of data starting at some point, ending at another point, and optionally set the incrementing value. generate_series works on two datatypes:

  • Integers
  • Timestamps

Let’s get started with the most basic example:

SELECT * 
FROM generate_series(1, 5);
 generate_series
-----------------
               1
               2
               3
               4
               5
(5 rows)

So generate_series pretty straight-forward, but what interesting ways can it be used?

Generating Fake Data

By putting our generate_series inside a CTE, we can easily now generate a set of numbers and then perform some operation against each value. If we want to generate some fake number, we can use random(), which generates a random number between 0.0 and 1.0.

WITH numbers AS (
  SELECT *
  FROM generate_series(1, 5)
)

SELECT generate_series * random()
FROM numbers;

     ?column?
-------------------
  0.87764338683337
 0.345125129446387
  2.10317854676396
 0.937898803502321
  1.72822773223743
(5 rows)

Pretty Weekly Reporting With Joins

Aggregating across some time dimension is a fairly common report. A good example might be new users per week. The simplest way to get this would be by leveraging the Postgres date_trunc function:

SELECT date_trunc('week', created_at)
       count(*)
FROM users
GROUP BY 1
ORDER BY 1;

The issue with the above query arises when two cases are true: first, you’re charting your data over time and then two you have a week with no sign-ups. In the case of no sign-ups in a week, you’d simply miss the 0 on your graph leaving a misleading impression. To smooth this out, we go back to generate series and do an outer join on the week:

WITH range_values AS (
  SELECT date_trunc('week', min(created_at)) as minval,
         date_trunc('week', max(created_at)) as maxval
  FROM users),

week_range AS (
  SELECT generate_series(minval, maxval, '1 week'::interval) as week
  FROM range_values
),

weekly_counts AS (
  SELECT date_trunc('week', created_at) as week,
         count(*) as ct
  FROM users
  GROUP BY 1
)

SELECT week_range.week,
       weekly_counts.ct
FROM week_range
LEFT OUTER JOIN weekly_counts on week_range.week = weekly_counts.week;

What Other Uses Do You Have for generate_series?

Postgres is has a wealth of hidden gems within it. generate_series is one just one of the handy built-in features of Postgres. If you know of other novel uses for it, we’d love to hear about it @citusdata.

Discover Tarantool's unique features such as powerful stored procedures, SQL support, smart cache, and the speed of 1 million ACID transactions on a single CPU.

Topics:
postgres ,sql ,data ,generate ,series ,test ,function

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}