Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Fun with SQL: Window Functions in Postgres

DZone's Guide to

Fun with SQL: Window Functions in Postgres

These probably aren't the windows you're thinking of. In the world of analytics, these functions are your friend. Read on to find out more.

· Database Zone ·
Free Resource

Running out of memory? Learn how Redis Enterprise enables large dataset analysis with the highest throughput and lowest latency while reducing costs over 75%! 

Today, we continue to explore all the powerful and fun things you can do with SQL. SQL is a very expressive language, and when it comes to analyzing your data, there isn't a better option. You can see the evidence of SQL's power in all the attempts made by NoSQL databases to recreate the capabilities of SQL. So why not just start with a SQL database that scales? (Like my favorites, Postgres and Citus.)

Today, in the latest post in our "Fun with SQL" series (earlier blog posts were about recursive CTEs, generate_series, and relocating shards on a Citus database cluster), we're going to look at window functions in PostgreSQL. Window functions are key in various analytic and reporting use cases where you want to compare and contrast data. Window functions allow you to compare values between rows that are somehow related to the current row. Some practical uses of window functions can be:

  • Finding the first time all users performed some action
  • Finding how much each users' bill increased or decreased from the previous month
  • Find where all users ranked for some sub-grouping

The Basic Structure of a Window Function in Postgres

Window functions within PostgreSQL have a built-in set of operators and perform their action across some specific key, but they can have two different syntaxes that express the same thing. Let's take a look at a simple window function expressed two different ways:

The first format:

SELECT last_name,
       salary,
       department,
       rank() OVER (
        PARTITION BY department
        ORDER BY salary
        DESC)
FROM employees;

The second format:

SELECT last_name,
       salary,
       department,
       rank() over w
FROM employees
       WINDOW w as (partition by department order by salary).

With the first query, we can see the window function is inlined, whereas the second, it is broken out separately. Both of the above queries produce the same results:

last_name  |  salary |  department  | rank
-----------+---------+--------------+-------
Jones      |  45000  |  Accounting  |  1
Williams   |  37000  |  Accounting  |  2
Smith      |  55000  |  Sales       |  1
Adams      |  50000  |  Sales       |  2
Johnson    |  40000  |  Marketing   |  1


Both of these show the last name of employees, their salary, their department — and then rank where they fall in terms of salary in their department. You could easily combine this with a CTE to then find only the highest paying ( where rank = 1) or second highest paying ( where rank = 2) in each department.

What Can You Do With Window Functions in Postgres?

Within Postgres, there are a number of window functions that each perform a different operation. You can check the PostgreSQL docs for the full list, but for now, we'll walk through a few that are particularly interesting:

  • rank — As we saw in the earlier example, rank will show where the row ranks in order of the window order.
  • percent_rank — Want to compute the percent where the row falls within your window order? percent_rank will give you the percentage ranking based on your window think of it as ((rank - 1) / (total rows - 1))
  • lag — Want to do your own operation between rows? Lag will give you the row value xrows before your current row. Want to the value for future rows? You can use lead for that. A great example of this could be computing month over month growth
  • ntile — Want to compute what percentile values fall in? ntile allows you to specify a percentile to group buckets into. For 4 quartiles you would use ntile(4), for percentile of each row you would use ntile(100).

Hopefully, you'll find window functions as useful. If you have questions about using them, the PostgreSQL docs are a great resource. Or, feel free to jump into our Slack channel.

Running out of memory? Never run out of memory with Redis Enterprise databaseStart your free trial today.

Topics:
postgres ,window ,functions ,sql ,analytics ,database

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}