Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

DZone's Guide to

# Creating a Microsoft Excel-Style Pivot Table With Grand Totals in SQL

### Whenever working with data and SQL, try finding an elegant solution with SQL. There are many tools for a variety of data processing and data presentation use cases.

· Database Zone ·
Free Resource

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

This answer to a beautiful Stack Overflow question I've given recently needs further explanation in a blog post. When working with Microsoft Excel, we can create beautiful and also very insightful pivot tables with grand totals. What are they? This is best explained visually.

Assuming you have this normalized form for your raw data. As in the question, it's an inventory table in a bike shop:

Now, in order to analyze our inventory, we'd love to pivot the normalized representation above to the following non-normalized representation, and we'd also like to display the grand totals to learn how many bikes of each type we have, how many bikes of each color, and how many bikes in total:

There are tons of great tutorials out there explaining how to do this with Microsoft Excel. What we care about is how to do this with SQL...

## How to Do This With SQL

We're using two SQL features for this:

Let's create some data first. I'm going to use SQL Server syntax for most of this blog post. At the end, there will be a full solution for SQL Server, Oracle, and PostgreSQL:

WITH Bikes AS (
SELECT * FROM (
VALUES ('Mountain Bikes', 'Black'),
('Mountain Bikes', 'Black'),
('Mountain Bikes', 'Silver'),
('Touring Bikes', 'Blue'),
('Touring Bikes', 'Blue'),
('Touring Bikes', 'Yellow')
) AS Bikes (Name, Colour)
)
SELECT * FROM Bikes

This simply produces an in-memory table representation of our original, normalized dataset.

Now, the first step is to create the following totals and grand totals:

• Total bikes per name and color
• (Grand) total bikes per name
• (Grand) total bikes per color
• (Grand) total bikes

In this particular case, we can use CUBE(), which forms all the possible GROUPING SETS combinations:

SELECT Name, Colour, COUNT(*) AS Total
FROM Bikes
GROUP BY CUBE (Name, Colour)
ORDER BY Name, Colour

The result looks like this:

Name            Colour  Total
-----------------------------
NULL            NULL    11
NULL            Black   3
NULL            Blue    2
NULL            Red     3
NULL            Silver  1
NULL            Yellow  2
Mountain Bikes  NULL    3
Mountain Bikes  Black   2
Mountain Bikes  Silver  1
Touring Bikes   NULL    3
Touring Bikes   Blue    2
Touring Bikes   Yellow  1

Excellent! All the (grand) totals are now in the result set. Notice that we could have manually written this using the following, much more tedious syntax:

SELECT Name, Colour, COUNT(*) AS Total
FROM Bikes
GROUP BY Name, Colour
UNION ALL
SELECT Name, NULL, COUNT(*) AS Total
FROM Bikes
GROUP BY Name
UNION ALL
SELECT NULL, Colour, COUNT(*) AS Total
FROM Bikes
GROUP BY Colour
UNION ALL
SELECT NULL, NULL, COUNT(*) AS Total
FROM Bikes
ORDER BY Name, Colour

So, CUBE() (and ROLLUP() and GROUPING SETS()) is just syntax sugar for the above more verbose UNION ALL representation, with the additional important difference that very likely you're going to get a much more optimal execution plan using CUBE():

...than using manual UNION ALL:

The result would be similar in Oracle and other databases.

This isn't surprising. We can aggregate all the grand totals in one go with CUBE() (in fact, the "grand grand total" is calculated separately in this case), whereas it's hard for the optimizer to prove that the UNION ALL version is really the same thing and the individual subqueries can be factored out.

Before we move on, just a slight improvement, let's rename the grand totals from NULL to Total and wrap the thing in a derived table T:

SELECT *
FROM (
SELECT
COALESCE(Name, 'Total') AS Name,
COALESCE(Colour, 'Total') AS Colour,
COUNT(*) AS Count
FROM Bikes
GROUP BY CUBE (Name, Colour)
) AS t

## Now, Pivot This Representation to Be More Readable

The data still looks normalized with repeating names and colors in the result tables. Let's pivot it using... wait for it... the PIVOT clause (available in Oracle and SQL Server).

The PIVOT clause is a bit funky. It can be appended to any table expression (including derived tables) to pivot it. It will apply an implicit GROUP BY operation and generate a set of aggregated SELECT columns. When we pivot our previous derived table T:

SELECT *
FROM t
PIVOT (
SUM(Count) FOR Colour IN (
Red, Blue, Black, Silver, Yellow,
Grey, Multi, Uncoloured, Total
)
) AS p

...we get the following, nice-looking result:

Name            Red     Blue    Black   Silver  Yellow  Grey    Multi   Uncoloured  Total
-----------------------------------------------------------------------------------------
Mountain Bikes  NULL    NULL    2       1       NULL    NULL    NULL    NULL        3
Road Bikes      3       NULL    1       NULL    1       NULL    NULL    NULL        5
Touring Bikes   NULL    2       NULL    NULL    1       NULL    NULL    NULL        3
Total           3       2       3       1       2       NULL    NULL    NULL        11

That's almost the desired result — all that's missing is some null handling. How does it work? We have the following syntax:

[ table ] PIVOT (
[ aggregate function(s) ] FOR [ column(s) ] IN ( [ values ] )
)

...where:

• The [ table ] is the table being pivoted.
• The [ column(s) ] are the columns from the [ table ] being grouped, as in any ordinary GROUP BY clause.
• The [ values ] are the values of the [ column(s) ], for which filtered aggregations are made.
• The [ aggregate function(s) ] are the aggregations that are made per [ column(s) ] (group) and per [ value ] (filter).

This syntax is Oracle and SQL Server specific. Oracle can do a bit more than SQL Server. If this syntax is not available in your database, you can write it out manually again (just like the above CUBE() to get this:

SELECT
Name,
SUM(CASE WHEN Colour = 'Red'        THEN Count END) AS Red,
SUM(CASE WHEN Colour = 'Blue'       THEN Count END) AS Blue,
SUM(CASE WHEN Colour = 'Black'      THEN Count END) AS Black,
SUM(CASE WHEN Colour = 'Silver'     THEN Count END) AS Silver,
SUM(CASE WHEN Colour = 'Yellow'     THEN Count END) AS Yellow,
SUM(CASE WHEN Colour = 'Grey'       THEN Count END) AS Grey,
SUM(CASE WHEN Colour = 'Multi'      THEN Count END) AS Multi,
SUM(CASE WHEN Colour = 'Uncoloured' THEN Count END) AS Uncoloured,
SUM(CASE WHEN Colour = 'Total'      THEN Count END) AS Total
FROM t
GROUP BY Name

There should be no performance penalty in the manually written version (although, as always, do check). You can find more details about this in a previous article.

## Putting It All Together

Here's the complete query in SQL Server:

WITH Bikes(Name, Colour) AS (
SELECT * FROM (
VALUES ('Mountain Bikes', 'Black'),
('Mountain Bikes', 'Black'),
('Mountain Bikes', 'Silver'),
('Touring Bikes', 'Blue'),
('Touring Bikes', 'Blue'),
('Touring Bikes', 'Yellow')
) AS Bikes(Name, Colour)
)
SELECT
Name,
COALESCE(Red, 0) AS Red,
COALESCE(Blue, 0) AS Blue,
COALESCE(Black, 0) AS Black,
COALESCE(Silver, 0) AS Silver,
COALESCE(Yellow, 0) AS Yellow,
COALESCE(Grey, 0) AS Grey,
COALESCE(Multi, 0) AS Multi,
COALESCE(Uncoloured, 0) AS Uncoloured,
Total
FROM (
SELECT
COALESCE(Name, 'Total') AS Name,
COALESCE(Colour, 'Total') AS Colour,
COUNT(*) AS Count
FROM Bikes
GROUP BY CUBE (Name, Colour)
) AS t
PIVOT (
SUM(Count) FOR Colour IN (
Red, Blue, Black, Silver, Yellow,
Grey, Multi, Uncoloured, Total
)
) AS p
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

Or Oracle:

WITH Bikes(Name, Colour) AS (
SELECT 'Mountain Bikes', 'Black'  FROM dual UNION ALL
SELECT 'Mountain Bikes', 'Black'  FROM dual UNION ALL
SELECT 'Mountain Bikes', 'Silver' FROM dual UNION ALL
SELECT 'Road Bikes',     'Red'    FROM dual UNION ALL
SELECT 'Road Bikes',     'Red'    FROM dual UNION ALL
SELECT 'Road Bikes',     'Red'    FROM dual UNION ALL
SELECT 'Road Bikes',     'Black'  FROM dual UNION ALL
SELECT 'Road Bikes',     'Yellow' FROM dual UNION ALL
SELECT 'Touring Bikes',  'Blue'   FROM dual UNION ALL
SELECT 'Touring Bikes',  'Blue'   FROM dual UNION ALL
SELECT 'Touring Bikes',  'Yellow' FROM dual
)
SELECT
Name,
COALESCE(Red, 0) AS Red,
COALESCE(Blue, 0) AS Blue,
COALESCE(Black, 0) AS Black,
COALESCE(Silver, 0) AS Silver,
COALESCE(Yellow, 0) AS Yellow,
COALESCE(Grey, 0) AS Grey,
COALESCE(Multi, 0) AS Multi,
COALESCE(Uncoloured, 0) AS Uncoloured,
Total
FROM (
SELECT
COALESCE(Name, 'Total') AS Name,
COALESCE(Colour, 'Total') AS Colour,
COUNT(*) AS Count
FROM Bikes
GROUP BY CUBE (Name, Colour)
) t
PIVOT (
SUM(Count) FOR Colour IN (
'Red' AS Red,
'Blue' AS Blue,
'Black' AS Black,
'Silver' AS Silver,
'Yellow' AS Yellow,
'Grey' AS Grey,
'Multi' AS Multi,
'Uncoloured' AS Uncoloured,
'Total' AS Total
)
) p
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

Or PostgreSQL:

WITH Bikes(Name, Colour) AS (
SELECT * FROM (
VALUES ('Mountain Bikes', 'Black'),
('Mountain Bikes', 'Black'),
('Mountain Bikes', 'Silver'),
('Touring Bikes', 'Blue'),
('Touring Bikes', 'Blue'),
('Touring Bikes', 'Yellow')
) AS Bikes(Name, Colour)
)
SELECT
Name,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Red'       ), 0) AS Red,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Blue'      ), 0) AS Blue,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Black'     ), 0) AS Black,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Silver'    ), 0) AS Silver,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Yellow'    ), 0) AS Yellow,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Grey'      ), 0) AS Grey,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Multi'     ), 0) AS Multi,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Uncoloured'), 0) AS Uncoloured,
COALESCE(SUM(Count) FILTER (WHERE Colour = 'Total'     ), 0) AS Total
FROM (
SELECT
COALESCE(Name, 'Total') AS Name,
COALESCE(Colour, 'Total') AS Colour,
COUNT(*) AS Count
FROM Bikes
GROUP BY CUBE (Name, Colour)
) AS t
GROUP BY Name
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

Or MySQL (which doesn't support CUBE, only ROLLUP, thus slightly tweaked PostgreSQL variant):

WITH Bikes(Name, Colour) AS (
SELECT 'Mountain Bikes', 'Black'  UNION ALL
SELECT 'Mountain Bikes', 'Black'  UNION ALL
SELECT 'Mountain Bikes', 'Silver' UNION ALL
SELECT 'Road Bikes',     'Red'    UNION ALL
SELECT 'Road Bikes',     'Red'    UNION ALL
SELECT 'Road Bikes',     'Red'    UNION ALL
SELECT 'Road Bikes',     'Black'  UNION ALL
SELECT 'Road Bikes',     'Yellow' UNION ALL
SELECT 'Touring Bikes',  'Blue'   UNION ALL
SELECT 'Touring Bikes',  'Blue'   UNION ALL
SELECT 'Touring Bikes',  'Yellow'
)
SELECT
Name,
COALESCE(SUM(CASE WHEN Colour = 'Red'        THEN Count END), 0) AS Red,
COALESCE(SUM(CASE WHEN Colour = 'Blue'       THEN Count END), 0) AS Blue,
COALESCE(SUM(CASE WHEN Colour = 'Black'      THEN Count END), 0) AS Black,
COALESCE(SUM(CASE WHEN Colour = 'Silver'     THEN Count END), 0) AS Silver,
COALESCE(SUM(CASE WHEN Colour = 'Yellow'     THEN Count END), 0) AS Yellow,
COALESCE(SUM(CASE WHEN Colour = 'Grey'       THEN Count END), 0) AS Grey,
COALESCE(SUM(CASE WHEN Colour = 'Multi'      THEN Count END), 0) AS Multi,
COALESCE(SUM(CASE WHEN Colour = 'Uncoloured' THEN Count END), 0) AS Uncoloured,
COALESCE(SUM(CASE WHEN Name != 'Total' OR Colour != 'Total' THEN Count END), 0) AS Total
FROM (
SELECT
COALESCE(Name, 'Total') AS Name,
COALESCE(Colour, 'Total') AS Colour,
COUNT(*) AS Count
FROM Bikes
GROUP BY Colour, Name WITH ROLLUP
) AS t
GROUP BY Name
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

## Conclusion

Whenever working with data and SQL, try finding an elegant solution with SQL. There are many tools for a variety of data processing and data presentation use cases.

Read "Developing Apps Using Active-Active Redis Enterprise" and discover the advantages over other active-actve databases.

Topics:
database ,microsoft excel ,pivot tables ,sql ,tutorial ,data processing ,data visualization

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.