Over a million developers have joined DZone.

The Power of MySQL’s GROUP_CONCAT

· Performance Zone

See Gartner’s latest research on the application performance monitoring landscape and how APM suites are becoming more and more critical to the business, brought to you in partnership with AppDynamics.

This post comes from at the MySQL Performance Blog.

In the very early days of Percona Vadim wrote very nice post about GROUP_CONCAT.

But I want to show you a bit more about it.

When is GROUP_CONCAT useful? Usually while working with Support customers I recommend it when you have aggregation of many-to-many info. It makes the view simpler and more beautiful and it doesn’t need much effort to make it work.

Some simple examples:

This is a test table:

CREATE TABLE `group_c` (
`parent_id` int(11) DEFAULT NULL,
`child_id` int(11) DEFAULT NULL
) ENGINE=InnoDB;
INSERT INTO group_c(parent_id, child_id)
VALUES (1,1),(1,2),(1,3),(2,1),(2,4),(1,4),(2,6),(3,1),(3,2),(4,1),(4,1),(1,1),(5,0);

Without grouping info the only way you can check things is:

mysql> SELECT DISTINCT
    -> parent_id, child_id
    -> FROM group_c
    -> ORDER BY parent_id;
+-----------+----------+
| parent_id | child_id |
+-----------+----------+
|         1 |        1 |
|         1 |        2 |
|         1 |        3 |
|         1 |        4 |
|         2 |        1 |
|         2 |        3 |
|         2 |        4 |
|         2 |        6 |
|         3 |        1 |
|         3 |        2 |
|         4 |        1 |
|         5 |        0 |
+-----------+----------+
12 rows in set (0.00 sec)

But it looks much better and easier to read with GROUP_CONCAT:

mysql> SELECT DISTINCT
    -> parent_id, GROUP_CONCAT(DISTINCT child_id ORDER BY child_id) AS child_id_list
    -> FROM group_c
    -> group by parent_id
    -> ORDER BY parent_id;
+-----------+---------------+
| parent_id | child_id_list |
+-----------+---------------+
|         1 | 1,2,3,4       |
|         2 | 1,3,4,6       |
|         3 | 1,2           |
|         4 | 1             |
|         5 | 0             |
+-----------+---------------+
5 rows in set (0.00 sec)

Easy? Let’s go to production usage and some “real” examples :)

Assume you have 4 Support Engineers who were working with 6 Customers this week on 15 issues.

As it usually happens: everyone (sure, except those who are on vacation :) ) worked on everything with everybody.

How you would represent it?

Here is my way:

Create test tables:

  • engineers (id, name, surname, URL) – list of engineers
  • customers (id, company name, URL) – list of customers
  • issues (id, customer_id, description) – list of issues assigned to customers
  • workflow (id, engineer_id, issue_id) – list of actions: issues and engineers who worked on them
    -- Engineers
    CREATE TABLE engineers (
    id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    e_name VARCHAR(30) NOT NULL,
    e_surname VARCHAR(30) NOT NULL,
    url VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB;
    -- Customers
    CREATE TABLE customers (
    id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    company_name VARCHAR(30) NOT NULL,
    url VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB;
    -- Issues (Issue-Customer)
    CREATE TABLE issues (
    id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer_id VARCHAR(30) NOT NULL,
    description TEXT,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB;
    -- Workflow (Action: Engineer-Issue(Customer))
    CREATE TABLE workflow (
    action_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    engineer_id SMALLINT UNSIGNED NOT NULL,
    issue_id SMALLINT UNSIGNED NOT NULL,
    PRIMARY KEY (action_id)
    ) ENGINE=InnoDB;
    INSERT INTO engineers (e_name, e_surname, url)
    VALUES
    ('Miguel', 'Nieto', 'http://www.percona.com/about-us/our-team/miguel-angel-nieto'),
    ('Marcos', 'Albe', 'http://www.percona.com/about-us/our-team/marcos-albe'),
    ('Valerii', 'Kravchuk', 'http://www.percona.com/about-us/our-team/valerii-kravchuk'),
    ('Michael', 'Rikmas', 'http://www.percona.com/about-us/our-team/michael-rikmas');
    INSERT INTO customers (company_name, url)
    VALUES
    ('OT','http://www.ovaistariq.net/'),
    ('PZ','http://www.peterzaitsev.com/'),
    ('VK','http://mysqlentomologist.blogspot.com/'),
    ('FD','http://www.lefred.be/'),
    ('AS','http://mysqlunlimited.blogspot.com/'),
    ('SS','https://www.flamingspork.com/blog/');
    INSERT INTO issues(customer_id, description)
    VALUES
    (1,'Fix replication'),
    (2,'Help with installation of Percona Cluster'),
    (3,'Hardware suggestions'),
    (4,'Error: no space left'),
    (5,'Help with setup daily backup by Xtrabackup'),
    (6,'Poke sales about Support agreement renewal'),
    (4,'Add more accounts for customer'),
    (2,'Create Hot Fix of Bug 1040735'),
    (1,'Query optimisation'),
    (1,'Prepare custom build for Solaris'),
    (2,'explain about Percona Monitoring plugins'),
    (6,'Prepare access for customer servers for future work'),
    (5,'Decribe load balancing for pt-online-schema-change'),
    (4,'Managing deadlocks'),
    (1,'Suggestions about buffer pool size');
    INSERT INTO workflow (engineer_id, issue_id)
    VALUES (1,1),(4,2),(2,3),(1,4),(3,5),(2,6),(3,7),(2,8),(2,9),(1,10),(3,11),(2,12),(2,13),(3,14),(1,15),(1,9),(4,14),(2,9),(1,15),(3,10),(4,2),(2,15),(4,8),(4,4),(3,11),(1,7),(3,7),(1,1),(1,9),(3,4),(4,3),(1,5),(1,7),(1,4),(2,4),(2,5);

Examples:

List of issues for each engineer (GROUP_CONCAT):

mysql> SELECT
    ->  CONCAT (e_name, ' ', e_surname) AS engineer,
    ->  GROUP_CONCAT(DISTINCT issue_id, ' (', c.company_name,')' ORDER BY issue_id SEPARATOR ', ' ) AS 'issue (customer)'
    -> FROM
    ->  workflow w,
    ->  engineers e,
    ->  customers c,
    ->  issues i
    -> WHERE
    ->  w.engineer_id = e.id
    ->  AND w.issue_id = i.id
    ->  AND i.customer_id = c.id
    -> GROUP BY
    ->  e.id
    -> ORDER BY
    ->  e_name, e_surname;
+------------------+---------------------------------------------------------------------------+
| engineer         | issue (customer)                                                          |
+------------------+---------------------------------------------------------------------------+
| Marcos Albe      | 3 (VK), 4 (FD), 5 (AS), 6 (SS), 8 (PZ), 9 (OT), 12 (SS), 13 (AS), 15 (OT) |
| Michael Rikmas   | 2 (PZ), 3 (VK), 4 (FD), 8 (PZ), 14 (FD)                                   |
| Miguel Nieto     | 1 (OT), 4 (FD), 5 (AS), 7 (FD), 9 (OT), 10 (OT), 15 (OT)                  |
| Valerii Kravchuk | 4 (FD), 5 (AS), 7 (FD), 10 (OT), 11 (PZ), 14 (FD)                         |
+------------------+---------------------------------------------------------------------------+
4 rows in set (0.00 sec)

List of engineers for each customer (GROUP_CONCAT inside of GROUP_CONCAT):

mysql> SELECT
    ->  c.company_name AS company,
    ->  GROUP_CONCAT(DISTINCT issue_id, ' (', engineer_list, ')' ORDER BY issue_id SEPARATOR ', ' ) AS issue
    -> FROM
    ->  workflow w,
    ->  engineers e,
    ->  customers c,
    ->  issues i,
    ->  (SELECT
    ->    i.id AS i_id,
    ->    GROUP_CONCAT(DISTINCT CONCAT(e_name, ' ', e_surname) ORDER BY e_name SEPARATOR ', ') AS engineer_list
    ->   FROM
    ->    workflow w,
    ->    engineers e,
    ->    issues i
    ->   WHERE
    ->    w.engineer_id = e.id
    ->    AND w.issue_id = i.id
    ->   GROUP BY
    ->    i.id) AS e_list
    -> WHERE
    ->  w.engineer_id = e.id
    ->  AND w.issue_id = i.id
    ->  AND i.customer_id = c.id
    ->  AND w.issue_id = e_list.i_id
    -> GROUP BY
    ->  c.id
    -> ORDER BY
    ->  c.company_name;
+---------+--------------------------------------------------------------------------------------------------------------------------------------------+
| company | issue (engineer)                                                                                                                           |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------+
| AS      | 5 (Marcos Albe, Miguel Nieto, Valerii Kravchuk), 13 (Marcos Albe)                                                                          |
| FD      | 4 (Marcos Albe, Michael Rikmas, Miguel Nieto, Valerii Kravchuk), 7 (Miguel Nieto, Valerii Kravchuk), 14 (Michael Rikmas, Valerii Kravchuk) |
| OT      | 1 (Miguel Nieto), 9 (Marcos Albe, Miguel Nieto), 10 (Miguel Nieto, Valerii Kravchuk), 15 (Marcos Albe, Miguel Nieto)                       |
| PZ      | 2 (Michael Rikmas), 8 (Marcos Albe, Michael Rikmas), 11 (Valerii Kravchuk)                                                                 |
| SS      | 6 (Marcos Albe), 12 (Marcos Albe)                                                                                                          |
| VK      | 3 (Marcos Albe, Michael Rikmas)                                                                                                            |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

PHP/HTML? Why not? It’s easy :)

Source Code:

<?php
$mysqli = new mysqli("localhost", "db_user", "password", "db_name");
if (mysqli_connect_errno()) {
  printf("Connect failed: %s\n", mysqli_connect_error());
  exit();
}
$query = "SELECT
 i.id,
 GROUP_CONCAT(DISTINCT '<a target=_blank href=', c.url, '>', CONCAT(c.company_name), '</a>' ORDER BY e_name SEPARATOR ', ') AS company,
 i.description,
 GROUP_CONCAT(DISTINCT '<a target=_blank href=', e.url, '>', CONCAT(e_name, ' ', e_surname), '</a>' ORDER BY e_name SEPARATOR ', ') AS engineer_list
FROM
 workflow w,
 engineers e,
 customers c,
 issues i
WHERE
 w.engineer_id = e.id
 AND w.issue_id = i.id
 AND i.customer_id = c.id
GROUP BY
 i.id
ORDER BY
 i.id";
$result = $mysqli->query($query);
while($row = $result->fetch_array())
{
  $rows[] = $row;
}
echo "<table border=1 cellpadding=5 style=\"border-collapse: collapse; border: 1px solid black;\">";
foreach($rows as $row)
{
  echo "<tr>
  		<td>".$row["id"].'</td>
  		<td>'.$row["company"].'</td>
  		<td>'.$row["description"].'</td>
  		<td>'.$row["engineer_list"].'</td>
  	</tr>';
}
echo "</table>";
$result->close();
$mysqli->close();
?>

Result:

1 OT Fix replication Miguel Nieto
2 PZ Help with installation of Percona Cluster Michael Rikmas
3 VK Hardware suggestions Marcos Albe, Michael Rikmas
4 FD Error: no space left Marcos Albe, Michael Rikmas, Miguel Nieto, Valerii Kravchuk
5 AS Help with setup daily backup by Xtrabackup Marcos Albe, Miguel Nieto, Valerii Kravchuk
6 SS Poke sales about Support agreement renewal Marcos Albe
7 FD Add more accounts for customer Miguel Nieto, Valerii Kravchuk
8 PZ Create Hot Fix of Bug 1040735 Marcos Albe, Michael Rikmas
9 OT Query optimisation Marcos Albe, Miguel Nieto
10 OT Prepare custom build for Solaris Miguel Nieto, Valerii Kravchuk
11 PZ explain about Percona Monitoring plugins Valerii Kravchuk
12 SS Prepare access for customer servers for future work Marcos Albe
13 AS Decribe load balancing for pt-online-schema-change Marcos Albe
14 FD Managing deadlocks Michael Rikmas, Valerii Kravchuk
15 OT Suggestions about buffer pool size Marcos Albe, Miguel Nieto

That’s a power of GROUP_CONCAT!



The Performance Zone is brought to you in partnership with AppDynamics.  See Gartner’s latest research on the application performance monitoring landscape and how APM suites are becoming more and more critical to the business.

Topics:

Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}