Over a million developers have joined DZone.

EXPLAIN FORMAT=JSON: order_by_subqueries, group_by_subqueries details on subqueries in ORDER BY and GROUP BY

EXPLAIN FORMAT=JSON provides details on how subqueries in ORDER BY and GROUP BY clauses are optimized. Read on to learn more.

· Database Zone

Build fast, scale big with MongoDB Atlas, a hosted service for the leading NoSQL database. Try it now! Brought to you in partnership with MongoDB.

EXPLAIN FORMAT

Orignially written by Sveta Smirnova

Here is another post in the EXPLAIN FORMAT=JSON is Cool! series! In this post, we’ll discuss how the EXPLAIN FORMAT=JSON provides optimization details for  ORDER BY and GROUP BY operations in conjunction with  order_by_subqueries and group_by_subqueries. 

EXPLAIN FORMAT=JSON can print details on how a subquery in ORDER BY is optimized:

mysql> explain format=json select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select emp_no limit 1)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "60833.60"
    },
    "ordering_operation": {
      "using_filesort": true,
      "table": {
        "table_name": "employees",
        "access_type": "ALL",
        "rows_examined_per_scan": 299843,
        "rows_produced_per_join": 299843,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "865.00",
          "eval_cost": "59968.60",
          "prefix_cost": "60833.60",
          "data_read_per_join": "13M"
        },
        "used_columns": [
          "emp_no",
          "first_name",
          "last_name"
        ]
      },
      "order_by_subqueries": [
        {
          "dependent": true,
          "cacheable": false,
          "query_block": {
            "select_id": 2,
            "message": "No tables used"
          }
        }
      ]
    }
  }
}
1 row in set, 2 warnings (0.00 sec)
Note (Code 1276): Field or reference 'employees.employees.emp_no' of SELECT #2 was resolved in SELECT #1
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`employees`.`emp_no` limit 1)

The above code shows member ordering_operation of query_block  (which includes the  order_by_subqueries array) with information on how the subquery in ORDER BY  was optimized.

This is a simple example. In real life you can have larger subqueries in the  ORDER BY  clause. For example, take this more complicated and slightly crazy query:

select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)

Run a regular EXPLAIN on it. If we imagine this is a regular subquery, we won’t know if it can be cached or would be executed for each row sorted.

mysql> explain  select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: employees
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299843
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: dept_emp
   partitions: NULL
         type: index
possible_keys: PRIMARY,emp_no,dept_no
          key: dept_no
      key_len: 4
          ref: NULL
         rows: 331215
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 2
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: emp_no
      key_len: 4
          ref: employees.dept_emp.emp_no
         rows: 10
     filtered: 100.00
        Extra: Using index
3 rows in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`dept_emp`.`dept_no` AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`)

EXPLAIN FORMAT=JSON  provides a completely different picture:

mysql> explain format=json select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "60833.60"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "employees",
        "access_type": "ALL",
        "rows_examined_per_scan": 299843,
        "rows_produced_per_join": 299843,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "865.00",
          "eval_cost": "59968.60",
          "prefix_cost": "60833.60",
          "data_read_per_join": "13M"
        },
        "used_columns": [
          "emp_no",
          "first_name",
          "last_name"
        ]
      },
      "optimized_away_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "1082124.21"
            },
            "grouping_operation": {
              "using_filesort": false,
              "nested_loop": [
                {
                  "table": {
                    "table_name": "dept_emp",
                    "access_type": "index",
                    "possible_keys": [
                      "PRIMARY",
                      "emp_no",
                      "dept_no"
                    ],
                    "key": "dept_no",
                    "used_key_parts": [
                      "dept_no"
                    ],
                    "key_length": "4",
                    "rows_examined_per_scan": 331215,
                    "rows_produced_per_join": 331215,
                    "filtered": "100.00",
                    "using_index": true,
                    "cost_info": {
                      "read_cost": "673.00",
                      "eval_cost": "66243.00",
                      "prefix_cost": "66916.00",
                      "data_read_per_join": "5M"
                    },
                    "used_columns": [
                      "emp_no",
                      "dept_no"
                    ]
                  }
                },
                {
                  "table": {
                    "table_name": "salaries",
                    "access_type": "ref",
                    "possible_keys": [
                      "PRIMARY",
                      "emp_no"
                    ],
                    "key": "emp_no",
                    "used_key_parts": [
                      "emp_no"
                    ],
                    "key_length": "4",
                    "ref": [
                      "employees.dept_emp.emp_no"
                    ],
                    "rows_examined_per_scan": 10,
                    "rows_produced_per_join": 3399374,
                    "filtered": "100.00",
                    "using_index": true,
                    "cost_info": {
                      "read_cost": "335333.33",
                      "eval_cost": "679874.87",
                      "prefix_cost": "1082124.21",
                      "data_read_per_join": "51M"
                    },
                    "used_columns": [
                      "emp_no",
                      "from_date"
                    ]
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`dept_emp`.`dept_no` AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`)

We see that the subquery was optimized away: member optimized_away_subqueries exists, but there is no order_by_subqueries in the ordering_operation object. We can also see that the subquery was cached: "cacheable": true.

EXPLAIN FORMAT=JSON also provides information about subqueries in the GROUP BY clause. It uses the group_by_subqueries array in the grouping_operation member for this purpose.

mysql> explain format=json select count(emp_no) from salaries group by salary > ALL (select s/c as avg_salary from (select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no) t)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3412037.60"
    },
    "grouping_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "cost_info": {
        "sort_cost": "2838638.00"
      },
      "table": {
        "table_name": "salaries",
        "access_type": "ALL",
        "rows_examined_per_scan": 2838638,
        "rows_produced_per_join": 2838638,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "5672.00",
          "eval_cost": "567727.60",
          "prefix_cost": "573399.60",
          "data_read_per_join": "43M"
        },
        "used_columns": [
          "emp_no",
          "salary",
          "from_date"
        ]
      },
      "group_by_subqueries": [
        {
          "dependent": true,
          "cacheable": false,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "881731.00"
            },
            "table": {
              "table_name": "t",
              "access_type": "ALL",
              "rows_examined_per_scan": 3526884,
              "rows_produced_per_join": 3526884,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "176354.20",
                "eval_cost": "705376.80",
                "prefix_cost": "881731.00",
                "data_read_per_join": "134M"
              },
              "used_columns": [
                "dept_no",
                "s",
                "c"
              ],
              "attached_condition": "((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`)))",
              "materialized_from_subquery": {
                "using_temporary_table": true,
                "dependent": false,
                "cacheable": true,
                "query_block": {
                  "select_id": 3,
                  "cost_info": {
                    "query_cost": "1106758.94"
                  },
                  "grouping_operation": {
                    "using_filesort": false,
                    "nested_loop": [
                      {
                        "table": {
                          "table_name": "dept_emp",
                          "access_type": "index",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no",
                            "dept_no"
                          ],
                          "key": "dept_no",
                          "used_key_parts": [
                            "dept_no"
                          ],
                          "key_length": "4",
                          "rows_examined_per_scan": 331215,
                          "rows_produced_per_join": 331215,
                          "filtered": "100.00",
                          "using_index": true,
                          "cost_info": {
                            "read_cost": "673.00",
                            "eval_cost": "66243.00",
                            "prefix_cost": "66916.00",
                            "data_read_per_join": "5M"
                          },
                          "used_columns": [
                            "emp_no",
                            "dept_no"
                          ]
                        }
                      },
                      {
                        "table": {
                          "table_name": "salaries",
                          "access_type": "ref",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no"
                          ],
                          "key": "PRIMARY",
                          "used_key_parts": [
                            "emp_no"
                          ],
                          "key_length": "4",
                          "ref": [
                            "employees.dept_emp.emp_no"
                          ],
                          "rows_examined_per_scan": 10,
                          "rows_produced_per_join": 3526884,
                          "filtered": "100.00",
                          "cost_info": {
                            "read_cost": "334466.14",
                            "eval_cost": "705376.80",
                            "prefix_cost": "1106758.95",
                            "data_read_per_join": "53M"
                          },
                          "used_columns": [
                            "emp_no",
                            "salary",
                            "from_date"
                          ]
                        }
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.01 sec)

Note (Code 1003): /* select#1 */ select count(`employees`.`salaries`.`emp_no`) AS `count(emp_no)` from `employees`.`salaries` group by <not>(<in_optimizer>(`employees`.`salaries`.`salary`,<exists>(/* select#2 */ select 1 from (/* select#3 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no`,sum(`employees`.`salaries`.`salary`) AS `s`,count(`employees`.`salaries`.`emp_no`) AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`) `t` where ((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`))) having <is_not_null_test>((`t`.`s` / `t`.`c`)))))

Again, this output gives a clear view of query optimization: subquery in GROUP BY itself cannot be optimized, cached, or converted into temporary table, but the subquery inside the subquery [select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no] could be materialized into a temporary table and cached.

A regular EXPLAIN command does not provide such details:

mysql> explain select count(emp_no) from salaries group by salary > ALL (select s/c as avg_salary from (select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no) t)G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838638
     filtered: 100.00
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: <derived3>
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3526884
     filtered: 100.00
        Extra: Using where
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: dept_emp
   partitions: NULL
         type: index
possible_keys: PRIMARY,emp_no,dept_no
          key: dept_no
      key_len: 4
          ref: NULL
         rows: 331215
     filtered: 100.00
        Extra: Using index
*************************** 4. row ***************************
           id: 3
  select_type: DERIVED
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: PRIMARY
      key_len: 4
          ref: employees.dept_emp.emp_no
         rows: 10
     filtered: 100.00
        Extra: NULL
4 rows in set, 1 warning (0.01 sec)
Note (Code 1003): /* select#1 */ select count(`employees`.`salaries`.`emp_no`) AS `count(emp_no)` from `employees`.`salaries` group by <not>(<in_optimizer>(`employees`.`salaries`.`salary`,<exists>(/* select#2 */ select 1 from (/* select#3 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no`,sum(`employees`.`salaries`.`salary`) AS `s`,count(`employees`.`salaries`.`emp_no`) AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`) `t` where ((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`))) having <is_not_null_test>((`t`.`s` / `t`.`c`)))))

Most importantly, we cannot guess from the output if the DERIVED subquery can be cached.

Conlcusion: EXPLAIN FORMAT=JSON provides details on how subqueries in ORDER BY and GROUP BY clauses are optimized.

Now it's easier than ever to get started with MongoDB, the database that allows startups and enterprises alike to rapidly build planet-scale apps. Introducing MongoDB Atlas, the official hosted service for the database on AWS. Try it now! Brought to you in partnership with MongoDB.

Topics:
sql ,json ,mysql

Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}