Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

EXPLAIN FORMAT=JSON: Everything About attached_subqueries, optimized_away_subqueries, and materialized_from_subquery

DZone's Guide to

EXPLAIN FORMAT=JSON: Everything About attached_subqueries, optimized_away_subqueries, and materialized_from_subquery

EXPLAIN FORMAT=JSON provides some great details on subqueries optimization. Read on to learn more.

· Database Zone ·
Free Resource

RavenDB vs MongoDB: Which is Better? This White Paper compares the two leading NoSQL Document Databases on 9 features to find out which is the best solution for your next project.  

Orignially written by Sveta Smirnova

EXPLAIN FORMAT=JSON

The regular EXPLAIN command already provides some information about subquery optimization. For example, you can find out if the subquery is dependent or not, and (since version 5.6) if it was materialized:

mysql> explain select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: departments
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: dept_name
      key_len: 42
          ref: NULL
         rows: 9
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: <subquery2>
   partitions: NULL
         type: eq_ref
possible_keys: <auto_key>
          key: <auto_key>
      key_len: 4
          ref: employees.departments.dept_no
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 3. row ***************************
           id: 2
  select_type: MATERIALIZED
        table: dept_manager
   partitions: NULL
         type: ALL
possible_keys: dept_no
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 24
     filtered: 90.00
        Extra: Using where
3 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

However, you can’t find details on exactly how this subquery was materialized. To find out more, use EXPLAIN FORMAT=JSON:

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "16.72"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "departments",
          <skipped>
      },
      {
        "table": {
          "table_name": "<subquery2>",
          "access_type": "eq_ref",
          "key": "<auto_key>",
          "key_length": "4",
          "ref": [
            "employees.departments.dept_no"
          ],
          "rows_examined_per_scan": 1,
          "materialized_from_subquery": {
            "using_temporary_table": true,
            "query_block": {
              "table": {
                "table_name": "dept_manager",
                "access_type": "ALL",
                "possible_keys": [
                  "dept_no"
                ],
                "rows_examined_per_scan": 24,
                "rows_produced_per_join": 21,
                "filtered": "90.00",
                "cost_info": {
                  "read_cost": "1.48",
                  "eval_cost": "4.32",
                  "prefix_cost": "5.80",
                  "data_read_per_join": "345"
                },
                "used_columns": [
                  "dept_no",
                  "to_date"
                ],
                "attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"
              }
            }
          }
        }
      }
    ]
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

We can see here that the new temporary table ("using_temporary_table": true) with internal name <subquery2> was materilized from the subquery (materialized_from_subquery). We also see how this new temporary table was accessed.

We also can see that the member that contains information about the materialized subquery is part of the nested_loop array. Nested loop optimization applies to a regular JOIN command. In this case, it means that the MySQL Server would join the temporary table with an outer query.

Another interesting case is optimized_away_subqueries: this command shows subqueries that were executed only once and were replaced by their result. A regular EXPLAIN command provides no information about such optimization:

mysql> explain select emp_no, salary from salaries order by (select max(salary) from salaries)G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838525
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838525
     filtered: 100.00
        Extra: NULL
2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no`,`employees`.`salaries`.`salary` AS `salary` from `employees`.`salaries` order by (/* select#2 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)

We can see that in select_type: SUBQUERY, the output of SHOW WARNINGS doesn’t give us much insight either.

However, EXPLAIN FORMAT=JSON tells us how this subquery was optimized:

mysql> explain format=json select emp_no, salary from salaries order by (select max(salary) from salaries)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "573505.00"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "salaries",

        <skipped>

      "optimized_away_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "573505.00"
            },
            "table": {
              "table_name": "salaries",
              "access_type": "ALL",
              "rows_examined_per_scan": 2838525,
              "rows_produced_per_join": 2838525,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "5800.00",
                "eval_cost": "567705.00",
                "prefix_cost": "573505.00",
                "data_read_per_join": "43M"
              },
              "used_columns": [
                "salary"
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

After seeing optimized_away_subqueries and "cacheable": true, we know that the subquery was executed only once, and the result was cached and used to resolve the outer query.

Another subquery type that I want to mention in this post is attached_subqueries which covers subqueries that are not converted to JOIN.

mysql> explain format=json select emp_no from salaries where salary > (select avg(salary) from salaries)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "250601.60"
    },
    "table": {
      "table_name": "salaries",
      "access_type": "ALL",
      "rows_examined_per_scan": 1240668,
      "rows_produced_per_join": 413514,
      "filtered": "33.33",
      "cost_info": {
        "read_cost": "167898.67",
        "eval_cost": "82702.92",
        "prefix_cost": "250601.60",
        "data_read_per_join": "6M"
      },
      "used_columns": [
        "emp_no",
        "salary"
      ],
      "attached_condition": "(`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",
      "attached_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "250601.80"
            },
            "table": {
              "table_name": "salaries",
              "access_type": "ALL",
              "rows_examined_per_scan": 1240669,
              "rows_produced_per_join": 1240669,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "2468.00",
                "eval_cost": "248133.80",
                "prefix_cost": "250601.80",
                "data_read_per_join": "18M"
              },
              "used_columns": [
                "salary"
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no` from `employees`.`salaries` where (`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))

For these subqueries, using this option shows if they are dependent and/or cacheable. There is also a query_block member inside the attached_subqueries object which provides all the details on how the query was optimized.

Conclusion: EXPLAIN FORMAT=JSON provides some great details on subqueries optimization.

Get comfortable using NoSQL in a free, self-directed learning course provided by RavenDB. Learn to create fully-functional real-world programs on NoSQL Databases. Register today.

Topics:
percona ,database ,json

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}