DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Master SQL Performance Optimization: Step-by-Step Techniques With Case Studies
  • Enhancing Performance: Optimizing Complex MySQL Queries for Large Datasets
  • The Ultimate Guide To Repair MySQL Database
  • SQL Query Performance Tuning in MySQL

Trending

  • A Scalable Framework for Enterprise Salesforce Optimization: Turning Outcomes Into an Operating System
  • RAG Is Not Enough: Advanced Retrieval Architectures Using Vertex AI Search on GCP
  • Dear Micromanager: Your Distrust Has a Job; It’s Just Not the One You’re Doing
  • Why Your Test Automation Is Always Behind the Code And the Architecture That Fixes It
  1. DZone
  2. Data Engineering
  3. Databases
  4. EXPLAIN FORMAT=JSON: Everything About attached_subqueries, optimized_away_subqueries, and materialized_from_subquery

EXPLAIN FORMAT=JSON: Everything About attached_subqueries, optimized_away_subqueries, and materialized_from_subquery

EXPLAIN FORMAT=JSON provides some great details on subqueries optimization. Read on to learn more.

By 
Peter Zaitsev user avatar
Peter Zaitsev
·
Jan. 03, 16 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
4.7K Views

Join the DZone community and get the full member experience.

Join For Free

Orignially written by Sveta Smirnova

EXPLAIN FORMAT=JSON

The regular EXPLAIN command already provides some information about subquery optimization. For example, you can find out if the subquery is dependent or not, and (since version 5.6) if it was materialized:

mysql> explain select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: departments
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: dept_name
      key_len: 42
          ref: NULL
         rows: 9
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: <subquery2>
   partitions: NULL
         type: eq_ref
possible_keys: <auto_key>
          key: <auto_key>
      key_len: 4
          ref: employees.departments.dept_no
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 3. row ***************************
           id: 2
  select_type: MATERIALIZED
        table: dept_manager
   partitions: NULL
         type: ALL
possible_keys: dept_no
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 24
     filtered: 90.00
        Extra: Using where
3 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

However, you can’t find details on exactly how this subquery was materialized. To find out more, use EXPLAIN FORMAT=JSON:

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "16.72"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "departments",
          <skipped>
      },
      {
        "table": {
          "table_name": "<subquery2>",
          "access_type": "eq_ref",
          "key": "<auto_key>",
          "key_length": "4",
          "ref": [
            "employees.departments.dept_no"
          ],
          "rows_examined_per_scan": 1,
          "materialized_from_subquery": {
            "using_temporary_table": true,
            "query_block": {
              "table": {
                "table_name": "dept_manager",
                "access_type": "ALL",
                "possible_keys": [
                  "dept_no"
                ],
                "rows_examined_per_scan": 24,
                "rows_produced_per_join": 21,
                "filtered": "90.00",
                "cost_info": {
                  "read_cost": "1.48",
                  "eval_cost": "4.32",
                  "prefix_cost": "5.80",
                  "data_read_per_join": "345"
                },
                "used_columns": [
                  "dept_no",
                  "to_date"
                ],
                "attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"
              }
            }
          }
        }
      }
    ]
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

We can see here that the new temporary table ("using_temporary_table": true) with internal name <subquery2> was materilized from the subquery (materialized_from_subquery). We also see how this new temporary table was accessed.

We also can see that the member that contains information about the materialized subquery is part of the nested_loop array. Nested loop optimization applies to a regular JOIN command. In this case, it means that the MySQL Server would join the temporary table with an outer query.

Another interesting case is optimized_away_subqueries: this command shows subqueries that were executed only once and were replaced by their result. A regular EXPLAIN command provides no information about such optimization:

mysql> explain select emp_no, salary from salaries order by (select max(salary) from salaries)G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838525
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838525
     filtered: 100.00
        Extra: NULL
2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no`,`employees`.`salaries`.`salary` AS `salary` from `employees`.`salaries` order by (/* select#2 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)

We can see that in select_type: SUBQUERY, the output of SHOW WARNINGS doesn’t give us much insight either.

However, EXPLAIN FORMAT=JSON tells us how this subquery was optimized:

mysql> explain format=json select emp_no, salary from salaries order by (select max(salary) from salaries)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "573505.00"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "salaries",

        <skipped>

      "optimized_away_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "573505.00"
            },
            "table": {
              "table_name": "salaries",
              "access_type": "ALL",
              "rows_examined_per_scan": 2838525,
              "rows_produced_per_join": 2838525,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "5800.00",
                "eval_cost": "567705.00",
                "prefix_cost": "573505.00",
                "data_read_per_join": "43M"
              },
              "used_columns": [
                "salary"
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

After seeing optimized_away_subqueries and "cacheable": true, we know that the subquery was executed only once, and the result was cached and used to resolve the outer query.

Another subquery type that I want to mention in this post is attached_subqueries which covers subqueries that are not converted to JOIN.

mysql> explain format=json select emp_no from salaries where salary > (select avg(salary) from salaries)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "250601.60"
    },
    "table": {
      "table_name": "salaries",
      "access_type": "ALL",
      "rows_examined_per_scan": 1240668,
      "rows_produced_per_join": 413514,
      "filtered": "33.33",
      "cost_info": {
        "read_cost": "167898.67",
        "eval_cost": "82702.92",
        "prefix_cost": "250601.60",
        "data_read_per_join": "6M"
      },
      "used_columns": [
        "emp_no",
        "salary"
      ],
      "attached_condition": "(`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",
      "attached_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "250601.80"
            },
            "table": {
              "table_name": "salaries",
              "access_type": "ALL",
              "rows_examined_per_scan": 1240669,
              "rows_produced_per_join": 1240669,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "2468.00",
                "eval_cost": "248133.80",
                "prefix_cost": "250601.80",
                "data_read_per_join": "18M"
              },
              "used_columns": [
                "salary"
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no` from `employees`.`salaries` where (`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))

For these subqueries, using this option shows if they are dependent and/or cacheable. There is also a query_block member inside the attached_subqueries object which provides all the details on how the query was optimized.

Conclusion: EXPLAIN FORMAT=JSON provides some great details on subqueries optimization.

optimization Database Command (computing) Joins (concurrency library) Insight (email client) POST (HTTP) MySQL Data structure Object (computer science)

Published at DZone with permission of Peter Zaitsev. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Master SQL Performance Optimization: Step-by-Step Techniques With Case Studies
  • Enhancing Performance: Optimizing Complex MySQL Queries for Large Datasets
  • The Ultimate Guide To Repair MySQL Database
  • SQL Query Performance Tuning in MySQL

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook