How to Find Duplicate Indexes
Join the DZone community and get the full member experience.
Join For FreeJosh Berkus at Database Soup wrote a post recently explaining how to find duplicate indexes in SQL. He has written a few queries to help those who need to filter through large amounts of indexes within a database. Berkus says his queries help to minimize the number of false positives that occur.
His first query, which follows, searches for and selects only exact index matches:
-- check for exact matches WITH index_cols_ord as ( SELECT attrelid, attnum, attname FROM pg_attribute JOIN pg_index ON indexrelid = attrelid WHERE indkey[0] > 0 ORDER BY attrelid, attnum ), index_col_list AS ( SELECT attrelid, array_agg(attname) as cols FROM index_cols_ord GROUP BY attrelid ), dup_natts AS ( SELECT indrelid, indexrelid FROM pg_index as ind WHERE EXISTS ( SELECT 1 FROM pg_index as ind2 WHERE ind.indrelid = ind2.indrelid AND ind.indkey = ind2.indkey AND ind.indexrelid <> ind2.indexrelid ) ) SELECT userdex.schemaname as schema_name, userdex.relname as table_name, userdex.indexrelname as index_name, array_to_string(cols, ', ') as index_cols, indexdef, idx_scan as index_scans FROM pg_stat_user_indexes as userdex JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname AND userdex.indexrelname = pg_indexes.indexname ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;
He explains that, while some false positives appear, these results are still useful, as they help eliminate indexes that are practically the same in any useable sense.
His next query looks very much like the first, but this time searches for indexes that contain the contents of another index AND some extra information. While not all of these indexes are duplicates, he finds that many of them are:
-- check for containment -- i.e. index A contains index B -- and both share the same first column -- but they are NOT identical WITH index_cols_ord as ( SELECT attrelid, attnum, attname FROM pg_attribute JOIN pg_index ON indexrelid = attrelid WHERE indkey[0] > 0 ORDER BY attrelid, attnum ), index_col_list AS ( SELECT attrelid, array_agg(attname) as cols FROM index_cols_ord GROUP BY attrelid ), dup_natts AS ( SELECT indrelid, indexrelid FROM pg_index as ind WHERE EXISTS ( SELECT 1 FROM pg_index as ind2 WHERE ind.indrelid = ind2.indrelid AND ( ind.indkey @> ind2.indkey OR ind.indkey <@ ind2.indkey ) AND ind.indkey[0] = ind2.indkey[0] AND ind.indkey <> ind2.indkey AND ind.indexrelid <> ind2.indexrelid ) ) SELECT userdex.schemaname as schema_name, userdex.relname as table_name, userdex.indexrelname as index_name, array_to_string(cols, ', ') as index_cols, indexdef, idx_scan as index_scans FROM pg_stat_user_indexes as userdex JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname AND userdex.indexrelname = pg_indexes.indexname ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;
Berkus says there are many variations of queries that can be made, depending on what kind of duplicate indexes you are looking for. The piece of code he recommends changing to create variations (like looking for different-ordered columns in indexes) is:
WHERE EXISTS ( SELECT 1 FROM pg_index as ind2 WHERE ind.indrelid = ind2.indrelid AND ( ind.indkey @> ind2.indkey OR ind.indkey <@ ind2.indkey ) AND ind.indkey[0] = ind2.indkey[0] AND ind.indkey <> ind2.indkey AND ind.indexrelid <> ind2.indexrelid )
Berkus goes through a bit more information in his blog post here. Check it out for more useful information.
Opinions expressed by DZone contributors are their own.
Trending
-
Security Challenges for Microservice Applications in Multi-Cloud Environments
-
Working on an Unfamiliar Codebase
-
Building and Deploying Microservices With Spring Boot and Docker
-
Auto-Scaling Kinesis Data Streams Applications on Kubernetes
Comments