Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

How to Find Duplicate Indexes

DZone's Guide to

How to Find Duplicate Indexes

· Java Zone ·
Free Resource

Verify, standardize, and correct the Big 4 + more– name, email, phone and global addresses – try our Data Quality APIs now at Melissa Developer Portal!

Josh Berkus at Database Soup wrote a post recently explaining how to find duplicate indexes in SQL. He has written a few queries to help those who need to filter through large amounts of indexes within a database. Berkus says his queries help to minimize the number of false positives that occur.

His first query, which follows, searches for and selects only exact index matches:

-- check for exact matches
WITH index_cols_ord as (
SELECT attrelid, attnum, attname
FROM pg_attribute
JOIN pg_index ON indexrelid = attrelid
WHERE indkey[0] > 0
ORDER BY attrelid, attnum
),
index_col_list AS (
SELECT attrelid,
array_agg(attname) as cols
FROM index_cols_ord
GROUP BY attrelid
),
dup_natts AS (
SELECT indrelid, indexrelid
FROM pg_index as ind
WHERE EXISTS ( SELECT 1
FROM pg_index as ind2
WHERE ind.indrelid = ind2.indrelid
AND ind.indkey = ind2.indkey
AND ind.indexrelid <> ind2.indexrelid
) )
SELECT userdex.schemaname as schema_name,
userdex.relname as table_name,
userdex.indexrelname as index_name,
array_to_string(cols, ', ') as index_cols,
indexdef,
idx_scan as index_scans
FROM pg_stat_user_indexes as userdex
JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid
JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid
JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname
AND userdex.indexrelname = pg_indexes.indexname
ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;

He explains that, while some false positives appear, these results are still useful, as they help eliminate indexes that are practically the same in any useable sense.

His next query looks very much like the first, but this time searches for indexes that contain the contents of another index AND some extra information. While not all of these indexes are duplicates, he finds that many of them are:

-- check for containment
-- i.e. index A contains index B
-- and both share the same first column
-- but they are NOT identical
 
WITH index_cols_ord as (
SELECT attrelid, attnum, attname
FROM pg_attribute
JOIN pg_index ON indexrelid = attrelid
WHERE indkey[0] > 0
ORDER BY attrelid, attnum
),
index_col_list AS (
SELECT attrelid,
array_agg(attname) as cols
FROM index_cols_ord
GROUP BY attrelid
),
dup_natts AS (
SELECT indrelid, indexrelid
FROM pg_index as ind
WHERE EXISTS ( SELECT 1
FROM pg_index as ind2
WHERE ind.indrelid = ind2.indrelid
AND ( ind.indkey @> ind2.indkey
OR ind.indkey <@ ind2.indkey )
AND ind.indkey[0] = ind2.indkey[0]
AND ind.indkey <> ind2.indkey
AND ind.indexrelid <> ind2.indexrelid
) )
SELECT userdex.schemaname as schema_name,
userdex.relname as table_name,
userdex.indexrelname as index_name,
array_to_string(cols, ', ') as index_cols,
indexdef,
idx_scan as index_scans
FROM pg_stat_user_indexes as userdex
JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid
JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid
JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname
AND userdex.indexrelname = pg_indexes.indexname
ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;

Berkus says there are many variations of queries that can be made, depending on what kind of duplicate indexes you are looking for. The piece of code he recommends changing to create variations (like looking for different-ordered columns in indexes) is:

WHERE EXISTS ( SELECT 1
    FROM pg_index as ind2
    WHERE ind.indrelid = ind2.indrelid
    AND ( ind.indkey @> ind2.indkey
     OR ind.indkey <@ ind2.indkey )
    AND ind.indkey[0] = ind2.indkey[0]
    AND ind.indkey <> ind2.indkey
    AND ind.indexrelid <> ind2.indexrelid
)

Berkus goes through a bit more information in his blog post here. Check it out for more useful information.


Developers! Quickly and easily gain access to the tools and information you need! Explore, test and combine our data quality APIs at Melissa Developer Portal – home to tools that save time and boost revenue. 

Topics:

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}