Over a million developers have joined DZone.

How to Find Duplicate Indexes

· Java Zone

Discover how AppDynamics steps in to upgrade your performance game and prevent your enterprise from these top 10 Java performance problems, brought to you in partnership with AppDynamics.

Josh Berkus at Database Soup wrote a post recently explaining how to find duplicate indexes in SQL. He has written a few queries to help those who need to filter through large amounts of indexes within a database. Berkus says his queries help to minimize the number of false positives that occur.

His first query, which follows, searches for and selects only exact index matches:

-- check for exact matches
WITH index_cols_ord as (
SELECT attrelid, attnum, attname
FROM pg_attribute
JOIN pg_index ON indexrelid = attrelid
WHERE indkey[0] > 0
ORDER BY attrelid, attnum
),
index_col_list AS (
SELECT attrelid,
array_agg(attname) as cols
FROM index_cols_ord
GROUP BY attrelid
),
dup_natts AS (
SELECT indrelid, indexrelid
FROM pg_index as ind
WHERE EXISTS ( SELECT 1
FROM pg_index as ind2
WHERE ind.indrelid = ind2.indrelid
AND ind.indkey = ind2.indkey
AND ind.indexrelid <> ind2.indexrelid
) )
SELECT userdex.schemaname as schema_name,
userdex.relname as table_name,
userdex.indexrelname as index_name,
array_to_string(cols, ', ') as index_cols,
indexdef,
idx_scan as index_scans
FROM pg_stat_user_indexes as userdex
JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid
JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid
JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname
AND userdex.indexrelname = pg_indexes.indexname
ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;

He explains that, while some false positives appear, these results are still useful, as they help eliminate indexes that are practically the same in any useable sense.

His next query looks very much like the first, but this time searches for indexes that contain the contents of another index AND some extra information. While not all of these indexes are duplicates, he finds that many of them are:

-- check for containment
-- i.e. index A contains index B
-- and both share the same first column
-- but they are NOT identical
 
WITH index_cols_ord as (
SELECT attrelid, attnum, attname
FROM pg_attribute
JOIN pg_index ON indexrelid = attrelid
WHERE indkey[0] > 0
ORDER BY attrelid, attnum
),
index_col_list AS (
SELECT attrelid,
array_agg(attname) as cols
FROM index_cols_ord
GROUP BY attrelid
),
dup_natts AS (
SELECT indrelid, indexrelid
FROM pg_index as ind
WHERE EXISTS ( SELECT 1
FROM pg_index as ind2
WHERE ind.indrelid = ind2.indrelid
AND ( ind.indkey @> ind2.indkey
OR ind.indkey <@ ind2.indkey )
AND ind.indkey[0] = ind2.indkey[0]
AND ind.indkey <> ind2.indkey
AND ind.indexrelid <> ind2.indexrelid
) )
SELECT userdex.schemaname as schema_name,
userdex.relname as table_name,
userdex.indexrelname as index_name,
array_to_string(cols, ', ') as index_cols,
indexdef,
idx_scan as index_scans
FROM pg_stat_user_indexes as userdex
JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid
JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid
JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname
AND userdex.indexrelname = pg_indexes.indexname
ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;

Berkus says there are many variations of queries that can be made, depending on what kind of duplicate indexes you are looking for. The piece of code he recommends changing to create variations (like looking for different-ordered columns in indexes) is:

WHERE EXISTS ( SELECT 1
    FROM pg_index as ind2
    WHERE ind.indrelid = ind2.indrelid
    AND ( ind.indkey @> ind2.indkey
     OR ind.indkey <@ ind2.indkey )
    AND ind.indkey[0] = ind2.indkey[0]
    AND ind.indkey <> ind2.indkey
    AND ind.indexrelid <> ind2.indexrelid
)

Berkus goes through a bit more information in his blog post here. Check it out for more useful information.


The Java Zone is brought to you in partnership with AppDynamics. AppDynamics helps you gain the fundamentals behind application performance, and implement best practices so you can proactively analyze and act on performance problems as they arise, and more specifically with your Java applications. Start a Free Trial.

Topics:

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}