DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
Building Scalable Real-Time Apps with AstraDB and Vaadin
Register Now

Trending

  • Security Challenges for Microservice Applications in Multi-Cloud Environments
  • Working on an Unfamiliar Codebase
  • Building and Deploying Microservices With Spring Boot and Docker
  • Auto-Scaling Kinesis Data Streams Applications on Kubernetes
  1. DZone
  2. Data Engineering
  3. Databases
  4. How to Find Duplicate Indexes

How to Find Duplicate Indexes

G. Ryan Spain user avatar by
G. Ryan Spain
·
Sep. 26, 14 · Interview
Like (0)
Save
Tweet
Share
6.64K Views

Join the DZone community and get the full member experience.

Join For Free

Josh Berkus at Database Soup wrote a post recently explaining how to find duplicate indexes in SQL. He has written a few queries to help those who need to filter through large amounts of indexes within a database. Berkus says his queries help to minimize the number of false positives that occur.

His first query, which follows, searches for and selects only exact index matches:

-- check for exact matches
WITH index_cols_ord as (
SELECT attrelid, attnum, attname
FROM pg_attribute
JOIN pg_index ON indexrelid = attrelid
WHERE indkey[0] > 0
ORDER BY attrelid, attnum
),
index_col_list AS (
SELECT attrelid,
array_agg(attname) as cols
FROM index_cols_ord
GROUP BY attrelid
),
dup_natts AS (
SELECT indrelid, indexrelid
FROM pg_index as ind
WHERE EXISTS ( SELECT 1
FROM pg_index as ind2
WHERE ind.indrelid = ind2.indrelid
AND ind.indkey = ind2.indkey
AND ind.indexrelid <> ind2.indexrelid
) )
SELECT userdex.schemaname as schema_name,
userdex.relname as table_name,
userdex.indexrelname as index_name,
array_to_string(cols, ', ') as index_cols,
indexdef,
idx_scan as index_scans
FROM pg_stat_user_indexes as userdex
JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid
JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid
JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname
AND userdex.indexrelname = pg_indexes.indexname
ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;

He explains that, while some false positives appear, these results are still useful, as they help eliminate indexes that are practically the same in any useable sense.

His next query looks very much like the first, but this time searches for indexes that contain the contents of another index AND some extra information. While not all of these indexes are duplicates, he finds that many of them are:

-- check for containment
-- i.e. index A contains index B
-- and both share the same first column
-- but they are NOT identical
 
WITH index_cols_ord as (
SELECT attrelid, attnum, attname
FROM pg_attribute
JOIN pg_index ON indexrelid = attrelid
WHERE indkey[0] > 0
ORDER BY attrelid, attnum
),
index_col_list AS (
SELECT attrelid,
array_agg(attname) as cols
FROM index_cols_ord
GROUP BY attrelid
),
dup_natts AS (
SELECT indrelid, indexrelid
FROM pg_index as ind
WHERE EXISTS ( SELECT 1
FROM pg_index as ind2
WHERE ind.indrelid = ind2.indrelid
AND ( ind.indkey @> ind2.indkey
OR ind.indkey <@ ind2.indkey )
AND ind.indkey[0] = ind2.indkey[0]
AND ind.indkey <> ind2.indkey
AND ind.indexrelid <> ind2.indexrelid
) )
SELECT userdex.schemaname as schema_name,
userdex.relname as table_name,
userdex.indexrelname as index_name,
array_to_string(cols, ', ') as index_cols,
indexdef,
idx_scan as index_scans
FROM pg_stat_user_indexes as userdex
JOIN index_col_list ON index_col_list.attrelid = userdex.indexrelid
JOIN dup_natts ON userdex.indexrelid = dup_natts.indexrelid
JOIN pg_indexes ON userdex.schemaname = pg_indexes.schemaname
AND userdex.indexrelname = pg_indexes.indexname
ORDER BY userdex.schemaname, userdex.relname, cols, userdex.indexrelname;

Berkus says there are many variations of queries that can be made, depending on what kind of duplicate indexes you are looking for. The piece of code he recommends changing to create variations (like looking for different-ordered columns in indexes) is:

WHERE EXISTS ( SELECT 1
    FROM pg_index as ind2
    WHERE ind.indrelid = ind2.indrelid
    AND ( ind.indkey @> ind2.indkey
     OR ind.indkey <@ ind2.indkey )
    AND ind.indkey[0] = ind2.indkey[0]
    AND ind.indkey <> ind2.indkey
    AND ind.indexrelid <> ind2.indexrelid
)

Berkus goes through a bit more information in his blog post here. Check it out for more useful information.


Database

Opinions expressed by DZone contributors are their own.

Trending

  • Security Challenges for Microservice Applications in Multi-Cloud Environments
  • Working on an Unfamiliar Codebase
  • Building and Deploying Microservices With Spring Boot and Docker
  • Auto-Scaling Kinesis Data Streams Applications on Kubernetes

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com

Let's be friends: