How to Use Automatic Index Recommendations in PostgreSQL

Use this tutorial to learn about using automatic recommendations of indexes for specific queries, all by using just three extensions.

Sahil Aggarwal

Jun. 14, 21 · Tutorial

Likes (6)

Comment

Save

15.0K Views

In our last blog, we learned about the Need and Usage of Hypothetical Indexes in PostgreSQL. We can now "check" easily in a live environment, determine if some particular index will be helpful or not, and figure out how we get to know which index to test. To do this, you'll also need in-depth knowledge of indexing and experience in Postgresql. However, in PostgreSQL, we can get an automatic recommendation of indexes for specific queries by using three extensions hypog, pg_stat_statements, and pg_qualstats.

Let’s explore the practical uses of this feature in Postgres!

I'm experimenting with Postgres10 installed on Centos7.

Installation

Install hypog here.
Install pgqualstats using the following command:

    Java
   
   yum install pg_qualstats10.x86_64

Change the following in PostgreSQL.conf and restart PostgreSQL

    Java
   
   shared_preload_libraries = 'pg_stat_statements, pg_qualstats'

Create the following extensions:

    Java
   
 

   testdb=# CREATE EXTENSION hypopg ;
CREATE EXTENSION
testdb=# CREATE EXTENSION pg_stat_statements ;
CREATE EXTENSION
testdb=# CREATE EXTENSION pg_qualstats;
CREATE EXTENSION
  

Set the sample rate of pgqual stats to 1 in PostgreSQL.conf. This rate defines how frequently to monitor and analyze the queries. Value '1' represents that keep track of all queries:

    Java
   
   pg_qualstats.sample_rate = '1'

Create the function that will be used to detect usable indexes:

    Java
   
 

   CREATE OR REPLACE FUNCTION find_usable_indexes()
RETURNS VOID AS
$$
DECLARE
    l_queries     record;
    l_querytext     text;
    l_idx_def       text;
    l_bef_exp       text;
    l_after_exp     text;
    hypo_idx      record;
    l_attr        record;
    /* l_err       int; */
BEGIN
    CREATE TABLE IF NOT EXISTS public.idx_recommendations (queryid bigint, 
    query text, current_plan jsonb, recmnded_index text, hypo_plan jsonb);
    FOR l_queries IN
    SELECT t.relid, t.relname, t.queryid, t.attnames, t.attnums, 
    pg_qualstats_example_query(t.queryid) as query
      FROM 
        ( 
         SELECT qs.relid::regclass AS relname, qs.relid AS relid, qs.queryid, 
         string_agg(DISTINCT attnames.attnames,',') AS attnames, qs.attnums
         FROM pg_qualstats_all qs
         JOIN pg_qualstats q ON q.queryid = qs.queryid
         JOIN pg_stat_statements ps ON q.queryid = ps.queryid
         JOIN pg_amop amop ON amop.amopopr = qs.opno
         JOIN pg_am ON amop.amopmethod = pg_am.oid,
         LATERAL 
              ( 
               SELECT pg_attribute.attname AS attnames
               FROM pg_attribute
               JOIN unnest(qs.attnums) a(a) ON a.a = pg_attribute.attnum 
               AND pg_attribute.attrelid = qs.relid
               ORDER BY pg_attribute.attnum) attnames,     
         LATERAL unnest(qs.attnums) attnum(attnum)
               WHERE NOT 
               (
                EXISTS 
                      ( 
                       SELECT 1
                       FROM pg_index i
                       WHERE i.indrelid = qs.relid AND 
                       (arraycontains((i.indkey::integer[])[0:array_length(qs.attnums, 1) - 1], 
                        qs.attnums::integer[]) OR arraycontains(qs.attnums::integer[], 
                        (i.indkey::integer[])[0:array_length(i.indkey, 1) + 1]) AND i.indisunique)))
                       GROUP BY qs.relid, qs.queryid, qs.qualnodeid, qs.attnums) t
                       GROUP BY t.relid, t.relname, t.queryid, t.attnames, t.attnums                   
    LOOP
        /* RAISE NOTICE '% : is queryid',l_queries.queryid; */
        execute 'explain (FORMAT JSON) '||l_queries.query INTO l_bef_exp;
        execute 'select hypopg_reset()';
        execute 'SELECT indexrelid,indexname FROM hypopg_create_index(''CREATE INDEX on '||l_queries.relname||'('||l_queries.attnames||')'')' INTO hypo_idx;      
        execute 'explain (FORMAT JSON) '||l_queries.query INTO l_after_exp;
        execute 'select hypopg_get_indexdef('||hypo_idx.indexrelid||')' INTO l_idx_def;
        INSERT INTO public.idx_recommendations (queryid,query,current_plan,recmnded_index,hypo_plan) 
        VALUES (l_queries.queryid,l_querytext,l_bef_exp::jsonb,l_idx_def,l_after_exp::jsonb);        
    END LOOP;    
        execute 'select hypopg_reset()';
END;
$$ LANGUAGE plpgsql;
  

Use

Now, Let’s make a table with 10 Crores rows as follows:

    Java
   
   testdb=# create table orders as select s as orderno, md5(random()::text) as orderitem , now() as order_created from generate_Series(1,100000000) s;
SELECT 100000000

Next, let's run a query on which we want to check if need to make an index:

    Java
   
 

   testdb=# select * from orders where orderno = 80000  ;
-[ RECORD 1 ]-+---------------------------------
orderno       | 80000
orderitem     | 03b41c2f32d99e9a597010608946c4c6
order_created | 2021-05-22 17:52:21.435936+05:30
  

Now, run the following queries to find out which indexes are recommended by this extension and what the improvement percentage is by applying these indexes hypothetically:

    Java
   
   testdb=#           select find_usable_indexes();
 find_usable_indexes 
---------------------
 
(1 row)

testdb=# select b.query, a.recmnded_index,round((((a.current_plan->0->'Plan'->>'Total Cost')::numeric-(hypo_plan->0->'Plan'->>'Total Cost')::numeric)*100/(a.current_plan->0->'Plan'->>'Total Cost')::numeric),2) as percent_improvd FROM idx_recommendations a JOIN pg_stat_statements b ON a.queryid = b.queryid WHERE round((((current_plan->0->'Plan'->>'Total Cost')::numeric-(hypo_plan->0->'Plan'->>'Total Cost')::numeric)*100/(current_plan->0->'Plan'->>'Total Cost')::numeric),2) > 0 order by 3 desc ;
                        query                        |                          recmnded_index                          | percent_improvd 
-----------------------------------------------------+------------------------------------------------------------------+-----------------
 select * from orders where orderno = $1             | CREATE INDEX ON public.orders USING btree (orderno)              |          100.00

The above analysis was internally done by creating the indexes hypothetically, not by making real indexes.

Please note here that you do not fully rely on the automatic index recommendation. Yes, we have no doubt it is a very very useful feature, but please also consider logically why these recommended indexes are useful and whether you should really create them or not.

You can read the PostgreSQL Index Tutorial Series for a basic in-depth understanding of indexes in PostgreSQL.

You can now try the explained feature with more complex queries and comment on this article explaining your result with your queries. Let’s experiment and comment your thoughts below.

PostgreSQL Database

Published at DZone with permission of Sahil Aggarwal. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

Trending

How to Use Automatic Index Recommendations in PostgreSQL

Use this tutorial to learn about using automatic recommendations of indexes for specific queries, all by using just three extensions.

Installation

Use

Related

Partner Resources