DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • The Generic Way To Convert Between Java and PostgreSQL Enums
  • Generate Random Test Data in PostgreSQL
  • Database Connection Pooling at Scale: PgBouncer + Multi-Tenant Postgres (10K Concurrent Connections)
  • Why PostgreSQL Vacuum Matters More Than You Think

Trending

  • Jakarta EE 12: Entering the Data Age of Enterprise Java
  • Alternative Structured Concurrency
  • RAG Is Not Enough: Advanced Retrieval Architectures Using Vertex AI Search on GCP
  • Mocking Kafka for Local Spring Development
  1. DZone
  2. Data Engineering
  3. Databases
  4. What Are Hypothetical Indexes in PostgreSQL?

What Are Hypothetical Indexes in PostgreSQL?

Explore why you might need hypothetical indexes in PostgreSQL and how to use them,

By 
Sahil Aggarwal user avatar
Sahil Aggarwal
·
Jun. 01, 21 · Tutorial
Likes (8)
Comment
Save
Tweet
Share
9.4K Views

Join the DZone community and get the full member experience.

Join For Free

As the name suggests, "hypothetical indexes" are not real indexes. They are virtual indexes that PostgreSQL query planner does not consider when running queries. 

So when and where are these Hypothetical Indexes useful?

First, let’s discuss a scenario in which we have a large table that is currently in the production environment. We need to make some indexes on live DB, and we are not sure whether that index will be useful or not. We don’t even know if by making that index, our production environment may be down!

The solution:

  • Let's ignore the risk and make the index on the live table. Why?
    • It will take lots of time depending on the data present.
    • Live queries may be affected badly if we are not sure if the index we are making will increase or decrease the cost.
    • We also do not know the size of the index may be too large, which can again impact the production database server. 
  • Replicate the production database to the local dev environment and apply all the hit. Then, apply it at the production environment. It seems like a very safe and effective approach in almost all cases, but this will often take too much time in setting up and testing.  
  • Hypothetical Indexes.This functionality will create imaginary indexes, not real indexes. However, there are some things to note about these indexes:
    • It creates an index in our connection’s private memory. None of the catalog tables nor the original tables are actually touched. 
    • The only way to see if we can benefit from that index is by running an EXPLAIN <QUERY>.
    • If you wish to run an EXPLAIN ANALYZE that runs that SQL and provides the run time stats, it would not be able to use that hypothetical index. This is because a hypothetical index does not exist in reality.
    • Currently, these indexes will work on BTREE ONLY. However, you can try if it works on other types of indexes.

Uses of Hypothetical Indexes

Installation 

I am using PostgreSQL10 on CentOS7. 

Download hypopg by the following command: 

Java
 
Wget  https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/hypopg_10-1.1.4-1.rhel7.x86_64.rpm


Then, install it on CentOS7:

Java
 
yum install hypopg_10-1.1.4-1.rhel7.x86_64.rpm


Now, create extension using the following query: 

Java
 
testdb=# CREATE EXTENSION hypopg ;
CREATE EXTENSION


On Creating extension following functions will be created: 

Java
 
testdb=#  select proname from pg_proc where proname ilike '%hyp%';
       proname        
----------------------
 hypopg_reset_index
 hypopg_reset
 hypopg_create_index
 hypopg_drop_index
 hypopg
 hypopg_list_indexes
 hypopg_relation_size
 hypopg_get_indexdef


Usage 

Now let’s make a table with 10 Crores rows as follows:

Java
 
testdb=# create table orders as select s as orderno, md5(random()::text) as orderitem , now() as order_created from generate_Series(1,100000000) s;
SELECT 100000000


Next, check the COST of a query by running explain:

Java
 
testdb=# explain select * from orders where orderno > 80000  order by order_created desc  limit 100 ;
                                   QUERY PLAN                                   
--------------------------------------------------------------------------------
 Limit  (cost=3600088.98..3600089.23 rows=100 width=44)
   ->  Sort  (cost=3600088.98..3688095.27 rows=35202513 width=44)
         Sort Key: order_created DESC
         ->  Seq Scan on orders  (cost=0.00..2254674.25 rows=35202513 width=44)
               Filter: (orderno > 80000)
(5 rows)


Now, create the Hypothetical Index:

Java
 
testdb=# SELECT * FROM hypopg_create_index('create index order_created_idx on orders(order_created)');
 indexrelid |             indexname             
------------+-----------------------------------
      24797 | <24797>btree_orders_order_created
(1 row)


Next, repeat Explain to check if the above index may be useful or not:

Java
 
testdb=# explain select * from orders where orderno > 80000  order by order_created desc  limit 100 ;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.07..4.29 rows=100 width=45)
   ->  Index Scan Backward using "<24797>btree_orders_order_created" on orders  (cost=0.07..4215496.19 rows=99917459 width=45)
         Filter: (orderno > 80000)
(3 rows)


Now we can clearly see the difference in cost and can also see that planner is using a newly created hypothetical index.

We can also Drop the index as follows:

Java
 
testdb=# select * from hypopg_drop_index(24797);
 hypopg_drop_index 
-------------------
 t
(1 row)


Then, check the estimated size of the index created virtually as follows:

Java
 
testdb=# SELECT * FROM hypopg_create_index('create index order_created_idx on orders(order_created)');
 indexrelid |             indexname             
------------+-----------------------------------
      24798 | <24798>btree_orders_order_created
(1 row)

testdb=# select * from pg_size_pretty(hypopg_relation_size(24798));
 pg_size_pretty 
----------------
 2990 MB
(1 row)


Now, let's create an actual index and see what will be the actual size:

Java
 
testdb=# create index order_created_idx on orders(order_created);
CREATE INDEX
testdb=# \di+ order_created_idx
                               List of relations
 Schema |       Name        | Type  |  Owner   | Table  |  Size   | Description 
--------+-------------------+-------+----------+--------+---------+-------------
 public | order_created_idx | index | postgres | orders | 2142 MB | 
(1 row)


As seen, the estimated and actual size is comparable.

I hope it clears the usage of the hypothetical indexes in PostgreSQL. In one of our blogs, we learned about why the index is not working and also how to check on which tables the index needed.

In our future blogs, we will share how you will get to know the exact index you need to make in the database.

Stay tuned to hello worlds.

Database PostgreSQL Java (programming language)

Published at DZone with permission of Sahil Aggarwal. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • The Generic Way To Convert Between Java and PostgreSQL Enums
  • Generate Random Test Data in PostgreSQL
  • Database Connection Pooling at Scale: PgBouncer + Multi-Tenant Postgres (10K Concurrent Connections)
  • Why PostgreSQL Vacuum Matters More Than You Think

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook