Check if you have an Index on your IDs in PostgreSQL

Published on Mar. 27, 2017 by Gabriel Bordeaux

We ID

Why?

PostgreSQL does not automatically create an index for your ID column. This article will explain why you should always create an Index on your ID columns and how to check if all your existing tables have one. You don't have a lot of time? Just click here to go directly to the final query.

How to proceed for new tables?

Let's say you create a table:

CREATE TABLE test (
   id serial,
   column1 text,
   column2 text
);

You should always create an Index for your ID afterwards:

CREATE INDEX ON test (id);

The associated PostgreSQL documentation can be found here.

Performance demonstration

I have a table with 1.25 million data:

SELECT count(*) FROM my_table;
  count  
---------
 1259722
(1 row)
Time: 540.203 ms

I want to update an ID but I do not have an Index on my ID:

UPDATE my_table SET update = now() WHERE id = 12345;
UPDATE 1
Time: 1213.705 ms

For the second test, I create the index:

CREATE INDEX ON my_table (id);

And I run again the sane query:

UPDATE my_table SET update = now() WHERE id = 12345;
UPDATE 1
Time: 2.108 ms

The exact same query went 575 times faster with the Index.

How do I check all my tables?

We just saw that everyone should always create an Index on the ID column for each table. It is (almost) never going to be useless.

But now, let's find out how to scan all the tables of our database to find out were did we forgot to put an Index of our IDs.

First, let's use the "information_schema" schema from PostgreSQL to find all the tables of our database:

postgres@db # SELECT table_name
FROM information_schema.tables
WHERE table_schema='public' -- Default is usually 'public'
      AND table_type='BASE TABLE';
 table_name 
------------
 test
 test_2
 test_3
 test_4
 test_5
(5 rows)

We can now use a second query to find all the columns for one table:

postgres@db # SELECT ordinal_position, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public' -- Default is usually 'public'
AND table_name = 'test' -- the table we chose
ORDER BY ordinal_position;
 ordinal_position | column_name | data_type 
------------------+-------------+-----------
                1 | id          | integer
                2 | column1     | text
                3 | column2     | text
(3 rows)

Using both tables (information_schema.tables and information_schema.columns) we can now find all the tables with an ID (we don't need the tables with no ID) :

postgres@db # SELECT table_name
FROM information_schema.tables
WHERE table_schema='public'
      AND table_type='BASE TABLE'
      AND EXISTS (
                  SELECT 1
                  FROM information_schema.columns
                  WHERE table_schema = 'public'
                        AND table_name = information_schema.tables.table_name
                        AND column_name = 'id'
                  );
 table_name 
------------
 test
 test_2
 test_3
 test_4
(4 rows)

We just found out that "test_5" has no ID and does not need to be checked.

Now, let's find all the indexes for the table "test" :

postgres@db # SELECT a.attname
FROM pg_class t
     JOIN pg_index ix ON t.oid = ix.indrelid
     JOIN pg_class i ON i.oid = ix.indexrelid
     JOIN pg_attribute a ON a.attrelid = t.oid
WHERE 
     a.attnum = ANY(ix.indkey) AND 
     t.relkind = 'r' AND 
     t.relname ='test' -- table name
ORDER BY
     t.relname,
     i.relname;
 attname 
---------
 column1
 column2
 id
(3 rows)

Using the last 3 queries, we can now find:

  • The list of all the tables in our database
  • The list of all the columns for each table (to skip tables with no IDs)
  • The list of Indexes for each table

It's now easy to find out which tables have an ID column but no Index on it:

postgres@db # WITH tables_with_id AS ( -- list of tables with an ID column
                        SELECT table_name
                        FROM information_schema.tables
                        WHERE table_schema='public'
                              AND table_type='BASE TABLE'
                              AND EXISTS (
                                          SELECT 1
                                          FROM information_schema.columns
                                          WHERE table_schema = 'public'
                                                AND table_name = information_schema.tables.table_name
                                                AND column_name = 'id'
                                          )
),
tables_ok AS ( -- list of tables with an ID column AND an Index on the ID column
                        SELECT t.relname
                        FROM pg_class t
                             JOIN pg_index ix ON t.oid = ix.indrelid
                             JOIN pg_class i ON i.oid = ix.indexrelid
                             JOIN pg_attribute a ON a.attrelid = t.oid
                        WHERE 
                             a.attnum = ANY(ix.indkey) AND 
                             t.relkind = 'r' AND 
                             t.relname IN (SELECT table_name FROM tables_with_id) AND -- we seach only tables with an ID
                             a.attname ='id' -- we chack only Indexes on the column ID
                        ORDER BY
                             t.relname,
                             i.relname
)
SELECT CONCAT('CREATE INDEX ON ', table_name, ' (id);') as you_need_to_add_the_following_indexes
FROM information_schema.tables
WHERE table_schema='public' AND 
      table_type='BASE TABLE' AND  -- Default is usually 'public'
      table_name IN (SELECT table_name FROM tables_with_id) AND  -- the table have an ID
      table_name NOT IN (SELECT relname FROM tables_ok) -- the table is not in the list of tables with an Index on the column ID
;
    you_need_to_add_the_following_indexes     
----------------------------------------------
 CREATE INDEX ON test_3 (id);
 CREATE INDEX ON test_4 (id);
(2 rows)

That's it !