Recently I’ve got a question from a coworker.
We have a table credit_reports, which have among others the following fields:
The table contains over 51,000,000 rows. There are several indexes for this table, among them:
idx1 – (customer_id, inquiry_time)
idx2(inquiry_time::date) – the timestamp converted to date
A coworker was wondering, why idx2 was never used, even when we use this transformation in the selection criteria, like:
SELECT * FROM credit_reports WHERE report_type=’type1′ AND inquiry_date::date > current_date-5
But if we omit the transformation:
SELECT * FROM credit_reports WHERE report_type=’type1′ AND inquiry_date > current_date-5
then idx1 will be used (although it is not on the first position in the index)
The answer is the following. idx2 is never going to be used (except of queries like SELECT min(inquiry_date::date) FROM credit_reports) because the first record was inserted 7 years ago, which means, there are about 2,500 distinct dates in the whole table, which makes at least 20,000 records per one index value…. no sane optimizer would choose this path!
As for the usage of idx1, it was used not exactly as an index, but rather was read sequentially, because it is faster than reading the whole table!