A need for reverse index

One more optimization problem I came across last week. I’ve spotted a slow-running query, and could not figure out, why it could be slow.

The query was rather simple, and if we take all the “extras” out of the equation, it would come to something like this:

 

SELECT l.loan_id FROM loans l
WHERE
AND l.loan_id NOT IN
(SELECT loan_id FROM payment_runs WHERE run_date=’2016-01-13′)

 

Now, the table payment_runs has a primary_key (loan_id, run_date) – and that’s basically all it has. So you would imagine, this index will be used for searching the loans – but it was not!

Actually, I’ve realized more or less right away, why it was not used: it starts from loan_id, thereby the b-tree search should already “know” a loan_id or a set of loan_id’s to start using the index. However, we need exactly the opposite – we need to find loan_ids, which are not present in the table, and there is no use to try to utilize the index, since the search criteria is based on the second value.

We ended up creating one more, “reverse” index on that table (run_date, loan_id). Although it looks completely redundant, we really need it – the table payment_runs is growing very fast, and there was no way to avoid a full table scan without this index!

 

 

Advertisements

Leave a comment

Filed under SQL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s