When things do not work like you expect them to work…

This post has a history. I’ve done the original optimization about three weeks ago, then I wanted to cleanse my example from the company-specific things, then I could not reproduce the original execution plan, and then at the end, when I was ready to publish it… it stopped working in our production system! By “stop working” I mean, that it started to take about 90 seconds to execute…But at the very end, by the time I was done with optimization for the second time, I guess, I was able to formulate some basic principles optimization of this kind of queries… so, without further ado – let’s start!

Each time I talk about optimizing join in Postgres, I advise people to include any additional selection criteria into join condition, even if they are, technically speaking, not “join” conditions. The reason for this suggestion is, that if the optimizer would see any way to pre-filter the records before  joining or while joining, it will be done.

So I was trying to utilize this strategy while working on a query which would check for the following condition:

SELECT ...
FROM loan_contract lc
JOIN loan_details ld on lc.loan_id=ld.loan_id
     AND contract_date >= current_date- interval '10 days' 
WHERE (state ='CA' AND  rate>10) OR (state='MO' AND rate>15)

I knew that this SELECT should retrieve a very small number of records if any, and thereby I was thinking that if I add this condition to the JOIN, it will execute faster because Postgres will be able to filter out the records which are not needed before join… but I was wrong!

SELECT ...
FROM loan_contract lc
JOIN loan_details ld on lc.loan_id=ld.loan_id
     AND contract_date >= current_date- interval '10 days' 
AND((state ='CA' AND  rate>10) OR (state='MO' AND rate>15))

Instead of having nice and fast index access I saw a HASH JOIN in the execution plan, and the query would run forever! Then I thought to myself: well, you should know better, since you teach how to optimize these kind of queries. There is an OR operation, so in order to “help” the optimizer to filter out the records, you should add an excessive condition. So I wrote the following:

SELECT ...
FROM loan_contract lc
JOIN loan_details ld on lc.loan_id=ld.loan_id
     AND contract_date >= current_date- interval '10 days' 
AND state in ('CA', 'MO') 
AND((state ='CA' AND  rate>10) OR (state='MO' AND rate>15))

This did not help either – Postgres was just doing two scans of the

loan_details

table. Then I’ve taken a closer look and realized, that this SQL is a perfect example of the case when the AND operator restricts the output, not any of the conditions by itself, with AND operation executed on the fields from different tables. So the right solution was not to try to include this condition into the join condition, but just the opposite – to join the tables as is, and then select using the combined criteria in the WHERE clause.

***
So after I figured that out, I’ve returned to the original SQL (plus restricting to two states), which worked for a couple of weeks. And then all of a suddent it stopped. As I’ve mentioned in the beginning of the post, the execution would take minutes. If you are wondering, why – I’ve checked the execution plan! And that’s what was happening. Postgres has figured out that the conditions belong to two different tables,and so it just decided to pre-select all contracts from CA and MO and also pre-select all the loan details with rate > 10 AND rate >15, which didn’t do much good, as you can imagine :).

So after several attempts to make it work I had to retreat to the solution which I knew will work from the beginning, but didn’t want to use, because it does not look so pretty. The solution is to use UNION ALL instead of OR.

So the final SQL will look like this:

SELECT ...
FROM loan_contract lc
JOIN loan_details ld on lc.loan_id=ld.loan_id
     AND contract_date >= current_date- interval '10 days' 
    AND state ='CA' AND  rate>10
UNION ALL 
SELECT ...
FROM loan_contract lc
JOIN loan_details ld on lc.loan_id=ld.loan_id
     AND contract_date >= current_date- interval '10 days' 
  AND  state='MO' AND rate>15

This works with a speed of light, and will work the sae even when/if I will need to add all of the 52 states and the District of Columbia 🙂

Advertisements

Leave a comment

Filed under SQL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s