One “emergency optimization”

Why it’s an “emergency optimization”? Well, because you write some SQL, you debug and optimize it, and it works fine, and it runs in production ten times during the business hours producing some report… and all of a sudden it just stops working… well, not necessarily “stop”, but start to run more than a hundred times slower… then you need to drop everything you are doing and optimize it ASAP. Because it’s already in production!

Last Thursday I had several things which “stopped working” in a manner described above. I’ve written about one of them in my previous post, but that one was relatively easy to fix. The next one looked as trivial as it can be, but stll I could not make it work!

Here is how the query looked like:

SELECT ...
   FROM payments p WHERE payment_date< current_date 
        AND payment_id NOT IN (SELECT payment_id from bank_records)
        AND...

There were more conditions, but they are not so important. This worked fine until a couple of days ago. What could be more simple? The payment_id field is indexed in both tables, and once again, this query should return a very small number of records. Also, the tables themselves are small, a little bit over 100,000 records. However, all of a sudden it started to run for minutes! When I looked at the execution plan I saw, that instead of doing a merge join utilizing the index, Postgres decided to read the whole bank_records table into the main memory, “materialize”, and then loop through this data set for each and single record of the payments table!

When I saw this, I immediately rewrote the SQL in my favorite way – with OUTER JOIN:

SELECT ...
   FROM payments p 
        LEFT OUTER JOIN bank_records b 
        ON b.payment_id =p.payment_id AND payment_date< current_date 
        AND b.payment_id IS NULL 
        AND...

This didn’t help either – still sequential scan and line by line comparison. And then I decided to use “the last resort”- which I hate!

SELECT ...
   FROM payments p WHERE payment_date< current_date 
        AND NOT EXISTS (SELECT payment_id from bank_records b
        WHERE payment_id=p.payment-id)
        AND...

This SQL just magically started to use the index, and the execution time when down from8 minutes to 300 milliseconds…

Yes, it’s cool… but why?!

Advertisements

2 Comments

Filed under Data management, SQL

2 responses to “One “emergency optimization”

  1. Theoretically – yes, but in this case I believe this will result in an extra join. Can you show me how you would rewrite this particular SQL with except?

    And I am still not getting, WHY. It should all work the same.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s