Let me tell you: there are so little tricks you can utilize when optimizing the OLTP queries, that each time I am writing something about this, I feel like I am repeating myself for the hundredth time… yet… people make the same mistakes again and again, and so here I am preaching the same thing again and again…
The original query which was brought to my attention last Thursday looked like this:
cs.id as c_id
approvals apr inner join customer_sources cs_s on apr.customer_id = cs_s.customer_id
apr.processed_on >= app.processed_on - interval '30seconds'
and apr.processed_on = app.processed_on - interval '30seconds'
and apr.processed_on '2014-05-01'
AND app.processed_on <= '2014-05-01 00:01:00'
AND app.teletrack_report_id is not NULL
No, actually it did not look like this, since it was about 200 lines long, but that was the idea: there was a dozen columns (positions, select-list components – whatever you want to call them), and each of them was selected by means of embedding yet another select, which would query the same pair of tables.
And note, that the inner copy of customer_sources was totally unnecessary, since the “approvals” do not correspond to specific leads, but to the customers.
The original timings were like this: when a coworker, who wrote this query, was trying to run it for 1 min time interval, it would not return results in 10 minutes, and then he would kill this query suspecting, that something is going wrong.
There is no magic in how to deal with these types of queries. Just imagine yourself in the place of a database. Would you like to read the same table again and again, each time selecting only the records of one kind?! Never! What you would do, you would read this table record my record, and sort the records, while reading: this one goes to this pile, this one goes to the other one, and this one can be thrown away, it’s not good for anything.
So, please! advice your database to do exactly the same! Like this:
cs.id as c_id
sum (case when cs.source_type_cd like '%leadpile%'and apr.customer_id is not null then 1 else 0 end) as leadpile,
sum (case when cs.source_type_cd like '% 9g%' then 1 else 0 end) as nineg
inner join approvals app on cs.customer_id = app.customer_id
and cs.created_on > '2014-05-01' and cs.created_on '2014-05-01'
AND app.processed_on = app.processed_on - interval '30seconds'
and apr.processed_on < app.processed_on
In course of our discussion with the report developer we also came to the conclusion, that we only need the “recent” customer sources, not the ones, which are 4 years old. Also, we ere able to put the correct interval right away, because – guess what? – this SELECT is executed in 15 seconds!