Almost exactly a year ago I was dealing with the optimization problem I described in this post, which in turn referenced that post with my original problem. Recently I faced another case of unexpected slowness, and the solution turned out to be… kind of the opposite.
This time the slow query was running (or rather, not running) in a Data Mart, in a classic star-schema environment.
We had one fact table and 20+ dimensions joined to it.
So the query looked somewhat like this:
SELECT ...
FROM fact_table f
INNER JOIN dimension1 d1 USING (key1)
INNER JOIN dimension2 d2 USING (key2)
-- ... dimensions 3 through 19 joined the same way ...
INNER JOIN dimension20 d20 USING (key20)
The problem was… this query was not running at all! It would enter the query planner and never come back!
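If you ever need to check whether you are in the same situation, here is a minimal diagnostic sketch (SHOW and EXPLAIN are standard PostgreSQL commands; the default for both settings is 8):

-- Look at the planner settings that control join reordering:
SHOW join_collapse_limit;
SHOW from_collapse_limit;

-- EXPLAIN plans the query without executing it, so if even this
-- never comes back, the planner itself is the bottleneck:
EXPLAIN SELECT ... -- the full query from above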
What I found about this cluster was that both join_collapse_limit and from_collapse_limit were set to 30, which paralyzed the optimizer almost entirely: with a limit that high, the planner tries to exhaustively reorder all 20+ joins, and the search space grows explosively with the number of relations. I started to experiment with reducing join_collapse_limit and found that what we actually had to do was set it to 1, which tells the planner not to reorder the joins at all. This allowed us to control the join order ourselves, which is perfectly fine: we know exactly what the order should be: the fact table first, followed by the dimensions 🙂
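For illustration, here is a minimal sketch of the change (SET and ALTER ROLE are standard PostgreSQL syntax; reporting_user is a hypothetical role name, since I'm not showing where exactly the setting was applied in our cluster):

-- With join_collapse_limit = 1 the planner stops reordering explicit
-- JOINs and uses exactly the order written in the query:
SET join_collapse_limit = 1;

-- Hypothetical: the setting can also be attached to a role,
-- so every Data Mart session picks it up automatically:
ALTER ROLE reporting_user SET join_collapse_limit = 1;

With this setting, the order of the tables in the FROM clause is the join order, so writing the fact table first becomes part of the query's contract.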
The total execution time dropped from “god-knows-what” to 34 ms! Which, I think, proves that in many cases humans think faster than computers 🙂