This is very much like finding Pluto! At Braviant, we use several external service providers to perform some business tasks. And then, as I’ve mentioned in one of my presentations about our usage of foreign data wrappers, we need to manage data, when we do not really own the data.
But this time around the task was even more complex, and I’ve spent weeks trying to figure out how to approach it. There is one Really Large Table on the “other” side, and to refresh the Data Mart, we need to select a small subset of records each time, basically “all records starting from the moment we refreshed last time”.
For some reason unknown to me something on the way from “them” to “us” did not work, and we could not push the condition to the external site. No matter what I was selecting, what was really happening (I’ve figured it out by observing the query behavior closely) – the whole table was fetched from the third-party server, and only then the selection criteria was applied.
The problem looked unsolvable, because “everything worked on the other side”. Then I cam up with one crazy idea. I thought: if we can’t push our condition through, may be we can create similar condition on the other side.
So, I’ve asked our service provider tech support, whether they can create a view on their side, which would restrict the size of object, I am selecting from, Note, I’ve asked for just a view, not a materialized view. So it was literally “query is executed locally”. And then I’ve mapped this view to the foreign table, so there was no changes to reporting.
Yes, this view has way more records than I need (it contains “last 24 hours”), while I refresh data every two hours. However, now I select from way smaller data set, because the view contains only last 24 hours, not the last 2 months!
… and now tell me, which optimizer would be able to execute this kind of optimization?!