Did you know anything about inline functions, which return the sets of records? I didn’t know anything about them until recently, when one of my co-workers discovered them trying to solve a problem, which has being a problem for a while…
Here is what the problem was. We have a table in our Data Warehouse, which is updated incrementally, since it stores historical and current data for all installments for all loans. All good so far, right? The problem is, that for reporting purposes we need to calculate lots of additional data, based on the current date – and this data needs to be calculated for all installments, for all loans. Obviously, it’s impossible to recreate all data daily, especially because the actual data is being refreshed every 15 minutes.
The option which seems obvious would be to create a view, which will use the “permanent table and calculate the rest dynamically based on the current date. Sounds like a perfect solution? Well, if you try to query this data for specific loan or installment, then – yes, you receive response instantaneously. But if you want to join this view with some other table – performance is terrible! And no wonder – not only Postgres, but virtually none database can put condition or parameter deep into grouping. Materializing the view would not help either, because such materialized view would soon become so big, that the calculations would run out of temp tablespace.
And then one of our co-workers asked: do you guys know about the inline functions?
You can read about inline functions here. The biggest advantage of using the inline functions returning record set is, that they are not the black boxes. For the optimizer the inline function body is directly substituted by the raw SQL, and the optimization is no different than of any “regular” query. In our case are now were spared of “calculate everything for all installments first”, and the query would be optimized for a specific use case.