About two weeks ago I wrote a post about dealing with very large CSV files. The proposed solution was to break this very large file into smaller files and then merge together the results of mapping. This was definitely a doable solution; however, since we had to break this one very large file into 37 smaller files, it involved having 37 foreign tables on our proxy server, another 37 foreign tables in the staging area of the data warehouse, and also 37 matching materialized views….
And as if this were not enough, I soon found out that the number of files may vary for each marketing campaign. Later I learned that the past campaigns should be stored permanently… although that requirement was later revoked. But some other results still had to be stored forever…
In any case, the problem I was facing was that there is no way you can stay sane doing 37 copy/pastes… or making changes in 37 places… So I realized that I needed to automate the creation of the said foreign tables and materialized views.
Once again, there is nothing better than the ability to generate SQL inside a function and then execute it. You can't make a table name a function "parameter", but… you can, if you generate the DDL statement inside the function. I was very happy when I realized that in a situation where requirements change so often, it really does not make much of a difference whether you refresh a materialized view or drop it and create it again. That's what my refresh functions now do, and this makes changing the logic really easy. Well… relatively easy. As easy as you can make it :).
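To give a feel for the idea, here is a minimal sketch of such a function. All the names here (the function, the `campaign_ft_`/`campaign_mv_` prefixes, the `csv_server` foreign server, the column list) are hypothetical illustrations, not the actual production objects; the point is just how `format()` plus `EXECUTE` turn the object names into parameters:

```sql
-- Hypothetical sketch: (re)create the foreign table and matching
-- materialized view for one input file, with names built from a suffix.
CREATE OR REPLACE FUNCTION refresh_campaign_mv(p_suffix text, p_filename text)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
    ft_name text := 'campaign_ft_' || p_suffix;
    mv_name text := 'campaign_mv_' || p_suffix;
BEGIN
    -- Drop-and-recreate instead of refresh, so logic changes are picked up.
    EXECUTE format('DROP MATERIALIZED VIEW IF EXISTS %I', mv_name);
    EXECUTE format('DROP FOREIGN TABLE IF EXISTS %I', ft_name);

    -- %I safely quotes identifiers, %L safely quotes literals.
    -- Assumes a file_fdw-style server named csv_server and example columns.
    EXECUTE format(
        'CREATE FOREIGN TABLE %I (id int, email text)
         SERVER csv_server
         OPTIONS (filename %L, format %L, header %L)',
        ft_name, p_filename, 'csv', 'true');

    EXECUTE format(
        'CREATE MATERIALIZED VIEW %I AS
         SELECT id, lower(email) AS email FROM %I',
        mv_name, ft_name);
END;
$$;

-- Usage: one call per file instead of 37 hand-edited DDL scripts.
-- SELECT refresh_campaign_mv('07', '/data/campaign_part_07.csv');
```

With something like this, adding a 38th file (or changing the mapping logic) means editing one function instead of 37 scripts.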
Honestly, if I hadn't come up with my first automated generation about two weeks ago, I would be way more upset each time I needed to make changes. But by now I've parametrized literally everything :). Tomorrow we'll test whether somebody other than me can execute this super-flexible set of functions – because I am taking vacation next week – for real!