Refactoring of the “database code”, by which we usually mean refactoring of functions and stored procedures is one of the most confusing and complicated topics in the world of data management, mainly because of the facts that 1) it’s often impossible 2) if it is possible, you have to do precisely the opposite of what you expect when you think about “refactoring”.
Why and when you usually refactor the code? When you have some “common piece”, which does “the same thing”, which “can be reused” and thereby “should be isolated into a separate function”.
Why and when we use the database functions? When we want to make it easier for the user to execute a complex SQL statement, most often – to substitute multiple SELECT statements with one, either generating the statement or calling it with a parameter, and in any case “retrieving all data in one shot”. Which means, that the stored procedure or function should be customized for a very specific need of a particular query. In most cases we can’t reuse the functions, we can only reuse the principles and approaches . Each time you try to break a database function into smaller pieces, you actually destroy the whole purpose of this function.
So – does it mean that writing a db function is always a craftsmanship, and it always should be “build to order”, and you can never reuse what you wrote before? Yes and no. Last week I had to perfect examples of how you actually can reuse a db function – you just need to thing a little bit differently about it.
In both cases there was one of my relatively small but highly efficient functions – loan_payments. This function returns a set of records which contain payments for a loan, which number is defined by the function parameter. All our applications need this functionality, and the currently used methods are super-inefficient, since there are multiple database calls executed for each payment for a sole purpose to determine, whether it was processed, cancelled, returned, and what is the status code.
I am totally in love with this function, because it’s so nice, small and elegant, and produces a tremendous performance improvement, so when another team wanted to use it another application, I gladly blessed it.
You know, what happened next from my previous post, and let me stress, that the application developers involved in this project did exactly what is expected from the good app developer, but the opposite of what we need from the database developer perspective. Fortunately, we figured out the problem quite fast, and through last week I got a number of requests from this team to add more fields to the function output. Note, that adding fields didn’t hurt the function performance at all, and my own team didn’t need to change anything in our app. Our app just “does not know” about these extra fields.
The second situation was a little bit different. The db developer was researching a possibility of using the same function in another P&L. The problem was, that out of 7 tables involved in a join, which is the heart of this function, 3 should be different for this other P&L while the join logic is essentially the same.
Originally the database developer was going to create a new function, based on mine, or rather call another function inside the function body, if the P&L is different. But since all SELECT statements in our functions are generated (and turned out, it is better from performance perspective) I suggested to leave is as one function, keep the same select list (which is what we want anyways), to keep all complicated pagination as it is, and just when generating the FROM clause and JOIN conditions, to do it differently, based on the P&L. And this is effectively the “internal refactoring”, since we actually do reused the code, but within the same function and even within the same SQL statement :).