Monthly Archives: November 2013

How to save 15,000 database calls in 10 minutes

The other day I was looking at the statistical data – the number of execution call per controller action, which I try do if not every day, then at least every other day. To my surprise. I saw one “new” controller making it’s way to the top of list. Two things surprised me. First – the fact, that it appeared out of nowhere, and for the best of my knowledge was the simplest thing possible. Second – the the difference between min and max db calls per execution was ginormous – from 1 to over 15,000! (averaging 53 – that’s why it started to make a top).

So I asked the application developers, and they told me – there is nothing there, jus one select. I checked the Ruby code myself, and it was literally 3 lines, which were the straight ORM, and should have result in generating a straightforward:

SELECT * FROM notes WHERE customer_id=<?>

So – where are those 15,000 calls coming from?!

I’ve asked my teammate how we can find this out, and in five minutes we knew the answer: each of the “notes” returned by this SELECT statement, would generate additional SELECT statement, which would go like this:

SELECT * FROM employees WHERE id=<?>,

where id was an employee id from the note!

Please note :), that the association between notes and employees was already in the model, so the only thing the developer had to do was to turn on the eager loading, which would generate correct SELECT statement with one JOIN. The next five minutes where spend on making this change in the method, and testing that it produces exactly the result we expected.

So, after 10 minutes we were able to reduce the number of database call in this controller action from 15,000 … to one!

Leave a comment

Filed under Data management, Systems

Team Geek – a book review

I’ve recently finished reading a book Team Geek: A Software Developer’s Guide to Working Well with Others. This book was originally recommended to me by one of my co-workers, who made a short presentations about it during one of our “Buzzes”. So I’ve got a Kindle edition of it, but it took me a while to actually read it.

And I am glad I finally did it – liked this book a lot. What’s interesting, it talks about many things we recently implemented in our company, and in our squad in particular, and it made me to appreciate even more what our squad manager is doing. For example, the book tell how important is to keep the scrums short, so that they would remain real “stand-ups”. And I know that some other squads had grown into a habit to have their scrums like 30 minutes long or more. I now, that the members of those squads hate it. Originally I was not to say upset, but wondering, why our manager insists on all of us really standing during the scrum,  why he stops discussion about particular issues and says – we will discuss it after the scrum. But now I see, that those were right things to do. There are also other things from this book, which we do in our squad, like the way we track the issues.

But what I liked the most, was the fact, that the book is actually about how to work with people, how to work in  teams in the IT industry. And I can’t think of a more important topic. In the very first chapter of the book the authors say that the myths usually attribute development of some systems or products to a single person, like Linus is often thought of as a person, who single-handedly created Linux. But:

Actually Linus  just wrote  the beginnings of a proof-of concept Unix-like kernel, and showed it to an email list. <..> Linux is hundred times bigger than that and was developed by hundreds of smart people. Linus’s real achievement was to lead these people and coordinate their work.

The time of “single geniuses” is over, if there ever was one on the first place. And these days nobody is going to tolerate a difficult person just because of his or her individual abilities. It’s not worth if, if the person can’t work with others.

The book is full of excellent examples of how important it is to develop team, to develop team culture. I especially like how the team culture is compared to the yeast in a bread dough: if we have a right starter, the team culture will be preserved, even when none of the original team members will be there.  Another important chapter id the one which talks about the managerial skills, “even if you think you do not want to be a manager ever”.

Overall I think it’s a great book, which helps you to understand “how it all works”, and I strongly recommend it.

And it’s not that long at all 🙂 🙂 🙂

 

Leave a comment

Filed under books, Team and teamwork

Refactoring of the Database Code – Is It Even Possible?

Refactoring of the “database code”, by which we usually mean refactoring of functions and stored procedures is one of the most confusing and complicated topics in the world of data management, mainly because of the facts that 1) it’s often impossible 2) if it is possible, you have to do precisely the opposite of what you expect when you think about “refactoring”.

Why and when you usually refactor the code? When you have some “common piece”, which does “the same thing”, which “can be reused” and thereby “should be isolated into a separate function”.

Why and when we use the database functions? When we want to make it easier for the user to execute a complex SQL statement, most often – to substitute multiple SELECT statements with one, either generating the statement or calling it with a parameter, and in any case “retrieving all data in one shot”. Which means, that the stored procedure or function should be customized for a very specific need of a particular query. In most cases we can’t reuse the functions, we can only reuse the principles and approaches . Each time you try to break a database function into smaller pieces, you actually destroy the whole purpose of this function.

So – does it mean that writing a db function is always a craftsmanship, and it always should be “build to order”, and you can never reuse what you wrote before? Yes and no. Last week I had to perfect examples of how you actually can reuse a db function – you just need to thing a little bit differently about it.

In both cases there was one of my relatively small but highly efficient functions – loan_payments. This function returns a set of records which contain payments for a loan, which number is defined by the function parameter. All our applications need this functionality, and the currently used methods are super-inefficient, since there are multiple database calls executed for each payment for a sole purpose to determine, whether it was processed, cancelled, returned, and what is the status code.

I am totally in love with this function, because it’s so nice, small and elegant, and produces a tremendous performance improvement, so when another team wanted to use it another application, I gladly blessed it.

You know, what happened next from my previous post, and let me stress, that the application developers involved in this project did exactly what is expected from the good app developer, but the opposite of what we need from the database developer perspective. Fortunately, we figured out the problem quite fast, and through last week I got a number of requests from this team to add more fields to the function output. Note, that adding fields didn’t hurt the function performance at all, and my own team didn’t need to change anything in our app. Our app just “does not know” about these extra fields.

The second situation was a little bit different. The db developer was researching a possibility of using the same function in another P&L. The problem was, that out of 7 tables involved in a join, which is the heart of this function, 3 should be different for this other P&L while the join logic is essentially the same.

Originally the database developer was going to create a new function, based on mine, or rather call another function inside the function body, if the P&L is different. But since all SELECT statements in our functions are generated (and turned out, it is better from performance perspective) I suggested to leave is as one function, keep the same select list (which is what we want anyways), to keep all complicated pagination as it is, and just when generating the FROM clause and JOIN conditions, to do it differently, based on the P&L. And this is effectively the “internal refactoring”, since we actually do reused the code, but within the same function and even within the same SQL statement :).

Leave a comment

Filed under Data management, Uncategorized