Tag Archives: optimization

I knew I won’t like this feature! About the parallel execution

Many years ago, when Postgres 9.6 was still in making, my coworker said to me with excitement: Hettie, Postgres will now have the ability to run queries in parallel! There can be only four parallel processes, but still, isn’t in nice?!

And I remember exactly what I’ve replied: no, I do not like this feature a bit! You know why? Because executing queries in parallel would rarely solve performance problems. In fact, if four parallel workers would solve your performance problems, they were not really problems! It can do more harm than good, because it can mask some real performance problems for a while, and then they will turn just to be more severe.

What happened then – I’ve started a new chapter of my life at Braviant, and we had Postgres 9.5 then, and then for almost 3 years it was like I never had time to stop and upgrade :). But now I have a team, so we’ve finally planned the upgrade, and since we were already four versions behind, we planned upgrade to 9.6 and immediately to PG 10.

We’ve started from our Data Warehouse. First – in the staging environment, we we’ve tested, and observed the execution for some time. And then on the mirroring instance, and only then – on production.

And then it started! Seriously, out of all data refreshes this one is only one, which is important for the OLTP instance, because it sends data to one of our client interfaces. It started to behave inconsistently, Sometimes it would be just fine. Other times, instead of about 50 seconds it has been running for an hour, and probably won’t finish if we won’t kill it. Yes, it was obvious, that something did change in the execution plan after the upgrade. But what?! From the first glance the execution plan looked the same, all HASH JOINS, which you would expect, when you join tables with no restrictive conditions.

But it was still painfully slow. What was more puzzling – I could take out of the equation JOIN to any table, and performance would be unpredictable. After several dozen of attempts to make things run decided to take a closer look at the execution plan. And you know what I saw?! Yes, parallel execution/! Four joins were parallelized, which resulted in the execution time been really horrible. After the issue was found, the only thing left was to figure out, how to turn this feature off šŸ™‚

Advertisements

Leave a comment

Filed under Data management, Development and testing, SQL

One more time on the state of optimization

I just have to show some of the execution time graphs, and how the have changed after the optimized versions of the respective functions were deployed:

I know that many people are wondering looking at the second image, why I am striving to optimize things which are already running super-fast?

It’s not because I am trying to demonstrate my superpowers, it’s because i know for the fact, that with the database size we currently have, that is the right execution time. What does it mean? it means, that if the execution time is more than that, it indicates the wrong execution plan.

All these optimizations have been performed on our OLTP database, which means that all of these queries are “small queries”, retrieving a relatively small number of records. Which implies, that the appropriate indexes should be used, and that the execution plans should show the NESTED LOOP join algorithm. When I see the execution time of 500 ms, it tells me that there is at least one full table scan inside. Which in turn, means, that the execution time will be increasing, when the data volumes will be growing. Which is not good, if we are building a scalable system.

Another important thing to consider is that all these small queries cannot be “parallelized” to speed up the execution. We are in the OLTP environment, not OLAP. I know that I can’t rely on switching to the larger AWS instance, because 1) this process gets out of control very fast 2) does not help. Seen the execution times like on both of these pictures, like “I can’t see it” just proves, that the functions are performing as expected.

Leave a comment

Filed under Data management, Development and testing, SQL

New features are available in the bitemporal repo – and I am so happy about it!

I Really hope that most of my follows know something about the pg_bitemporal project, because if you didn’t hear about it, you won’t be able to share my excitement!

We started to build our bitemporal library for PostgreSQL about four years ago, it was merely a “proof of concept”, and Chad Slaughter, who initiated all this work, knowing my work habits way too well, was re-iterating again and again – do not optimize it yet!

Well, I didn’t, but then I’ve joined Braviant Holdings, and a year later I was granted a permission to use our bitemporal framework in production. Some of the performance flaws became apparent even during the test run, and I was able to fix them. Later, while we were using it in production more and more, I’ve come up with new functions, UPDATE_SELECT and CORRECT_SELECT, since we actually needed them, and since the bitemporal operations were supposed to behave the same way as regular database operations.

About three weeks ago we had a very important release, which along with addressing multiple business needs, included some significant changes on the technical side. One of the consequences was, that it significantly increased the traffic on our new planform, and as a result we started to see some timeouts.

Although these timeouts were pretty rare, we saw them as a problem. I personally pledged the system will remain scalable, and now I couldn’t just go with “bitemporal updates are slow”. Yes, the execution time was at 2 to 3 seconds most of the time, but sometimes it would spike, and our microservices have a hard timeout at 10 seconds.

Some time ago I’ve already mentioned in this blog, how thankful I am for those timeouts! Nothing else foster innovation more than a necessity to address performance problems immediately, because they have a direct impact on production.

This time around I was 99.9% sure that the periodic slowness happens during the remote query, which is a part of the problematic function. Turned out, though, that this 0.01% was the case, and together with our DB team we were able to determine, that the problematic statement was the last UPDATE in the bitemporal update function. If you’d ask me a week before that, I would say, that I am not going to address the bitemporal performance for the next several months, but I had no choice.

Thanks to Boris Novikov, who helped me immensely in testing and verifying several different approaches, and eventually identified the best one, and to Chad Slaughter, who was merging my commits from 7-30 AM to 9-30 PM, so that the master branch of the bitemporal library would have the latest updates by the time of the release, and thanks to our amazing QA team, who had to run and rerun tests that day multiple times, the new bitemporal functions are now on place. Not only for Braviant Holdings, but for the whole community.

I would also like to mention, that since I was already changing the functions, I’ve fixed one long-overdue issues: all functions have versions, which are PG 10 compliant. We’ve left the old versions there, because some of the are used in the existing production systems but if you are just starting, you can use the new ones.

Check it out at https://github.com/scalegenius/pg_bitemporal

Leave a comment

Filed under news, research, SQL, Team and teamwork

About one very controversial optimization

I meant to write about this optimization story for a while, not because somebody can use it as an example, but rather as a counter-example, a case of “what I’ve tough you so far is all completely wrong and not working”.

It all have started with the story which I’ve described in this blog post: the problem was, that indexes on the remote site were not being used, and in order to make things work I had to create a remote function, which would “inject” the constants into a remotely executed query and then too map the results. A couple of weeks later I had to face a similar problem which had also manifested itself only in production environment. But that time aroundĀ  the performance degradation was so severe, that we had to rollback the changes.

First I thought that I will be able to deal with a similar problem in a similar way, but turned out that was not the case. The SQL involving remote tables was a little bit more complex, and the function didn’t help. I’ve realized that I needed to explicitly pass some constants to the dynamic query, which meant that I had to perform a crime against relational theory, braking one perfectly fine SQL into two, and passing the result of the first selection to the second statement. This performed better, but still not ideal. I continued to follow the same path. It became better. I’ve repeated it one more time. And one more time.

At the end it was a textbook example of how not to write SQL. My own lectures were always filled with this “how it will not perform” examples: first you select a person, then using the person id you select position, then… But what I did to optimizeĀ  performance my function was the sameĀ  thing almost to the letter. I needed to select all loans for a specific person and some of the loan-related details. And instead of writing one SQL I ended up doing the following:

  • having a person_id, select all accounts
  • generate a SQL with all these user accounts embedded as constants to retrieve loans
  • generate a set of SQL statements, which would access remote information for each of the loans individually
  • and actually there was more in the middle

A part of the problem is, that in PostgreSQL functions are not really stored in a truly “compiled” way, they should beĀ  rather viewed as “stored sequences of actions”.Ā  And parametrized statements, unlike those in different systems,Ā  are not optimized properly. That’s why in most of the cases IĀ  build dynamic SQL inside of the functions. But in the case I am describing here,Ā  things were even worse. Even when I was executing a SQL statement for a specific person id, the indexes on user accountĀ  and loan where not really utilized (at least not always).Ā  So in order for me to push through the usage of indexed, I had to explicitly pass the values to the generated SQL…

It all worked at the end, and performance was amazing, but I can’t even convince myself to post code snippets – it looks ugly šŸ™‚

Oh, and another part of the equation is, that in all those SELECT statement were selecting a really small portion of the data from each of the tables, so there was not that many total executions.

The moral of the story:Ā  trust, but verify!

Leave a comment

Filed under Data management, Development and testing, SQL

Can you teach somebody to optimize?

I’ve got a lot of feedback on my last blogpost; one question was posted on another paltform where I’ve reblogged the same text, and this question was so interesting that I’ve decided to write a separate post in reply.

So tell me, Hettie, are these kind of discoveries being reported at the conferences, and would they later become a part of a common knowledgebase? Will this sort of technique be taught in colleges? Overall, in your opinion, are the nowadays CS graduates more knowledgeable in this area. And, by the way, is there any special knowledge which is necessary to be able to resolve problems like this, or it’s just a combination of basic knowledge plus experience?

Great question! I have being teaching optimization for almost 15 years, and in general my optimism on this subject is very modest. You can teach technique, you can show tons of examples, and still there is no guarantee that a student who has attended your class will be able to correctly identify a similar pattern in the real life and to recall what specific technique which was advised for a similar problem. But may be I am just not good in teaching this stuff.

It’s tempting to say, that all that matters are years of practice, but this is also not always the case, since you, and me, and all of us can recall the situations in which years of experience did not help. And to be honest, I do not want to wait for “years of experience”! I want to be able to hire a new grad, who can optimize at a reasonable level. And I am not saying I never met this kind of CS grads, but what I am saying is that whenever it happens, it is due to the person’s individual abilities and/or a desire to excel in this particular skill.

Let’s be clear: the kind of breakthrough as I’ve described in the previous post does not happen often. In fact, you might never get anything like this in your live. But there are still tons of optimizations which can be done almost every day.

I would still argue that knowing the basics is a key. For the thousandth time over – you need to know your math, your calc and your algebra in order to understand how databases work. You might not be aware of some sophisticated indexes, but you should be able to identify, looking at the query, what’s it about< whether it is "short" or "long". And if you try and try, and it does not become faster, you need all your convincing powers to convince yourself, that this query can be optimized. There should be a way.

Another big thing I am trying to teach – to write queries declaratively. This is an extremely challenging task, because most of the requirements are formulated in an imperative manner. So what I am trying to teach, is that even if you find something like “previous three occurrences”, or “return to the previous status”, or “compare the top two” in the requirements, you still can write a declarative statement, carefully using CASE, GROUP BY and sometimes window functions. And it’s amazing, how fast everything starts running right away. Most of the time being able to reduce the number of table scans to one does the trick, except of… well, except of the situation, when you should do exactly the opposite.

I didn’t figure out yet, how to teach to distinguish one from another :). But the more I think about it, there more it seems like that’s what signifies that somebody can optimize, and that skill is the most difficult to teach. Most optimization classes teach you how and when to use some indexes, and how to choose the join order, but they do not teach how to rewrite a query itself.

… For the original question: no, I do not think they teach it in school. But I am trying to promote this idea!

Leave a comment

Filed under Data management, SQL

Can you teach somebody to optimize?

I’ve got a lot of feedback on my last blogpost; one question was posted on another paltform where I’ve reblogged the same text, and this question was so interesting that I’ve decided to write a separate post in reply.

So tell me, Hettie, are these kind of discoveries being reported at the conferences, and would they later become a part of a common knowledgebase? Will this sort of technique be taught in colleges? Overall, in your opinion, are the nowadays CS graduates more knowledgeable in this area. And, by the way, is there any special knowledge which is necessary to be able to resolve problems like this, or it’s just a combination of basic knowledge plus experience?

Great question! I have being teaching optimization for almost 15 years, and in general my optimism on this subject is very modest. You can teach technique, you can show tons of examples, and still there is no guarantee that a student who has attended your class will be able to correctly identify a similar pattern in the real life and to recall what specific technique which was advised for a similar problem. But may be I am just not good in teaching this stuff.

It’s tempting to say, that all that matters are years of practice, but this is also not always the case, since you, and me, and all of us can recall the situations in which years of experience did not help. And to be honest, I do not want to wait for “years of experience”! I want to be able to hire a new grad, who can optimize at a reasonable level. And I am not saying I never met this kind of CS grads, but what I am saying is that whenever it happens, it is due to the person’s individual abilities and/or a desire to excel in this particular skill.

Let’s be clear: the kind of breakthrough as I’ve described in the previous post does not happen often. In fact, you might never get anything like this in your live. But there are still tons of optimizations which can be done almost every day.

I would still argue that knowing the basics is a key. For the thousandth time over – you need to know your math, your calc and your algebra in order to understand how databases work. You might not be aware of some sophisticated indexes, but you should be able to identify, looking at the query, what’s it about< whether it is "short" or "long". And if you try and try, and it does not become faster, you need all your convincing powers to convince yourself, that this query can be optimized. There should be a way.

Another big thing I am trying to teach – to write queries declaratively. This is an extremely challenging task, because most of the requirements are formulated in an imperative manner. So what I am trying to teach, is that even if you find something like “previous three occurrences”, or “return to the previous status”, or “compare the top two” in the requirements, you still can write a declarative statement, carefully using CASE, GROUP BY and sometimes window functions. And it’s amazing, how fast everything starts running right away. Most of the time being able to reduce the number of table scans to one does the trick, except of… well, except of the situation, when you should do exactly the opposite.

I didn’t figure out yet, how to teach to distinguish one from another :). But the more I think about it, there more it seems like that’s what signifies that somebody can optimize, and that skill is the most difficult to teach. Most optimization classes teach you how and when to use some indexes, and how to choose the join order, but they do not teach how to rewrite a query itself.

… For the original question: no, I do not think they teach it in school. But I am trying to promote this idea!

Leave a comment

Filed under Data management, SQL

How to deal with “First you write, then you optimize”

Recently I’ve made yet another attempt to optimize several quite inefficient report, which had been out there for so long, that nobody remembers, who have written them, and who have speced them out. I did not finish this task, even with the help of my team we just didn’t have enough time to spare on improving something, which “works”. And when I was thinking about it over the weekend, I thought that different people may have different definition of “works”, when it comes to SQL writing.

I always say – it’s easier to write the code correctly on the first place, that to optimize reports which are written inefficiently, but way too often I am getting a response from the stakeholders – we need it now, you can optimize it later. And then later never comes.

This is so obvious, that I would not spend time writing about this yet another 101th time. However this time my thoughts have taken a different path. I thought about why I was unable to finish this optimization, why I didn’t have enough time. The reason was, that something else which I was writing have taken way more time than I originally planned. And the reason why it took so long is that I find myself unable to write, even as a first draft, the code which I know for sure is suboptimal. No matter how many people will tell me that “it does not have to be perfect”. And then – yes, it takes longer.

Now I think that when I teach an optimization class, and when I am showing to my audience some cool technique, my thought process is that everybody should realize how much better this technique is, and than use “that and only that” approach, because how-else-this-is-so-obviously-better. And as for my audience, people think – ok. It’s cool. Good to know. And do not change there code -writing habits…

Once again, I might have been just ignorant, and may be everybody knew it all along, but it was a revelation for me: if I want to teach people good coding habits, if I really want to avoid this situation “first write, then optimize”, not only that I need to show people how to write a coode code. And not only I need to show how not to write. But I also need to teach zero-tolerance to the bad coding. So that people’s minds won’t even wander in this direction:). If it is not written the right way, it is not done.

I am pretty sure nobody would like this idea, but can I at least have a dream :)?

3 Comments

Filed under SQL