This talk was scheduled to be presented in New York in March, and it was on my shortlist of “must attend.” I was also planning to have this speaker on my panel. The conference did not happen, for obvious reasons. Still, I was excited to find out that this talk would be presented virtually. It is going to be awesome (I know the speaker:)), and I am looking forward to finally hearing it!
Please consider virtually attending! Hope to “see” you there!
P.S. Yes, it’s the same date as Chicago PUG meetup, but different time!
Do you have some “dead” users in your production database? Users that belonged to former employees who are long gone, but whose Postgres users still maintain a ghostly presence in your environment? Just be honest :). I know that all of us should have some process in place to drop a user when somebody leaves the company. Most of the time, we do it promptly. But sometimes…
Sometimes, when we try to drop a user, we get an error: that user owns some objects. And there is no “DROP CASCADE” option for a user. Moreover, most likely, you do not want a cascading drop anyway, because that user might own some critical objects you do not want to lose!
Yes, we should not have such situations in the first place, and you might even have a policy stating that objects should never belong to individual users, only to group roles. But things happen. Maybe somebody was in a hurry fixing production issues. Maybe something was created “temporarily.” In any case, you can’t drop this user. You need to reassign the objects the user owns to the correct user/role. You need to revoke permissions. Provided you can understand what in the world these permissions mean!
Several weeks ago, I was facing one of these situations. I successfully reassigned objects and revoked permissions, except for a couple. These last ones were weird, and I could not revoke them no matter what! I would issue a REVOKE command, it would complete successfully, but the permissions still stayed! Apparently, I did not know what I was doing :). I had to rely on the last resort – ask Grandpa Google. And I quickly discovered a command I had never used before: DROP OWNED BY. Looks like exactly what I wanted, almost DROP CASCADE, right? Wrong! Because in contrast to DROP CASCADE, it won’t tell you what exactly it dropped! When you issue DROP CASCADE in a transaction, it tells you which objects it is going to drop, and you can make sure you are not accidentally dropping something you didn’t mean to drop. But this command executes in dead silence!
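For the record, the overall sequence looks roughly like this – a sketch with a hypothetical role name `doomed_user` and a hypothetical group role `app_owner`; running it inside a transaction at least lets you inspect the catalogs before committing:

```sql
BEGIN;

-- Hand everything the role owns over to the role that should own it:
REASSIGN OWNED BY doomed_user TO app_owner;

-- Drop what is left. After the REASSIGN above, what remains is mostly
-- the privileges GRANTed to the role - DROP OWNED BY also revokes those,
-- which is what finally clears the "unrevokable" permissions.
-- Note: this is the silent one - it reports nothing about what it removed,
-- so check pg_catalog (e.g. information_schema.role_table_grants) first.
DROP OWNED BY doomed_user;

-- Now the role itself can finally go:
DROP ROLE doomed_user;

COMMIT;
```

One more caveat worth remembering: REASSIGN OWNED and DROP OWNED act only on the current database, so if the role owns objects in several databases, the dance has to be repeated in each of them.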
I can’t even describe how scary it was to actually press ENTER:)
I remember when I learned about the JOIN_COLLAPSE_LIMIT parameter in Postgres. And I remember the unfortunate circumstances of that encounter. Specifically, it was set to eight, which caused a relatively simple SELECT to be executed in a completely wrong way.
Since I didn’t know anything about that parameter at the time, it looked to me like the order of joins in SELECT statements in Postgres matters, which sounded very wrong and the opposite of what the documentation said.
But then I learned about JOIN_COLLAPSE_LIMIT, and suddenly all the execution plans magically fixed themselves after I changed it to 12. I performed a lot of experiments back then trying to find an optimal value for that parameter, and at the end of the day I decided that no matter how many tables I am joining, changing it to anything above 12 does not help. Either the execution plan does not improve (you have 16 tables in the join, so you think that changing the limit to 16 will fix the query – and it does not), or, if you change it to more than 20, planning takes too much time.
However, a couple of weeks ago, somebody requested my help in query optimization. The problem was that it was not a single query, but rather a query framework, and a person would run several queries in sequence, each time adding or removing search criteria that could belong to any of the tables involved in this SELECT.
Yes, “if it were me,” I would write the SELECT statement differently each time, but that was not something I could suggest to the person who asked for my help. That would be too complicated. I looked at the query, estimated the number of joins, and suggested that at the beginning of each session they execute a SET statement raising join_collapse_limit.
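The statement in question was along these lines (12 being the value my earlier experiments pointed to; the right value for another workload may differ):

```sql
-- For the current session only: allow the planner to reorder joins
-- across up to 12 items instead of the default 8
SET join_collapse_limit = 12;
```

Because SET without SESSION/LOCAL qualifiers affects only the current session, this is a low-risk suggestion: it doesn’t touch the server-wide default, and it covers every query the person runs in that session, no matter which search criteria they add or remove.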
And it worked!!!! I do not know what’s so funny about the fact that it worked as expected… but I was really amused 🙂
It might take several more days till the new ll_bitemporal_correction function appears in the pg_bitemporal GitHub repo, but it is ready. I finished development and basic testing, and it works.
It means nothing for those who do not care about “the state of bitemporality,” but I know that some people care. For those people, I am happy to report that I have finally delivered what you guys have been asking for for a long time. Now you can make a bitemporal correction not just to the last effective interval, but to any stretch of time. It may cover several historical intervals; it may start in the middle. It can’t end in the middle, because I didn’t come up with any business case for such a situation, but if there is demand, I am sure I can provide :). Overall – I am very happy about this result :). Stay tuned – I will post when it is officially available.
On a different note – we are almost PG 12 compatible; only bitemporal referential integrity still does not work, but it will be fixed shortly.
Huge thank you to everybody who has shown interest in this activity:)
For those who are considering attending PG Conf NYC 2020: the last day of the early-bird discount is January 31! The program (or at least most of it:)) is up, so you can decide which day(s) you want to attend if you can’t make it for the whole week. There will be tons of interesting talks; I am already upset I won’t be able to be in three different rooms at the same time 🙂
For those who are interested in my presentations, here is the most complete information so far:
Boris and I will be doing the same Ultimate Optimization Training that we presented in Chicago in December. The event was sold out then, but it is not sold out yet in NYC :). Our training will be on the last day of the conference, Friday, March 27.
I will be presenting the talk about building local communities (specifically, how to build a Postgres User Group) on the second day of the conference.
I won’t be presenting either the bitemporal library or NORM, but I was invited to talk at another event during the same week, where I will present the NORM talk (extended from my Cyprus presentation). I will publish all the details as soon as they are finalized.
Once again – there will be an amazing number of interesting presentations, and many experienced people to talk to (I already have a list of twenty:)), so please consider attending!
Before I went to this conference, I was resentful that the gap between applications and databases would never be closed. Even at a conference focused on both data engineering and software development, there was barely a place for me, and our talk barely got accepted.
I have to admit, I didn’t explore the program much before coming, because I had a never-ending work crisis, and we had to rework our presentation several times.
But when I took a closer look, I realized that I was way more interested in the SE sessions than in the database sessions. On day four, I could not miss a single moment, and I had several interesting conversations with the speakers.
It turned out that most of them were not even present on day one, when I was giving my talk. And they said they would have loved to come if it hadn’t been on the first day.
Now I am wondering whether I did it right, never trying to present my work at SE conferences. On the one hand, I always say that my success wouldn’t be possible without such an incredible backend team. On the other hand, I routinely say that inefficient programming is all application developers’ fault. That is not true.
One of the talks was about refactoring techniques, and after the presentation, I asked the speaker whether he had ever considered database access as a refactoring criterion. He replied that one of his colleagues had tried to explore this option but found it challenging: the queries appeared so entangled, so difficult to extract, that it led to nothing. I told him about my work and suggested that we would love to collaborate if he found it interesting. He said he would take a look, and then also mentioned that usually the database people do not collaborate. He mentioned the lack of constraints and the unwillingness to use views. I said that views are horrible because most of the time they decrease performance. But they provide a level of abstraction, he suggested. I replied: there are better ways! He said: well, then give them to us!
This exchange made me think that I am not explaining myself enough when talking about NORM. And if I know that successful implementation depends on cooperation with app developers, I should advocate for them.
My friends and colleagues who have followed me for a while are well aware that my papers about ORIM/ORM were rejected at several conferences. Also, it’s a known fact that this kind of submission is not welcomed at conferences in general. My recent work, which was implemented at Braviant Holding with what I would call smashing success, was presented only at Postgres conferences. Our three attempts to submit it to any CS/academic conferences failed. At some point, it became a matter of principle.
We submitted the new version to SOFSEM 2020 in the hope that at a DB/SE conference, the topic would feel more appealing. The submission format required us to cut the paper almost in half, but we were still hopeful we would be able to convey the message. That ended up being only a partial success. The paper was accepted, however, only as a short paper. Reading the reviews, we realized that none of the reviewers understood what it was about (in fact, we got much better reviews from VLDB, where we got a weak accept, which ended up being not enough). To make the paper even shorter, we had to remove all the actual work from the body:), leaving only the motivation and the conclusion. But as I said, it was a matter of principle.
Then we spent quite a bit of time reworking the presentation. First, Boris wanted it to look more academic than my older Postgres presentations. But then we started changing it back to be more and more visual, since we realized that the topic, the motivation, and the reality are pretty much unknown to our audience. Besides, my Postgres presentations were 40–50 minutes long, and here we had only 30 minutes, including questions. I tried very hard to get it down to 23–25 minutes.
The last blow came two days before the conference when we were told that the presentations for short papers should be 18 minutes long, including questions, which meant I had only 15 minutes to present. That did not sound feasible, and we were desperate. Boris tried to negotiate our twelve minutes back with the organizers:).
But then a sort of miracle happened. Right before our session began, it turned out that the first speaker was a no-show, and I claimed all the time I could. Unfortunately, I had already cut several slides out of my presentation (I put them back in the pdf, which I previously published here). In the end, I was able to get my point across, and I think that everybody present enjoyed the presentation. How I wish I hadn’t been presenting on the first day, because then I would have convinced more people to come. But in any case, I hope that now we will be able to publish a full version of the paper somewhere.
A preface: as usually happens with my favorite topic, neither database conferences nor software engineering conferences acknowledge this kind of research. Thereby, although the paper was accepted, it was accepted only as a short paper (because nobody understood what it was about :)).
Moreover, two days before the conference we were asked to shorten the presentation to 15 minutes. Fortunately for me, there was one no-show at our session, so I was able to present in full. Below is a full presentation version including two slides which were not in today’s presentation.
For those who didn’t get a chance to attend our Ultimate Optimization Training in December: we are doing it again at PG Conf New York, March 23–27. This is definitely not the only reason to attend the conference, but something of an extra bonus :). Please go to the conference page for the details: PG Conf 2020