The World of Data

May 8, 2024 · 6:39 pm

I will be speaking at POSETTE: An Event for Postgres!

Tuning Parameters vs. Tuning Queries is yet another talk that “almost made it” to a couple of conferences but ended up being on the reserve list, and I never had to present it anywhere except for Chicago PUG. And one more time, my friends at Microsoft are giving me a chance:)

I used to be skeptical about online conferences because I had a couple of disastrous experiences, but not with that team! This year, I am happy to do it again!. On June 11, my talk, Tuning Parameters in Postgres vs. Tuning Your Queries, will be streamed at POSETTE 2024. Tune in on June 11 at 8-30 AM CT!

Leave a comment

Filed under events, talks

Tagged as conferences, PosetteConf, postgresql

May 5, 2024 · 8:22 pm

(Bi)Temporal Tables, PostgreSQL and SQL Standard

At PG Day Chicago, I presented an extended version of my talk given last year at Citus.con – Temporal Tables and Standard. Just between the time my talk was accepted and I delivered the presentation, I learned that PG 17 would include the first-ever support of an important temporal feature: uni-temporal primary keys and unique constraints.

It has been a while since the last time I presented anything temporal-related, which meant that many people in the audience hadn’t heard anything about the bitemporal model before. There was no way I could cover everything in 40 minutes, and many questions, which were asked both during the Q&A and later in the hallways, remained unanswered.

In this blog, I will address some of these questions and expand on what I would like to see in the upcoming Postgres implementation of temporal tables.

Bitemporal framework and GIST. The key feature of the bitemporal model is its dependency on existing PG extensions, specifically on GIST indexes and GIST with EXCLUSION constraints. In fact, the GIST extension does all the work needed to support (bi) temporal primary/unique keys: it ensures that there is no time overlap on any two distinct values of the primary key. In the bitemporal model, we check the same thing for the two time dimensions. For those who never needed GIST indexes, here is the relevant documentation. I learned about GIST when I first started implementing bitemporality, and I could not believe all my needs were already met!
Disk space requirements. For some reason, people believe that keeping all versions of each tuple requires “too much disk space.” I won’t deny that you need more space to store row versions than when you don’t; however, how much more is often overestimated. In my talk at PG Conf.EU 2022, I presented an example of storing changelog vs. storing data in a bitemporal model and demonstrated that it actually takes less space while allowing queries to be executed much faster.
Excessive IO. One of the questions I was asked was whether the use of bitemporal model increases the system IO. The answer is surprisingly insignificant. Let’s look at the database operations. INSERT is the same insert, whether it is temporal or not. The non-temporal update is equal to one INSERT and one DELETE. The uni-temporal UPDATE results in one INSERT and one UPDATE; in other words, two inserts and one delete. The bitemporal UPDATE is equal to two inserts and one update; in other words, it is equal to three inserts and one delete. That means the number of costly operations remains the same as with regular updates. Also, note one remarkable fact: the only field that changes in the updated record is the time interval. That means that 1) the record size is not going to change 2) since GIST index is an R-Tree, the intervals order is defined by inclusion. When we update the time range, the only thing we are doing with it is making is smaller (end-dating), thereby the node in the index will never move, which means that GIST indexes in this case will (almost) never experience a bloat. As for all regular B-tree indexes, all of the updates in the temporal models are HOT updates.
Why is the first temporal feature in PG 17 so significant? Having temporal primary/unique temporal keys in PG 17 might seem insignificant – after all, that’s what GIST with exclusion does anyway. However, one of my huge asks for many previous years was the ability to see temporal keys in a table description. I’ve invented a lot of tricks (mainly having an “empty” check constraint) so that I would be able to identify temporal tables using the \d psql command. Now, we can do it.
System time or application time? Now, I am going to switch to my questions and concerns about “what’s next” in the area of temporal tables support. When I first heard about the temporal key in PG 17 at SCaLE, I immediately asked the presenter when the second dimension would be added, to which he replied – very soon. We are actively working on it, and we are going to implement everything in the standard. That means, among other things, that Postgres should distinguish between SYSTEM_TIME (as per standard) and application time, and I do not see this distinction in the ongoing discussions.

Why is this important?

The SQL standard requires adding semantics to DML, adding a “FOR PERIOD” clause to all commands (and assuming CURRENT if it is omitted, to keep the old code functioning). However, for the SYSTEM_TIME, “FOR” is irrelevant, because system time is defined as “transaction time”, so it can only start at the “present” moment, and for any tuple in a temporal table, it can be either in the past or now. As for application time, it is not bound to a transaction, and “FOR” can be defined as any past, present, or future time period. In both cases, “update” is not a regular update but a sequence of inserts and updates, as I described in 2). And when we define temporal referential integrity, we need to take these semantics into account, which I still have to see. From my perspective, this test is not correct:

INSERT INTO temporal_rng (id, valid_at) VALUES
   ('[1,2)', daterange('2018-01-02', '2018-02-03')),
   ('[1,2)', daterange('2018-03-03', '2018-04-04')),
   ('[2,3)', daterange('2018-01-01', '2018-01-05')),
   ('[3,4)', daterange('2018-01-01', NULL));
 ALTER TABLE temporal_fk_rng2rng
     DROP CONSTRAINT temporal_fk_rng2rng_fk;
 INSERT INTO temporal_fk_rng2rng (id, valid_at, parent_id) VALUES ('[1,2)', daterange('2018-01-02', '2018-02-01'), '[1,2)');
 ALTER TABLE temporal_fk_rng2rng
     ADD CONSTRAINT temporal_fk_rng2rng_fk
     FOREIGN KEY (parent_id, PERIOD valid_at)
     REFERENCES temporal_rng;
 ALTER TABLE temporal_fk_rng2rng
     DROP CONSTRAINT temporal_fk_rng2rng_fk;
 INSERT INTO temporal_fk_rng2rng (id, valid_at, parent_id) VALUES ('[2,3)', daterange('2018-01-02', '2018-04-01'), '[1,2)');
 -- should fail:
 ALTER TABLE temporal_fk_rng2rng
     ADD CONSTRAINT temporal_fk_rng2rng_fk
     FOREIGN KEY (parent_id, PERIOD valid_at)
     REFERENCES temporal_rng;
 ERROR:  insert or update on table "temporal_fk_rng2rng" violates foreign key constraint "temporal_fk_rng2rng_fk"
 DETAIL:  Key (parent_id, valid_at)=([1,2), [2018-01-02,2018-04-01)) is not present in table "temporal_rng".

(It is very probable that the link that was sent to me does not reflect the current status, so I am holding off my judgment until I double-check; however, that’s a good illustration of the importance of operations semantics)

5 Comments

Filed under SQL

Tagged as bitemporal, postgresql

April 28, 2024 · 11:16 pm

Hey Chicago – We did it!!!

The PG Day Chicago 2024 is over. For the last two days, I have been reading my LinkedIn and X feeds, which are full of mentions of PG Day Chicago. Everyone is happy and thankful.

Although I am busy “liking” all of these posts—not just “liking” but truly liking them—I didn’t blog anything myself. Two days later, I finally organized my thoughts to talk about that event.

I want to start by saying a big thank you to everyone involved: organizers, speakers, sponsors, volunteers, and attendees! You were all essential to the conference’s success. And although there are definitely some things that I did wrong, I wanted to focus on things that I did for the first time his year and which, I hope, were done right.

The best decision I made was inviting Karen Jex to be a CfP committee chair. Her support was incredible, and I am forever thankful to her! (Of course, she’ll have to be the CfP char again :)).

We had fourteen student volunteers in addition to experienced staff, and everybody mentioned their excellent work
Three participants received Travel scholarships from Pg.Us Diversity and Inclusion committee
We distributed 6 vouchers from Postgres Women
We were able to redistribute six sponsor’s vouchers which they were not going to use and pass them to students.
… and all of the students were very thankful for the opportunity!

And that’s what makes me especially happy: the number of participants for whom it was the first Postgres conference ever and who recognized it as a unique opportunity to get to know the Postgres community, to be exposed to the incredible amount of knowledge, and to get inspired.

Leave a comment

Filed under events

Tagged as conferences, postgresql

April 24, 2024 · 7:17 am

Raise your children well :)

You know you’ve done well as a parent when your daughter knows how much better Postges would perform in comparison with Big Query:

Leave a comment

Filed under People

April 16, 2024 · 7:58 pm

PostgreSQL 17 features I am most waiting for

I won’t be able to attend Magnus’ talk at PG Day Chicago because it is scheduled at the same time as my talk, but fortunately, I attended his talk at SCaLe. There are several features I am really excited about, and I can’t wait to start using them!

Event trigger on login. Yes, yes, I know, Magnus told us it’s a footgun, but I promise, I know how to use it safely! I have so many potential usages!!!
Error handling in COPY. Since I started my new development with pgBadger, my biggest fear is that if just one record is somehow wrong, the whole file won’t load, and I can’t do anything with that – except for that now I can!!!
PgChangePassword!!! I have security-defined functions to handle that because what could be more natural than giving the user an option to change their own password?! Well… except for that, way too often, there is one user for the whole department… but we will address it later!
Redundant NOT NULL removal – that’s great!!! Such a time-saver! And again, what can be ore natural?!
Self-join removal!!! I can’t even tell how many times I fought with this problem starting from my Sybase days and all the way through my Oracle days! How many times I had to explain this “magic” to people! And now, finally, I will be able to remove this explanation from our performance book!
You might think that my favorite feature is the temporal primary key – well, it is; however, I have already started to worry about what’s next. The temporal PK is implemented just as it should be, but what I see in the future code temporal foreign key makes me worried. And nobody answers my concerns, so I do not know how loud I should scream!

Summary: if you didn’t hear this talk yet, please go to this talk at PGDay Chicago – I will be happy to lose this competition :))

4 Comments

Filed under news, SQL, talks

Tagged as conferences, postgresql

April 16, 2024 · 7:19 am

Recordings from SCaLe

I didn’t know that the tallks were streaming, and then I didn’t know that the recording were still available, and then I forgot to post! My talk is in the beginning of this recording.

Leave a comment

Filed under events, talks

Tagged as conferences

March 31, 2024 · 12:19 pm

Pg Day Chicago – the talks I am most excited about (part II)

Since one blog post was not enough, here comes part two of my persuasive essay about the upcoming PG Day Chicago.

I am very grateful to Bruce Momjian, who consistently prioritizes Chicago events:). It will not be the first time I listen to his presentation Beyond Joins and Indexes, but if time permits, I will be there in the first row asking my questions:).

Derk van Veen started presenting at Postgres conferences relatively recently, and each time, he shares real-world stories (or, should I say, technology mysteries?) and his journey to their resolution. I am excited to hear his talk How to detect and fix corruption.

Having recently focused on Postgres permissions and security, I am always happy to see more talks on that topic. Even though I have heard Ryan Booz’s presentation before, I am really happy that more people will have the opportunity to attend. The more people understand that assigning correct permissions in Postgres is not an easy thing, the fewer applications will be connecting to databases as postgres user.

Divya Sharma is a great speaker; her presentations are always engaging and attract a lot of attendees. Her talk at PG Day Chicago is about common performance problems, and I am very interested in her approach!

I have some (a lot) of experience in converting MSSQL Server and Oracle DBAs to Postgres DBAs, but I never heard of MySQL DBA becoming Postgres DBA, and I am wondering what the differences are – Dave Stokes is going to tell us in his talk.

There are still more talks I didn’t mention in these two posts, which does not mean they are less interesting! If you have your favorites for the upcoming PG Day Chicago, please leave your thoughts here!

Leave a comment

Filed under events, talks

Tagged as chicago, conferences, pgday

March 29, 2024 · 2:08 pm

PG Day Chicago – the talks I am most excited about (part I)

I am really excited about the PG Day Chicago schedule! As I mentioned multiple times, I am immensely grateful to our amazing program committee, especially to the committee chair, Karen Jex. Now, I want her to chair all our CfP committees (at least as long as I am in charge:)).

There is no talk in the whole schedule that I would skip. Unfortunately, even if I were not running around, it would still be impossible to attend three different talks at the same time. In any case, I wanted to share my favorites with my readers and give you another reason (or multiple reasons) to attend.

An Extension for Autocorrection of Planner’s Errors addresses a topic I am very interested in: there is a well-known area where PostgreSQL query planner is not great: analyzing long sequences of joins and estimating their cardinalities (or, should I say—misestimating). I developed the whole arsenal of tools to combat this issue, so I am very interested in new developments in this area.

I missed Claire’s talk about Amazing Postgres extensions at PG Conf NYC, but I will do my best to attend it this time. According to those who were able to attend, it was great!

I have known Andrew Atkinson for several years, and he is your dream application developer: the one that does not consider a database “just a persistency layer.” The one that understands what indexes are for, and most importantly, the one who develops applications that interact with a database efficiently. And in his talk, he will demonstrate the path to excellence.

If you haven’t gotten a chance yet to learn about the new features of PostgreSQL 17, make sure to attend Magnus’ A Look at the Elephant’s Trunk – there are many new features to be excited about. And if you already heard that talk at one of the previous conferences, then you can attend my talk, Temporal Feature in Postgres and SQL Standard. I submitted this talk to at least a dozen conferences, but the only time it got accepted was for the P99 online conference. I got extensive positive feedback, but it didn’t increase the acceptance rate:). That being said, I am very grateful for this talk acceptance, especially now, when we are so close to implementing bitemporality in Postgres, and I am so eager to make sure it is done right!

If you have ever met Dian Fay, you know that you should attend her talk! And if you have never met her, just trust me: that’s the talk you do not want to miss! Dian previously presented Exploring Postgres with Graphs at the Chicago PUG meetup in November 2023.

I also want to hear Jimmy Angelakos talk about the practical usage of PostGIS because I have zero practical experience, and customers are asking about it!

I attended the first presentation of Trying to be Barbie at PG Conf EU. I heard people questioning whether the “non-database” talks should be presented at Postgres conferences, and I say: absolutely yes!

Now I realize that I covered less than half of the schedule, so I am making this post “part I,” and “part II” is coming!

Leave a comment

Filed under events, People, talks

Tagged as chicago, pgday, postgresql

March 16, 2024 · 1:14 pm

The Security Talk Maiden Voyage

First of all here are the presentation slides:

security_scaled Download

The animation is off, so some things are definitely lost, but the essential part is there. Now that I have presented this talk to an external audience, I know what exactly I want to change! I am going to submit this talk (again!) to several conferences, and I am going to work on the “middle part. The problem I saw with this first-time presentation was that I knew the problem too well, and I should have highlighted the reasons for having more than one security model.

Still, I was happy with the questions and the fact that several people thanked me after the presentation, and I hope that this one will not be the last one! New and improved is coming!

Leave a comment

Filed under events, talks

Tagged as conferences, postgresql, scale

March 15, 2024 · 1:54 pm

At SCaLE!

Best speakers’ gift! And fortunately, only one of them is salty!

Leave a comment

Filed under events

Tagged as conferences

	Hettie D. on (Bi)Temporal Tables, PostgreSQ…
	Bob Lunney on (Bi)Temporal Tables, PostgreSQ…
	Hettie D. on PostgreSQL 17 features I am mo…
	Andreas Karlsson on PostgreSQL 17 features I am mo…
	(双)时间表，PostgreSQL和SQ… on (Bi)Temporal Tables, PostgreSQ…

The World of Data

I will be speaking at POSETTE: An Event for Postgres!

(Bi)Temporal Tables, PostgreSQL and SQL Standard

Hey Chicago – We did it!!!

Raise your children well :)

PostgreSQL 17 features I am most waiting for

Recordings from SCaLe

Pg Day Chicago – the talks I am most excited about (part II)

PG Day Chicago – the talks I am most excited about (part I)

The Security Talk Maiden Voyage

At SCaLE!

Pages

Recent Posts

Recent Comments

Archives

Categories

Follow Blog via Email

Meta