I just hate vertical partitioning!

I get that sometimes vertical partitioning makes sense. If you almost always need just a small subset of the table columns to be selected, then why make the database to read this one wide row? Or even worse – those thousands and hundreds of thousands of the wide rows!

But when/if you choose to vertically partition a table, it’s very important to remember that the new table is not just “another table”, but it is closely related (in 1-to-1 relationship) with the rest of the vertical partitions. Otherwise… some interesting things might happen…

Let’s say you have three tables:

  • loan_header
  • loan_contract_details
  • loan_dates

Each of them represents a part of information, related to a loan as an entity. This is a classical example of vertical partitioning. However, instead of all these tables having the same primary key. they were designed in such a way, that each of them has it’s own PK, and the PK of the loan_header is listed as a foreign key. Which means, that the constraint 1-to-1 got completely lost!

Now imagine what happens, if for a very long time the key for the loan_header will be generated from one sequence, and for the loan_contract_details – from another one, but they were identical, because of in reality there is only one loan_contract_details record. And for tons of reports  you need to join to these tables – by what? By loan_id? By their own id?  Turned out that even in the situation when there was a clear indication of what should be the join criteria, many people who were writing reports where joining by wrong id! And nobody ever noticed this wrongeness! Moreover, since the id’s were somewhat “very close”, it didn’t ever happen that somebody would receive reports without results – they would receive reports with wrong results!

… and you can only imaging how long it would take to figure out this sequence of events in the middle of the night, when the only information you have is something like “some loans have wrong owners”!



Filed under Data management, SQL

2 responses to “I just hate vertical partitioning!

  1. Looks like someone was terribly wrong in the very beginning of this long story…
    Well, I have dual part ID for all the users (Project + UserID) and Project has almost no significant meaning for ~6-7 years already, but all of ~1000 different requests to database has to make join on Project AND UserID, everywhere.

  2. The problem was that is a third-party tool, and the idea is that they can serve different client with different needs, and too generalized decisions are virtually always not the optimal ones 🙂

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s