Monthly Archives: July 2014

Let’s talk about normalization…

Never in my life I would imagine myself writing a blog post about such a scholastic subject, as normalization.

First, we all know what normalization is, we did study the normal forms in our first Database theory class (or SQL Primer, or whatever). Second, after we learned about them, we happily forget all the definitions – because why in the world we may ever need them?!

Well… that’s what I see often happening these days. Fist, I see that the word “normalization” is widely used to describe the things, which are not. And not like it is something to care about, but… but here comes “second”. And second – people often say “this schema is bad, because it’s not normalized, thereby we should normalize it”

This being said – let’s talk about normalization.

We do not really need to know the strict definitions of the first, second or third NF; what we need to know – why they were invented, what they do, and what problems they help to solve (so, if somebody tells you otherwise, ask them, whether they can tell the differences between 3NF, BCNF and 4NF:)). And the problems they help to solve, are the 3 anomalies – INSERT, UPDATE and DELETE anomalies. That’s how redundancy manifests itself. We do not want to deal with these anomalies – we normalize the schema. We do not normalized for optimization purposes, neither we normalized for the sake of saving space – actually,normalization can either increase or decrease the total disk space usage.

In process of normalization we remove some types of functional dependencies between relation attributes. At the end we want all attributes to depend on the relation PK only, but at the same time we do not want to continue to decompose a relation into smaller relations, if there are no “bad” functional dependencies left already. If we do so, we will end up with the “vertically partitioned” schema, rather than the “normalized” schema. It’s important to remember, that when we create a lookup table, for example, a lookup for loan types, and instead of having the words “payday”, “installment”, “line of credit” in the loan_type attribute, we have loan_type_id as 1, 2 or 3, which refers to the table loan_types (1 – payday, 2 – installment 3- line of credit) we actually do not normalize anything.

We do “vertical partitioning”, or, if we want to operate in the realm of relational model rather than an SQL model, we define a domain of values. Which can be done either by creating the “lookup table”, or by introducing a check constraint. If we want to attach some other information to the loan type, it will be a different story, because then we’ll have deal with INSERT-DELETE-UPDATE anomalies.

Continue reading

Advertisements

Leave a comment

Filed under SQL

Why in the world I did it?

Next day after I was assigned my new role at Enova, some of my friends asked me, or rather stated: now you won’t have time for programming! How do you feel about it? Won’t you miss the real work? And my old friends probably do remember that I used to say: I do not want to be a manager exactly for that reason – that I do not want to give away the actual work.

So – why in the world I did it?..

First, I didn’t quite give away the “normal work” entirely, I just program less than I used to. Second is more interesting. Yes, I love to produce “the real stuff” that works. But even more important, or at least equally important is to see the results of your work implemented in the “real life”, to see your stuff working. And very often, when you do not have any authority it’s difficult to convince people to execute your way, even when they trust you and believe you, that “your way is the right way”.

In other words, as one of the OMD applicants put it in her essay, “I’ve realized, that if I do not want to play by other people’s rules, I need to become a person, who makes the rules”. Or, yet another way to say the same thing – the way I said to one of my former co-workers: I have to do “big things” in order to be able to do my “small things”.

Or yet another way to say it: my new position gives me a way to make “real things” actually real!

Leave a comment

Filed under People, Team and teamwork, Workplace

My new role at Enova

I do not even know how to start… For the past two months, or even more I took over a new set of responsibilities, or, to be more precise, starting from the very simple statement: I want my job as a database developer for my squad to be done right, I gradually drifted into assuming more and more responsibilities, because in order to do my job right, something else should be done right, and… you got the idea. I’ve ended up talking with dozen of people from different departments, calling the meetings “just to clarify one question”, storming into other squads meetings, “because you are going to do something with the database, and I do not know, what exactly”….

So now I am officially “a Technical Lead for all Database Developers within the Software Engineering Department”. I’ve invented this role myself. Invented, because I felt a need for such a role. Invented, because my co-workers were asking me to take a leading role in the database development.

I’ve being doing so much of all this recently, that at first I didn’t see something really changing: I had my dynamic list of action items, I continue to carry on the same projects, big and small… yet on my first day in this new role I’ve spent nine and a half hours in the office, including lunch, during which I was teaching my class. When I was on the train back home reflecting on today’s day, I’ve listed all meetings and other interactions with coworkers, and small projects I was able to complete today, and they totaled to 20! I was completely exhausted by the end of the day, but I really felt I did lots of good stuff.

So, what have changed? The fact, that now it’s not an option and not a matter of my choice – whether I am taking over the “big things”. Now I just can’t stop doing those “big things”. It’s kind of frightening and exciting at the same time.

I know exactly, what I want to accomplish in my new role: I really want to bring the quality of our applications to the all new level. To promote the highest standards of development. And I know, what I am really good at (in addition to the database optimization :)) – I am good at talking to people, explaining why I am making my decisions, why I believe, that “my way is the right way”.

I am also thinking, that I’m immensely lucky to have all the trust and support from the majority of our developers. Nothing would work otherwise.

Also, I can’t stop reflecting on our never-ending discussion regarding gender inequality in IT. During the past several months I’ve learned to be persistent, to promote myself shamelessly… nothing I ever did before. And if I won’t I would never find myself in a position like my current one.

I guess… good luck to me.

8 Comments

Filed under Companies, People

Sometimes people just do not know what are they writing about…

Some time ago my manager shared with me a couple of links about “current state of data and databases”. One of them turned to be a perfect illustration to how we get where we got. This article The definitive guide to the modern database actually describes the current status of the databases, and what it out there.

One paragraph caught my attention:

This is the case when database engine and application are scaled-in together on the operating system level, so both of them use the same data without transmitting it through proxies. Also, a benefit of such approach is the increasing ease of use — monotonic code to deal with wrappers is no longer needed, and for some users this can become a more decisive factor than performance.

The funny thing is, that this is EXACTLY what led to our performance problems in my company!

Granted, the author never experienced how bad an OO code “can” become

2 Comments

Filed under Data management, Systems

How I managed to beat the Postgres optimizer…

Remember my post about 3 weeks ago where I told you guys, how I optimized “everything” for about 1,500 times? Well, apparently, it was not enough :).

Specifically, I’ve managed to optimized several reports, which ran often, each taking up to 11 minutes(!) to about 60 milliseconds. However, there were several other reports, which also ran quite often, and ran… well, for about 1.5 – 2.5 sec. And I knew, they could run faster!

So as a first step I’ve decided to create a function, which as a first step would pre-select the id for a specific report type and report field. This would allow me to avoid two joins in the final SELECT, which when generated, looked like this:


select distinct person_id from
reports r
join report_values rv on r.report_id=rv.report_id and report_type_id=9
and rv.report_field_id=250
and rv.value in
(select rv1.value from reports cr1
join report_values rv1 on rv1.report_id=r1.report_id and
rv1.value '' and r1.report_type_id=9
and rv1.report_field_id=250
and person_id=p_person_id)

Continue reading

1 Comment

Filed under Data management, SQL