Beware of NULLs!

Once again, my apologies for those of my readers, to whom this is trivial… but I think that sometimes something which is trivial to the just-out-of-college person becomes sort of faded after years in industry…

Nevertheless, here is what happened the other day. One of my co-workers rightly noticed, that a medium-size SELECT statement was not using the proper index because the condition was:

field1 =NULL

The problem was actually more than not using the proper index; this condition was a part of a relatively big OR expression, so if this one was not using any index, the bitmap OR-ing could not be used! The proper solution would be one of two: either to use IS NULL or to use =”, in both cases the index will be used as expected.

When this solution was proposed to the query owner, he asked us why Postgres optimizer “does not understand” what it should do. And here comes this “obvious” part, which I want to re-iterate.

As we all remember from our “Introduction to RDBMS” college course, we should never use =NULL, because this is “undefined” expression, which means, that any DBMS can implement it differently, and implementation does not have to be consistent between versions. The correct syntax is “IS NULL”. Granted, we all are used to the fact, that most of industrial databases have some deviations from ANSI standard. That’s why we always need to test, how it actually works.

So, here we are:

select substring ( '222' FROM 4 FOR 7 )=NULL

returns boolean NULL; and


select substring ( '222' FROM 4 FOR 7 )=NULL OR ('aaaa'='bbbb')

also returns boolean NULL, but the query “meant” to return ‘true’ in this case!

Note, that if we the second condition would be true, the whole expression would be true, since “undefined” OR true is “true enough” 🙂

Actually the more interesting thing is, that even with IS NULL this won’t work as expected! Try

select substring ( '222' FROM 4 FOR 7 ) IS NULL

and

select substring ( '222' FROM 4 FOR 7 ) =''

The first one will return false, and the second true! You know why? Because in Postgres the empty string is not the same, as NULL string!

The moral of the story: trust, but verify, and – be very careful with NULLs!

Advertisements

Leave a comment

Filed under SQL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s