ICDE 2016: Around NoSQL

ICDE 2016 finished more than a month ago, and unfortunately I didn’t post even half of the things I wanted to post about the conference. In the next couple of days I will try to explain why that happened (although some of my friends already know). But there was one topic I really wanted to cover, so I decided to finish this blog post even six weeks later.

NoSQL.

OK, I understand that it’s a buzzword these days. Moreover, I am even aware of a limited number of situations where using NoSQL makes sense. But it bugs me immensely when I hear something to the effect of “yeah, there’s not much you can do with a traditional RDBMS; in order to achieve high performance you need to switch to NoSQL.” These statements are related to “oh, you need to build a datamart – then you need to use Redshift! It’s specifically designed for datamarts!” Or – when I was trying out a piece of third-party software about a month ago, to see whether it would help me consolidate data from different sources, I realized while observing the error messages that it was using Hadoop in between!

So – when I was listening to the presentation “NoSE: Schema Design for NoSQL Applications” from the University of Waterloo, I could not stop wondering… The presentation was about creating virtual schemas for a NoSQL framework, and at the end I asked a question. Or rather, I made a statement :) I remember the times when relational databases were not the industry standard. I remember the times when you would try to predict which queries would be the most frequent and model your sets or your hierarchy accordingly. And I also vividly remember how horrible it was when it didn’t work right, and therefore I remember exactly why everybody embraced the relational model.

So my question is – why are we going back in time?

You know what was interesting? A number of participants of different ages and backgrounds approached me after this session and expressed their full support :))

Filed under Data management, talks

4 responses to “ICDE 2016: Around NoSQL”

  1. Somehow I only came across this blog post now. I will start by saying that I completely agree that there is a limited set of scenarios where NoSQL databases are useful. However, I think those scenarios do exist, and currently the experience of trying to design a schema for them is pretty messy. The goal of NoSE is to try to improve the situation for those who have decided to make the move to NoSQL.

    That said, I think there is also something to be said for providing a higher-level data model on top of a simpler data store, which is another way of looking at what we’re doing with NoSE. NoSQL systems were to a large extent designed to address scalability issues with relational databases that existed at the time, but the relational model is still incredibly useful.

  2. Thank you for stopping by and commenting! I do not really promote this website, so no wonder you didn’t come across this post earlier. I’d like to comment on literally one word you’ve used here: “people DECIDE to make a move to NoSQL”. We (collectively) work in a very special field – it is definitely a science, but it would be fruitless if not connected to practice. That is why I learned a long time ago not to use “I want” as a justification for my technical decisions. Because it’s not about “what we want”, but about “what is good” for a project, a company, a customer.

    I know a limited number of cases (tasks, situations, whatever) where I think the usage of NoSQL is justified, but in all of these cases there is definitely no need to have a schema definition over NoSQL. And this is, to a certain extent, “by definition”. I am curious whether you can describe any particular example where the usage of a NoSQL database would be BETTER (performance-wise, resource-wise, etc.) and you’d STILL need a schema over it.

    • You ALWAYS need a schema for a NoSQL database. But let me clarify what I mean by schema. Your application needs to have some well-defined way of storing data. For example, in a key-value store you need to decide what the keys and values will be in order to make sense of any of the data. This is what I mean by “schema” even though the key-value store just treats keys and values as blobs. The choice of this schema is the most critical aspect of performance.

  3. Yes, I remember that you actually meant the path, which was precisely what reminded me of hierarchical databases, which were no doubt the most performant at that time (30+ years ago). And that’s where the rest of my comments (from the post) come from.
    If I remember correctly, there were lots of published comparisons of SQL and NoSQL database performance; but just as with medical research on the “goodness” of tea vs coffee, the results depend on who pays for the research :)))
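
To make the key-value “schema” point from the reply above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `KVStore` is a hypothetical in-memory stand-in for a key-value store, the order data is made up, and the key formats (`order:<id>` and `customer:<name>:orders`) are just two possible choices. It is not how NoSE or any particular NoSQL system works; it only shows that the same data, keyed in two different ways, makes the same query either a full scan or a single lookup.

```python
import json


class KVStore:
    """Hypothetical in-memory key-value store; keys and values are opaque strings."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # number of values fetched, used as a rough cost measure

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        self.reads += 1
        return self._data.get(key)

    def keys(self):
        return list(self._data)


# Made-up sample data for illustration only.
ORDERS = [
    {"order_id": 1, "customer": "alice", "total": 30},
    {"order_id": 2, "customer": "bob", "total": 15},
    {"order_id": 3, "customer": "alice", "total": 50},
]


def load_by_order(store):
    # "Schema" A: one entry per order, keyed by order id.
    for order in ORDERS:
        store.put(f"order:{order['order_id']}", json.dumps(order))


def load_by_customer(store):
    # "Schema" B: one entry per customer, holding all of that customer's orders.
    grouped = {}
    for order in ORDERS:
        grouped.setdefault(order["customer"], []).append(order)
    for customer, orders in grouped.items():
        store.put(f"customer:{customer}:orders", json.dumps(orders))


def orders_for_customer_scan(store, customer):
    # With schema A, the query has to read every value and filter.
    result = []
    for key in store.keys():
        order = json.loads(store.get(key))
        if order["customer"] == customer:
            result.append(order)
    return result


def orders_for_customer_lookup(store, customer):
    # With schema B, the query is a single read of a known key.
    raw = store.get(f"customer:{customer}:orders")
    return json.loads(raw) if raw is not None else []


store_a, store_b = KVStore(), KVStore()
load_by_order(store_a)
load_by_customer(store_b)

by_scan = orders_for_customer_scan(store_a, "alice")
by_lookup = orders_for_customer_lookup(store_b, "alice")
print(by_scan == by_lookup, store_a.reads, store_b.reads)
```

The trade-off in this sketch is the one discussed in the post: the second layout is fast only because the query was anticipated when the keys were chosen. A different query, such as fetching a single order by its id, would favour the first layout instead.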
