Monthly Archives: August 2017

Optimizing something you can’t control

This is very much like finding Pluto! At Braviant, we use several external service providers to perform some business tasks. As I’ve mentioned in one of my presentations about our use of foreign data wrappers, this means we need to manage data that we do not really own.

But this time around the task was even more complex, and I spent weeks trying to figure out how to approach it. There is one Really Large Table on the “other” side, and to refresh the Data Mart, we need to select a small subset of records each time: basically, “all records since the moment of our last refresh”.

For some reason unknown to me, something on the way from “them” to “us” did not work, and we could not push the condition to the external site. No matter what I was selecting, what was really happening (I figured it out by observing the query behavior closely) was that the whole table was fetched from the third-party server, and only then were the selection criteria applied.

The problem looked unsolvable, because “everything worked on the other side”. Then I came up with one crazy idea. I thought: if we can’t push our condition through, maybe we can create a similar condition on the other side.

So I asked our service provider’s tech support whether they could create a view on their side that would restrict the size of the object I am selecting from. Note that I asked for just a view, not a materialized view, so the restricting query is literally executed locally on their side. Then I mapped this view to the foreign table, so there were no changes to reporting.

Yes, this view has way more records than I need (it contains the last 24 hours, while I refresh data every two hours). However, I now select from a much smaller data set, because the view contains only the last 24 hours, not the last 2 months!
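For illustration, here is a minimal sketch of what such a setup could look like. It is not the actual code: it assumes both sides run PostgreSQL and that the foreign table is defined through postgres_fdw, and every object name in it (`really_large_table`, `created_at`, `recent_records`, `provider_records`, `data_mart.records`) is hypothetical.

```sql
-- On the provider's side (hypothetical names; assumes a timestamp column created_at).
-- A plain view, not a materialized one: the filter runs on their server at query time.
CREATE VIEW recent_records AS
SELECT *
FROM really_large_table
WHERE created_at >= now() - interval '24 hours';

-- On our side, assuming postgres_fdw and an existing foreign table provider_records
-- whose table_name option is already set: repoint it at the view.
ALTER FOREIGN TABLE provider_records
    OPTIONS (SET table_name 'recent_records');

-- The refresh query itself does not change. Even if its WHERE clause is still
-- applied locally, at most 24 hours of rows are fetched from the other side.
SELECT *
FROM provider_records
WHERE created_at > (SELECT max(created_at) FROM data_mart.records);
```

With postgres_fdw, running `EXPLAIN (VERBOSE)` on the refresh query prints the Remote SQL that is sent to the other side, which is one way to check whether a condition is being pushed down or not.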

… and now tell me, which optimizer would be able to perform this kind of optimization?!

Filed under Data management, SQL

Last Day to submit your talk proposals!

Hey fellow Postgres enthusiasts, especially the Chicago bunch! A friendly reminder that today is the deadline for 2Q PGCONF talk submissions. If you haven’t submitted anything yet, there is still time: just 1-2 paragraphs highlighting the main idea of the talk will be sufficient. Let’s show that Chicago CAN!

Filed under events, talks

Why I like so much what I am doing

Many years ago, when I was about to graduate from the University, my not-then-husband asked me what I wanted to do with my professional life: to write “smart” papers about how-everything-should-be, or to do something real? Because it was quite obvious which answer he expected at that time, I answered: of course, the latter!

But speaking seriously, that has been my goal throughout my professional life. Yes, I do write the “smart” papers about how things should work, but all these discoveries are of little interest to me until I can put them to practical use, and until I can prove that what I think is right actually changes things for the better.

I like to say that “a database is a service”. There is nothing in the world of information technology more remote from the end user than the database internals. Our work manifests itself in a not-so-straightforward way. And when the absolutely theoretical approaches I’ve developed actually work in the best possible way, there is nothing more exciting.

The system I am building right now is more than just an app: it is a whole system, which includes interaction between different online services and the data warehouse(s). In it, I am implementing all the ideas that have been important to me for most of my professional career.

I am using the bitemporal model I have talked so much about over the past two or three years, and it is fascinating to see that the things I was hoping would work and have some value to the business actually produce value!

I work with application developers to bypass the ORM and to use the output of database functions for the most efficient communication with the data storage. I have done this many times before, but never before have I experienced this level of cooperativeness.

I am using foreign data wrappers in the most extensive manner, and they literally eliminate the gap between the application databases and the data mart.

Everything I wanted to accomplish in different periods of my professional life is coming together, and I can see that the results are coming out really … how I wanted them to be :). And I can’t allow it to be different.

Filed under Data management

Postgres conference in Chicago Nov 9 2017

It is my pleasure to advertise an event that will be happening in Chicago in November: 2Q PGCONF 2017.

This conference is organized by 2ndQuadrant, and it will be held in two locations: New York on Nov 7-8 and Chicago on Nov 9. Participants can register and/or submit their talks for each of the locations separately.

If you ask me, I think that a one-day conference is a great thing. It’s much more doable than a several-day conference, and your manager is way more likely to agree to you being away from work for one day than for several :). That being said…

– Please consider participation (please register on the web site)

– Please consider submitting a talk (the deadline is Aug 22!)

– Please help us find sponsors!

Filed under events, news