Category Archives: Uncategorized

Protected: 2Q PG Conf Ultimate Optimization Training Slides

This content is password protected. To view it please enter your password below:

Enter your password to view comments.

Filed under Uncategorized

My Ups and Downs at Urbansoft

Today, everybody is blogging “The Horror Coding Stories.” I didn’t plan on that, but then I thought that this is a perfect horror story (reposted from my personal blog).The context: me working at one of the first joint ventures in Russia in 1990s.

Hettie's Reflections

At the end of December, John went back to the US for Christmas. I was still working at that “it’s great!” project and on my makeshift database. And I came up with something cool. Something I was very proud of.

I did most of that work at home because it was time around the holidays. Although I did have a modem, that was before the times you could email a bulk attachment, so usually, I would compress my code with tar command and copy a .tar file to the diskette, and take this diskette to the office.

The next day John should have to be back, and I was anticipating my triumph. At about 9 PM, when kids were already long asleep, I started to make my final .tar file.

Nowadays, even some of the younger IT people might not know what the tar command does, yet along those of…

View original post 271 more words

Leave a comment

Filed under Uncategorized

What I am looking (and not looking) for

Since  I’ve been looking for  database developers and DBAs for quite some time now,  and since virtually everybody knows about this, people often ask me: what are you looking for? What skills and qualifications you are interested at? Who would be your ideal candidate?

Most of the time I reply: please read the job description. I know that all the career advisors tell you “apply even if you do not have some qualifications”, but as for my job postings, I actually need those qualifications which are listed as “required”, and I would really prefer the candidates, who have “is a plus” qualifications.

Also, there are definitely some big DON’Ts, which I wish I would never ever hear again during an interview:

  • when asked for the definition of the foreign key,  starting your answer from “when we need to join two tables”
  • when asked about normalization, starting from “for better performance”
  • when asked about schemas, saying that we use then for storage optimization

Today however, I was asked a different question: why you are saying that you are looking for skilled candidates, and at the same time you admit, that for anybody who will get hired there will be a long learning process? if a candidate does not know something, doesn’t it mean he does not have enough skills? Doesn’t it mean, (s)he is underqualified?

I thought for a while before I’ve responded. When I was first hired as a Postgres DBA, it was a senior position right away, although at that time I did not know any Postgres at all. But people who’ve hired me were confident that not only I can learn fast, but also that I can generalize my existing knowledge and skills and apply it in the new environment.

To build on this example, there are two pre-requisites for success: knowledge and the ability to apply it in the real-life circumstances.

I think, that a person who wants to succeed as a database developer or a DBA should possess a solid knowledge of the relational theory.  But it is not enough to memorize your Ullman or Jennifer Widom, you need to be able to connect this theory to the real-world problems. This is such an obvious thing, that I never thought I will need to write about it, but life proved me wrong :).

Same goes in the situation, when a candidate has a lot of experience with other database, not the one you need. Yes, different database systems may be different, and significantly different. Can somebody who is very skilled  Oracle DBA be qualified for a position of Postgres DBA? Yes, if this person knows how to separate generic knowledge from the systems specifics.

if you know how to read an execution plan in Oracle, and more importantly, why you need to be able to read it, you will have no problem reading execution plans in Postgres. If you used the system tables in Oracle to generate dynamic SQL,  you will know exactly what you want to look for in the Postgres catalog.  And if you know that queries can be optimized, it will help no matter what a specific DBMS is. And it won’t help, if the only thing you know is how to execute utilities.

… No idea, whether this blog post is optimistic or pessimistic, but… we are still hiring 🙂

1 Comment

Filed under SQL, Uncategorized

Don’t forget about Chicago PUG this Tuesday!

Attention fellow chicagoans! Do you want to learn more about PG Open 2017 straight from the participants? Come to Chicago PUG meetup on September 12!

Here is a https://www.meetup.com/Chicago-PostgreSQL-User-Group/events/242081721/link to the meetup – please RSVP there.

Leave a comment

Filed under events, Uncategorized

May PUG with Joe Conway!

I neglected to advertise our May event, and this is going to be indeed the most interesting meetup of 2017! Because just in two days, on May 19 Joe Conway will be speaking at Chicago PUG.

I definitely do not need to advertise him, but I am advertising the fact of his appearance in Chicago, and encourage everybody to attend.

Please RSVP at our Meetup page, and hope to see there.

Leave a comment

Filed under events, People, Uncategorized

Using the S3 .csv FDW for VERY LARGE files

As usual, when I am asked  – Hettie, can we do it? I reply: Sure! Because I never ever assume there is something  I can’t do  :).  But this time I didn’t even get a second thought: there was simply no doubt!

Since I’ve started to build a new Data Warehouse for Braviant in May 2016 I actively used Postrges FDW to integrate heterogeneous data sources. The last one I’ve incorporated was the said csv FDW for Amazon S3.

Speaking about csv files, anybody who ever worked with them knows that they are not as “primitive” as one might think. Postges documentation underlines the fact, that a file which is considered to be the .csv file by other programs might not be recognized as csv by Postgres and vise verse. Thereby when something is going wrong with a csv file… you can never be sure.

This time around the problem I was facing was that the .csv file was huge. How exactly huge – about 60GB. And there was no indication in the Postgres documentation of what precisely is the maximum size of the .csv file which can be mapped throught the FDW. When I’ve talked to people who had some experience of mapping large files, the answer was “the bigger the file, the more complicated it gets”.  To add to the mixture,  there were actually many different .csv files, coming from different sources, of different sizes, and I could not figure out, why some of them are being mapped easily, and some produced errors.  The original comment on “files over 1GB might have problems” didn’t appaer to be relevant, since I had plenty of files 1 GB or more which I was able to map with no problem.

It might look funny that it took me so long to figure out what the actual problem was; what I am trying to explain – I’ve got confused of what to expect, because I was more than sure that “size does not matter”, that it only matters in combination with other factors, like the cleanness of the format… and I wasted the whole week on fruitless and exhausting experiments.

Until one morning when I decided “to start from scratch” and to try starting from the very small files. And when I realized, that the file size is actually the only factor which matters, I found the real limit within 3 hours: the file has to be under 2GB to be successfully mapped!

And here another story starts: what I’ve done to avoid copy-pasting for 37 times.. but this will be a topic of my next post!

 

 

3 Comments

Filed under Uncategorized

ACM Hour of Code Dec 5-11

I am re-posting the ACM newsletter about thee upcoming Hour of code – please consider organizing something in your community!

 

Organize an Hour of Code in Your Community During Computer Science Education Week, December 5-11

Over the past three years, the Hour of Code has introduced over 100 million students in more than 180 countries to computer science. ACM (a partner of Code.org, a coalition of organizations dedicated to expanding participation in computer science) invites you to host an Hour of Code in your community and give students an opportunity to gain the skills needed for creating technology that’s changing the world.

The Hour of Code is a global movement designed to generate excitement in young people. Games, tutorials, and other events are organized by local volunteers from schools, research institutions, and other groups during Computer Science Education Week, December 5-11.

Anyone, anywhere can organize an Hour of Code event, and anyone from ages 4 to 104 can try the one-hour tutorials, which are available in 40 languages. Learn more about how to teach an Hour of Code. Visit the Get Involved page for additional ideas for promoting your event.

Please post activities you are hosting/participating in, pass along this information, and encourage others to post their activities. Tweet about it at #HourOfCode.

Leave a comment

Filed under events, news, Uncategorized