Category Archives: People

Chicago July PUG with Bruce Momjian

I have just one thing to say – it was a great meetup! I was worried that the event was not scheduled for the best day: it’s the end of July, when people are either heading off on vacation or at least trying to spend as much time outdoors as possible. Besides, for some reason nobody likes to meet on Thursdays.

In spite of all of the above, the attendance was great, and the audience was really engaged. Do I need to mention that Bruce is the greatest speaker in the Postgres world?!

The presentation was brilliant: over an hour just flew by, and nobody even asked whether we could have a break 🙂


And now I really hope this won’t be the last time Bruce visits Chicago and talks at PUG!

Filed under events, People

The joy of team work!

Last week I had to work a lot! I mean, A LOT! But work-wise that was one of the happiest weeks, and I wanted to share with everybody why it felt so happy.

Those of you who have worked with me before know that my favorite activity is working with applications, because optimizing for applications is way more difficult and way more fun than optimizing reports. Yes, you can impress everybody, including yourself, by reducing a report’s run time from one hour to one minute. But how much cooler is it to reduce the page load time from 30 sec to 0.1 sec?! Especially when you have the power not only to write the best queries ever, but also to design the data model “the right way”.

When you do application database work, the most critical part is working in close contact with the application developers. And depending on what kind of team you are lucky or unlucky enough to have, it may be the best or the worst part of your professional experience.

My IT team here at Braviant is one of the best I have ever worked with, which was proven one more time last week. The most difficult part has always been connecting the db work and the app work, like: I’ve selected all this data for you, can you read it from the output I am providing? Or: we can give you all the input parameters that way, can you process them? Our app developers had already made a huge step “in my direction” by agreeing not to use an ORM, but to read the output of the database function. Next step – we hit the wall exactly where I expected. I spent half of Saturday writing my code so that the app developers could start using it on Monday morning… and now they were saying they couldn’t correctly process the embedded record sets! I’ve heard that many times before, and each time, within a couple of hours, I would hear: Hettie, there is no way! Let’s do it “the old way”, we know how… This time, however, my team kept trying to find a solution, and watching these efforts made me start thinking about how I could change the output on my side. After several iterations of going back and forth, we came up with a pretty neat way to return the records, which could be used right away, and an even better approach, which, however, would require more work from me and could not be done on the spot.

And you know, I totally understand when people hate rewriting one piece of code multiple times, which makes me appreciate even more the willingness to rewrite later, when I come up with a more automated solution on my side…

I have already written a lot in this post, and I am not sure whether it all makes sense, but let me try to summarize. I loved that everybody was willing to compromise and to make adjustments, that there were no “just because” statements, and that the whole team was focused on the goal of building the application right from the very beginning, so that we won’t need to worry about performance six months down the road.

Hope it will continue that way. Except for the part about me working on Saturdays 🙂

 

Filed under People, Systems, Team and teamwork

Please join us this Thursday for a very special meetup!

Attention Chicago Postgres users, developers, DBAs and everybody who knows what the word “Postgres” means! In the unlikely event that you have not heard about it already – this day is coming! Bruce Momjian will be our guest at the July meeting of the Chicago PostgreSQL User Group, and I do not think I need to say anything else! We are just excited that this is finally happening!

Please RSVP if you are planning to come and haven’t done so yet – we have a new person at the building reception, and I need to give her a guest list! Also, just for this meetup we will extend the time till 9 PM, so that everybody can enjoy the conversation.

Hope to see you there!

Filed under events, People, talks

Chicago PUG meetup with Joe Conway

Yesterday was a Day – a day when Joe Conway presented at Chicago PUG. He was talking about the PL/R extension for Postgres, which is really important for our data analysts.

We had a full house:

And everybody was listening to the great presentation:

Filed under events, People, SQL, talks

May PUG with Joe Conway!

I neglected to advertise our May event, and this is indeed going to be the most interesting meetup of 2017! Because in just two days, on May 19, Joe Conway will be speaking at Chicago PUG.

I definitely do not need to advertise him, but I am advertising the fact of his appearance in Chicago, and I encourage everybody to attend.

Please RSVP at our Meetup page, and hope to see you there.

Filed under events, People, Uncategorized

The Data science education panel at ICDE 2017

In order to keep my own promise to tell more about what was happening at ICDE 2017, I am going to write about the panel on data science education. The panel was called “Data Science Education: We’re Missing the Boat, Again”, and I’d say it was probably the most interesting panel I’ve ever attended! By the time the panel was about to start, there was a huge crowd, and people were encouraged to take the dozen or so remaining seats in the first and second rows (do I need to mention that I was at the front five minutes before the panel started?)

The topic of the panel, described in my own words, was the following. Data science is a buzzword, students want to be taught “data science”, and there is a common belief that data science is about machine learning and statistical modeling, while in reality 80% of data scientists’ time is spent on data pre-processing, cleansing, etc.

The panelists were given the questions which I am copying below.

• If data scientists are spending 80% of their time grappling with data, what are they doing wrong? What are we doing wrong? What can we teach them to reduce this cost?
• What should a practicing data scientist learn about systems engineering? What’s the difference between a data engineer and a data scientist?
• Scale is at the heart of what we do, and it’s a daily source of friction for data scientists. How can we teach fundamental principles of scalability (randomized algorithms, for example) in the context of data systems?
• Perhaps data scientists are just consumers of our technology — how much do they really need to know about how things work? Empirically, it appears to be more than we think. There is a black art to making our systems sing and dance at scale, even though we like to pretend everything happens automatically. How can we stop pretending and start teaching the black art in a principled way?
• How can we address emerging issues in reproducibility, provenance, curation in a principled yet practical way as a core part of data engineering and data systems? Consider that the ML community has a vibrant workshop on fairness, accountability, and transparency. These topics are at least as relevant from a database perspective as they are from an ML perspective, maybe more so. Can we incorporate these issues into what we teach?
• How much math do we need to teach in our database-oriented data science courses? How can we expose the underlying rigor while remaining practical for people seeking professional degrees?

Bill Howe from UW was the moderator and the first panelist to give his talk.

The second one was Jeff Ullman, and with that I have nothing more to say :)

Actually, I really liked the fact that he mentioned that math courses, linear algebra and calculus, should be included in the database curriculum. I have always said that nobody without Calc BC should be allowed anywhere near any database.

The next panelist was Laura Haas, and again – what else do I need to say, except that I enjoyed each and every moment of her presentation?

One thing from her presentation which I find really important is that data science is not a part of computer science, and not a part of database management. As Laura put it, “we provide the tools”, but that does not mean “we” should teach DS as a part of CS.

The next panelist was Mike Franklin from UC, and I hope this picture is clear enough for you to see the funny example of DS he is showing.

And the last one was the very controversial Tim Kraska from Brown, who started by saying that he was going to disagree with all the rest of the panelists – and he did.

To be honest, it’s very difficult to write about this panel, because each of you can google all these great people, but you would need to see a video recording of this panel to really feel how interesting and how much fun it was.

After the panel I talked to several conference participants who, like me, are from industry, and asked them what they are looking for when hiring recent grads. And literally everybody said the same thing that I was thinking: they hire smart people with a solid basic education, people who can solve problems, “and we will teach them all the rest”. I couldn’t agree more!

Paradoxically, students think it’s cool to have something about “Data science” in their curriculum, and they often think it will make them more marketable, but real future employers do not care that much!

Filed under Data management, events, People, publications and discussions, talks

ICDE 2017 – Laura Haas’ keynote talk

I missed the first keynote of the conference, but there was no way I could miss the second one – Laura Haas’ “Leveraging data and people to accelerate data science”.

Here is a reference for the Accelerated Discovery Lab at IBM, where you can find lots of information about different projects. The keynote talk highlighted the project related to food contamination.

Below are several pictures from the presentation.

Filed under events, People, talks