PG Open 2017

It will only happen in September, but I wanted to give all my friends advance notice that I will be there! My talk was accepted, and I’ve confirmed that I will be able to present, so it’s all official.

Please visit the PG Open website, and if you are in the area and want to meet – I am definitely staying for Saturday after the conference!

Not exactly my Alma Mater but still!

The champions are from my hometown, and my actual Alma Mater is among the winners!

ACM Bulletin Archives
May 25, 2017
Russian Team Takes World Champion Title in ACM ICPC Programming Contest

Three students from St. Petersburg University of IT, Mechanics and Optics (ITMO) earned the title of 2017 World Champions in the ACM International Collegiate Programming Contest (ICPC). Teams from University of Warsaw, Seoul National University and St. Petersburg State University finished the competition in second, third and fourth places and were recognized with gold medals in the prestigious competition, which ended today in Rapid City, South Dakota.

ACM-ICPC is the premier global programming competition conducted by and for the world’s universities. It is conceived, operated and shepherded by ACM, sponsored by IBM, and headquartered at Baylor University. For more than four decades, the competition has raised the aspirations and performance of generations of the world’s problem solvers in computing sciences and engineering.

At ICPC, teams of three students tackle eight or more complex, real-world problems. The students are given a problem statement, and must create a solution within a looming five-hour time limit. The team that solves the most problems in the fewest attempts in the least cumulative time is declared the winner, with the top 12 teams receiving medals.

ICPC Regional participation included 46,381 students and faculty in computing disciplines from 2,948 universities in 103 countries on six continents. A record 50,145 students and 5,073 coaches competed in ICPC and ICPC-assisted competitions this year.

“As computing increasingly becomes part of the daily routines of a growing percentage of the global population, the solution to many of tomorrow’s challenges will be written with computing code,” said ACM President Vicki L. Hanson. “The ICPC serves as a unique forum for tomorrow’s computing professionals to showcase their skills, learn new proficiencies and to work together to solve many real-world problems. This international event fosters the innovative spirit that continues to transform our world.”

Full results of the competition are available here.

Read the news release.

How women’s networking should NOT be organized

There was one small episode during ICDE 2017, and although it has been a month already, I still feel like I want to write about it. Here is what happened.

Among the booths of different vendors there was (as usual) Amazon AWS. One of their reps told me that on Thursday they were going to have a “women event”, and asked whether I wanted to sign up and whether I could just leave my email with them. I asked her: well, there is a conference banquet on Thursday, so at what time precisely is your event going to be? And she said reassuringly: after the banquet!

Now, the banquet was to start at 6 PM, and on Wednesday evening I received the following email:

Hi Hettie,
I wanted to reach out on behalf of AWS and invite you to attend the AWS Women in Engineering Networking Event tomorrow on Thursday, April 20. Our recruitment and engineering teams are coming down from Seattle for the ICDE Conference and we’d love to meet you in-person at our happy hour at Blue Door Winery in San Diego (around 3 miles from the conference venue).
There will be wine tasting, artisanal bites, and a raffle on-site. Please feel free to bring guests, the more the merrier!

I clicked on the invite, and guess what start time it showed? Yes, you are right – 6 PM.

Let me tell you this. The banquet is the most important social event at any conference, and I always make a point of telling the younger generation how important it is to attend a conference banquet. There you can be introduced, or just introduce yourself, to anybody, and you can talk at length with the authors of the papers which were most interesting to you. People are simply more relaxed and are not running off to attend the next session. And if somebody organizes a “women networking event” at the same time, how should this be perceived? Like a “kids’ table”?! How much would this kind of networking be worth? And if the event organizers didn’t bother to look at the conference program when scheduling this event, it’s even worse…

Fortunately, at least at first glance, there were not that many women who would trade the banquet for this networking event 🙂

Chicago PUG meetup with Joe Conway

Yesterday was a Day – the day when Joe Conway presented at Chicago PUG. He talked about the PL/R extension for Postgres, which is really important for our data analysts.

We had a full house:

And everybody was listening to the great presentation:

May PUG with Joe Conway!

I neglected to advertise our May event, and this is indeed going to be the most interesting meetup of 2017! In just two days, on May 19, Joe Conway will be speaking at Chicago PUG.

I definitely do not need to advertise him, but I am advertising the fact of his appearance in Chicago, and I encourage everybody to attend.

Please RSVP at our Meetup page, and I hope to see you there.

The Data science education panel at ICDE 2017

To keep my promise to tell more about what was happening at ICDE 2017, I am going to write about the panel on data science education. The panel was called “Data Science Education: We’re Missing the Boat, Again”, and I’d say it was probably the most interesting panel I’ve ever attended! By the time the panel was about to start, there was a huge crowd, and people were encouraged to take the dozen or so remaining seats in the first and second rows (do I need to mention that I was at the front five minutes before the panel started?)

The topic of the panel, described in my own words, was the following. Data science is a buzzword, students want to be taught “data science”, and there is a common belief that data science is about machine learning and statistical modeling, while in reality 80% of a data scientist’s time is spent on data pre-processing, cleansing, etc.

The panelists were given the questions which I am copying below.

• If data scientists are spending 80% of their time grappling with data, what are they doing wrong? What are we doing wrong? What can we teach them to reduce this cost?
• What should a practicing data scientist learn about systems engineering? What’s the difference between a data engineer and a data scientist?
• Scale is at the heart of what we do, and it’s a daily source of friction for data scientists. How can we teach fundamental principles of scalability (randomized algorithms, for example) in the context of data systems?
• Perhaps data scientists are just consumers of our technology – how much do they really need to know about how things work? Empirically, it appears to be more than we think. There is a black art to making our systems sing and dance at scale, even though we like to pretend everything happens automatically. How can we stop pretending and start teaching the black art in a principled way?
• How can we address emerging issues in reproducibility, provenance, curation in a principled yet practical way as a core part of data engineering and data systems? Consider that the ML community has a vibrant workshop on fairness, accountability, and transparency. These topics are at least as relevant from a database perspective as they are from an ML perspective, maybe more so. Can we incorporate these issues into what we teach?
• How much math do we need to teach in our database-oriented data science courses? How can we expose the underlying rigor while remaining practical for people seeking professional degrees?

Bill Howe from UW was the moderator and the first panelist to give his talk.

The second one was Jeff Ullman, and with that I have nothing more to say :)

Actually, I really liked that he mentioned that math courses, linear algebra and calculus, should be included in the database curriculum. I have always said that nobody without Calc BC should be allowed anywhere near any database.

The next panelist was Laura Haas, and again – what else do I need to say, except that I enjoyed each and every moment of her presentation?

One thing from her presentation which I find really important is that data science is not a part of computer science, and not a part of database management. As Laura put it, “we provide the tools”, but that does not mean that “we” should teach DS as a part of CS.

The next panelist was Mike Franklin from UC, and I hope this picture is clear enough for you to see the funny example of DS he is showing.

And the last one was the very controversial Tim Kraska from Brown, who started by saying that he was going to disagree with all the rest of the panelists – and he did.

To be honest, it’s very difficult to write about this panel, because each of you can google all these great people, but you would need to see a video recording of this panel to really feel how interesting it was, and how much fun.

After the panel I talked to several conference participants who, like me, are from industry, and asked them what they are looking for when hiring recent grads. Literally everybody said the same thing that I was thinking: they hire smart people with a solid basic education, people who can solve problems, “and we will teach them all the rest”. I couldn’t agree more!

Paradoxically, students think it’s cool to have something about “Data science” in their curriculum, and they often think it will make them more marketable, but real future employers do not care that much!

ICDE 2017 – Laura Haas’ keynote talk

I missed the first keynote of the conference, but there was no way I could miss the second one – Laura Haas’ “Leveraging data and people to accelerate data science”.

Here is a reference for the Accelerated Discovery Lab at IBM, where you can find lots of information about different projects. The keynote talk highlighted a project related to food contamination.

Below are several pictures from the presentation.

