Monthly Archives: April 2017

The data science education panel at ICDE 2017

To keep my promise to tell more about what happened at ICDE 2017, I am going to write about the panel on data science education. The panel was called “Data Science Education: We’re Missing the Boat, Again”, and I’d say it was probably the most interesting panel I’ve ever attended! By the time the panel was about to start, there was a huge crowd, and people were encouraged to take the dozen or so remaining seats in the first and second rows (do I need to mention that I was at the front five minutes before the panel started?).

The topic of the panel, described in my own words, was the following. Data science is a buzzword, students want to be taught “data science”, and there is a common belief that data science is about machine learning and statistical modeling, while in reality 80% of a data scientist’s time is spent on data pre-processing, cleansing, etc.

The panelists were given the questions that I am copying below.

• If data scientists are spending 80% of their time grappling with data, what are they doing wrong? What are we doing wrong? What can we teach them to reduce this cost?
• What should a practicing data scientist learn about systems engineering? What’s the difference between a data engineer and a data scientist?
• Scale is at the heart of what we do, and it’s a daily source of friction for data scientists. How can we teach fundamental principles of scalability (randomized algorithms, for example) in the context of data systems?
• Perhaps data scientists are just consumers of our technology – how much do they really need to know about how things work? Empirically, it appears to be more than we think. There is a black art to making our systems sing and dance at scale, even though we like to pretend everything happens automatically. How can we stop pretending and start teaching the black art in a principled way?
• How can we address emerging issues in reproducibility, provenance, curation in a principled yet practical way as a core part of data engineering and data systems? Consider that the ML community has a vibrant workshop on fairness, accountability, and transparency. These topics are at least as relevant from a database perspective as they are from an ML perspective, maybe more so. Can we incorporate these issues into what we teach?
• How much math do we need to teach in our database-oriented data science courses? How can we expose the underlying rigor while remaining practical for people seeking professional degrees?

Bill Howe from UW was the moderator and the first panelist to give his talk.

The second one was Jeff Ullman, and with that I have nothing more to say :)

Actually, I really liked that he mentioned that math courses, linear algebra and calculus, should be included in the database curriculum. I have always said that nobody without Calc BC should be allowed anywhere near a database.

The next panelist was Laura Haas, and again – what else do I need to say, except that I enjoyed each and every moment of her presentation?

One thing from her presentation that I find really important is that data science is not a part of computer science, and not a part of database management. As Laura put it, “we provide the tools”, but that does not mean that “we” should teach DS as a part of CS.

The next panelist was Mike Franklin from UC, and I hope this picture is clear enough for you to see the funny example of DS he is showing.

And the last one was the very controversial Tim Kraska from Brown, who started by saying that he was going to disagree with all the other panelists – and he did.

To be honest, it’s very difficult to write about this panel, because each of you can google all these great people, but you would need to see a video recording of the panel to really feel how interesting it was, and how much fun.

After the panel I talked to several conference participants who, like me, are from industry, and asked them what they look for when hiring recent grads. Literally everybody said the same thing I was thinking: they hire smart people with a solid basic education, people who can solve problems, “and we will teach them all the rest”. I couldn’t agree more!

Paradoxically, students think it’s cool to have something about “data science” in their curriculum, and they often think it will make them more marketable, but real future employers do not care that much!


Filed under Data management, events, People, publications and discussions, talks

ICDE 2017 – Laura Haas’ keynote talk

I missed the first keynote of the conference, but there was no way I could miss the second one – Laura Haas’ “Leveraging data and people to accelerate data science”.

Here is a reference for the Accelerated Discovery Lab at IBM, where you can find lots of information about different projects. The keynote talk highlighted a project related to food contamination.

Below are several pictures from the presentation.



Filed under events, People, talks

ICDE 2017 – Day 3

This will again be more of a note to myself to write in more detail about what I’ve learned at ICDE 2017.

I didn’t stay for the whole of Day 3, but I made sure to pay for TSA PreCheck and to take advantage of the conference venue being so close to the airport. The main events of Day 3 were:

  • The keynote by Pavel Pevzner about the “New revolution” in online education. I can’t say I liked it, because I disagree with a lot of what was said, but it was something that would make you think.
  • The Industry 2 session, which was, to be honest, less interesting than Industry 1, although quite educational. The last presentation made me think again that the way we use the FDW for populating our Data Mart is not conventional, and probably should be publicized more.

During the conference people were asking me what my company is doing, and I realized that our data modeling and predictive analytics (which I do not know much about) were of the most interest. Also, I am always saying that “we do not have any big data”, but now, having seen what other people consider to be “big data”, I am starting to think that maybe we do :).

Overall I am very excited about what I’ve learned and about the people I’ve met, and I want to reinvent my life again, and to do all those great things… and to submit a paper to ICDE 2018, of course :).


Filed under events, People, research

ICDE 2017 in San Diego – what was happening on Day 2

Again very briefly:

  • An absolutely brilliant keynote by Laura Haas from IBM. The talk was about Big Data, how to make sense of it, and, what was most interesting for me, about the human factor in dealing with Big Data.
  • The IEEE Awards presentations. I especially liked the one by Susan Davidson from the University of Pennsylvania about data citation (I will definitely write more about this presentation).
  • The panel on Data Science education – probably the most interesting panel I have ever attended 🙂
  • The next ICDE was announced, and it’s going to be in Paris, which is super cool, but the deadline is way earlier than I expected… so it will be a challenge…

One more keynote, one more Industry session, and I will be on my way back to Chicago.


Filed under events, news

ICDE 2017 in San Diego – Day 2

I met my favorite textbook on database theory!


Filed under events, People

ICDE 2017 in San Diego – Day 1

I am in San Diego now, attending ICDE 2017. As always, I will write in more detail about the most interesting presentations later, but for now I just want to say that both the Demo and Industry sessions I attended yesterday were really great.

Here is the link for Demo 1 and here is the link for the Industry 1 session. I can’t even say which one I liked more – they were all incredibly interesting. Karthik, I know you will be reading this post today :), so here is something for you. I told you earlier that I like a lot how this paper was written: a very clear description of the problem and your solution, making it interesting even for those who hear about it for the first time. Now I want to tell you that the presentation was also great: very clear and articulate, not trying to squeeze in more information than can fit into 15 minutes, but at the same time highlighting all the important points. Just perfect! (Although I am still upset you are not here :)). So here is a picture for you to be jealous – that’s where we have lunches!


Filed under events, research

My first year at Braviant

Yesterday was my first anniversary at Braviant Holdings. It was not an easy year. There was a lot of hard work, and lots of days and nights when I was not sure whether I would be able to accomplish what I wanted and what I believed needed to be accomplished.

But now, when I look back and think about what has happened over this year, I can only say: Wow!

Building the new Data Mart from scratch, completely replacing the new system, using new techniques for combining multiple external sources. Keeping up with new challenges. Helping to build a new framework for our data analytics. Starting as the sixth employee in the company and the only tech person, and now being part of a tech team within a company which quadrupled its size in one year.

And most importantly – delivering high-quality database solutions. The thing which makes me really happy is that through this whole year I never had to compromise my technical values, and that I was given the freedom and responsibility to do what I believe is right, and to be accountable for the results. It’s the best thing one can imagine – to see how your work makes your company perform better – every day.

And last but not least – always feeling good being around my co-workers, who are smart, intelligent, helpful, compassionate and dedicated. I love the culture of continuous learning which exists at our workplace, and I love the fact that everybody wants to know what other groups are doing, and how their work impacts others. Granted, it’s much easier to accomplish when the company is small, but I really hope we’ll continue this way.


Filed under Companies, People, Team and teamwork