Speaker Interview

Tomas Vondra

CREATE STATISTICS - what is it about?
15:55–16:40
Auditorium

Could you briefly introduce yourself?: My name is Tomas Vondra, I work for 2ndQuadrant, I live in Prague and I’m a PostgreSQL engineer and developer, long-term contributor, and now also a committer. I focus mainly on topics related to planning and performance, but I have a lot of side quests too.
How do you engage with the PostgreSQL community?: In a lot of ways. For me, the diverse, active and growing community is what makes a difference compared to proprietary products, so I try to help with building it. Aside from the obvious development-related stuff (writing patches, discussing them on a mailing list or in person, maybe even committing some of them), I attend conferences and meetups, and give talks and trainings. For the last 10 years or so I have co-organized the Prague conference, and I’m the president of the local PUG.
Have you enjoyed previous pgDay Paris or other PostgreSQL Europe conferences, either as attendee or as speaker?: I’ve not attended pgDay Paris before (although I was at the 2009 European PostgreSQL conference, which was in Paris), but I’m sure it’ll be great. I’ve attended a number of PostgreSQL conferences, both as a speaker and as an attendee, and I think I’ve enjoyed almost all of them.
What will your talk be about, exactly? Why this topic?: My talk will be about CREATE STATISTICS, a feature initially introduced in PostgreSQL 10, which gives the optimizer better information about correlation between columns. The ultimate goal is producing better query plans and increasing the performance for users, which I think is pretty valuable. It’s also an interesting research topic.
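Editor's illustration (not taken from the talk): the kind of misestimate that correlated columns can cause is easy to reproduce with a toy calculation. The sketch below assumes a hypothetical table of 1000 rows in which the zip code fully determines the city, and compares the row count obtained by multiplying per-column selectivities (treating the columns as independent) with the true count. The table, column names and values are made up for the example.

```python
# Toy table of (city, zip) rows; both columns are hypothetical.
# Each zip code maps to exactly one city, so the columns are correlated.
rows = [(f"city_{z // 10}", z) for z in range(100) for _ in range(10)]  # 1000 rows


def selectivity(predicate):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if predicate(r)) / len(rows)


# Per-column selectivities, as single-column statistics would describe them.
sel_city = selectivity(lambda r: r[0] == "city_4")
sel_zip = selectivity(lambda r: r[1] == 42)

# Estimate under the independence assumption: multiply the selectivities.
independent_estimate = sel_city * sel_zip * len(rows)

# True number of rows matching both conditions.
actual = sum(1 for r in rows if r[0] == "city_4" and r[1] == 42)

print(f"estimate assuming independence: {independent_estimate:.1f} rows")
print(f"actual matching rows: {actual}")
```

In this toy data the independence assumption underestimates the result by a factor of ten. In PostgreSQL 10 and later, a statement of the form CREATE STATISTICS s (dependencies) ON city, zip FROM addresses, followed by ANALYZE, gives the planner information about such a dependency; the statistics name s and the table addresses are placeholders here.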
What is the audience for your talk?: I think it’s aimed at about the same audience as topics about indexing, for example — a mix of developers (those building applications on top of PostgreSQL) and DBAs. Designing the indexes is part of the development, so the developers need to understand how indexes work and when to use them. And DBAs may use them to solve unexpected problems. It’s about the same for CREATE STATISTICS. So if you’re a developer, or a DBA tasked with monitoring a system for issues, this might be useful for you.
What existing knowledge should the attendee have in order to follow your talk?: I tried to build the talk to make it accessible for people without a lot of prior knowledge. Basic knowledge of probability theory, understanding what EXPLAIN ANALYZE does and experience with tuning slow queries should be enough to understand the talk.
Which missing feature would you most like to see in PostgreSQL?: That’s a hard question, and I’m not sure there’s one single feature I can name. But considering my talk is about statistics used by the optimizer, I’d like to see improvements in this area: better join estimates, better estimates for expressions, and also using information from past executions (aka adaptive estimation).
Thank you!