Back in 2017, Dimitri Fontaine asked me about an interview for his upcoming book. Now there is a second edition, “The Art of PostgreSQL” (affiliate link). We thought take this opportunity to publish this interview here.

An Interview with Markus Winand

Markus Winand is the author of the very famous book “SQL Performance explained” and he also provides both http://use-the-index-luke.com and http://modern-sql.com. Markus is a master of the SQL standard and he is a wizard in terms of how to use SQL to enable fast application delivery and solid run-time performances!

Developers often say that SQL is hard to master. Do you agree? What would be your recommendations for them to improve their SQL skills?

I think the reason many people find SQL hard to learn is that it is a declarative programming language.

Most people first learn imperative programming: they put a number of instructions into a particular order so that their execution delivers the desired result. An SQL statement is different because it simply defines the result. This becomes most obvious in the select clause, which literally defines the columns of the result. Most of the other main clauses describe which rows should be present in the result. It is important to understand that the author of an SQL statement does not instruct the database how to run the query. That’s up to the database to figure out.

So I think the most important step in mastering SQL is to stop thinking in imperative terms. One recurring example I’ve seen in the field is how people imagine that joins work and more specifically, which indexes can help in improving join performance. People constantly try to apply their knowledge about algorithms to SQL statements, without knowing which algorithm the database actually uses. This causes a lot of problems, confusion and frustration.

First, always focus on writing a clear statement to describe each column and row of the desired result. If needed, you can take care of performance afterwards. This however, requires some understanding of database internals.

What would you say is the ideal SQL wizardry level a developer should reach to be able to do their job correctly?

Knowing everything would be best, I guess ;)

In reality, hardly any programmer is just an SQL programmer. Most are Java, C#, PHP, or whatever programmers who — more or less frequently — use SQL to interact with a database. Obviously, not all of them need to be SQL experts.

Today’s programming often boils down to choosing the right tool for each problem. To do this job correctly, as you properly phrased it, programmers should at least know what their SQL database could do. Once you remember that SQL can do aggregations without group by—e.g. for running totals, moving averages, etc.—it’s easy to search the Internet for the syntax. So I’d say every programmer (and even more so architects) should have a good overview of what SQL can do nowadays in order to recognize situations in which SQL offers the best solution.

Quite often, a few lines of SQL can replace dozens of lines of an imperative program. Most of the time, the SQL solution is more correct and even faster. In the vein of an old saying about shell scripts, I’d say: “Watch out or I’ll replace a day’s worth of your imperative programming with a very small SQL statement”.

You know the detailed behavior of many different RDBMS engines and you are used to working with them. Would you write portable SQL code in applications or pick one engine and then use it to its full capacity, writing tailored SQL (both schema and queries)?

I first aim to use standard SQL. This is just because I know standard SQL best and I believe that the semantics of standard SQL have the most rigid definitions. That means standard SQL defines a meaningful behavior, even for the most obscure corner cases. Vendor extensions have a tendency to focus on the main cases. For corner cases, they might behave in surprising and inconsistent ways — just because nobody thought about that during specification.

Sometimes, I cannot solve a problem with standard SQL — at least not in a sufficiently elegant and efficient way. That is more often because the database at hand doesn’t support the standard features that I’d like to use for this problem. However, sometimes the standard just doesn’t provide the required functionality. In either case I’m also happy to use a vendor extension. For me, this is really just my personal order of preference for solving a problem — it is not a limitation in any way.

When it comes to the benefits of writing portable SQL, there seems to be a common misconception in the field. Quite often, people argue that they don’t need portability because they will never use another database. And I actually agree with that argument in the sense that aiming for full portability does not make any sense if you don’t need to run the software on different database right now.

On the other hand, I believe that portability is not only about the code — it is also about the people. I’d say it is even more about the people. If you use standard SQL by default and only revert to proprietary syntax if needed, the SQL statements will be easier for other people to understand, especially people used to another database. On the scale of the whole industry it means that bringing new personnel on board involves less friction. Even from the personal viewpoint of a single developer, it has a big benefit: if you are used to writing standard SQL then the chances increase that you can write SQL that works on many databases. This makes you more valuable in the job market.

However, there is one big exception and that’s DDL – i.e. create statements. For DDL, I don’t even aim for portability in the first place. This is pointless and too restricting. If you need to create tables, views, indexes, and the like for different databases, it is better to just maintain a separate schema definition for each of them.

On my Own Behalf I make my living from training, other SQL related services and selling my book. Learn more at https://winand.at/.

How do you see PostgreSQL in the RDBMS offering?

PostgreSQL is in a very strong position. I keep on saying that from a developer’s perspective, PostgreSQL’s feature set is closer to that of a commercial database than to that of the open-source competitors such as MySQL/MariaDB.

I particularly like the rich standard SQL support PostgreSQL has: that means simple things like the fully featured values clause, but also with [recursive] , over , lateral and arrays.

Affiliate Links

When buying Dimitris book through links in this article, I’ll get a revenue share from Dimitri. Here is such a link.