Some arguments against small syntax extensions in GHC

January 22, 2020 - Tagged as: en, haskell, ghc.

I recently realized that I haven’t published a single post in 2019. I think that’s the longest break I’ve ever taken from blogging, and it motivated me to publish some of the draft posts that I’ve been keeping in private GitHub gists.

This post was originally written on 11 January 2019. Because it is more of an angry rant than a constructive piece, I wasn’t sure at the time that publishing it was a good idea. However, reading it again now, I see that it’s not directed at a person, a group, or a specific proposal/patch, so I think it shouldn’t be offensive to anyone and I should be able to publish it on my personal blog.

(original post starts below)

So I woke up at 5AM today and felt like writing about one of my frustrations. These are my personal opinions, and I don’t represent GHC HQ here.

At this point adding new syntax to GHC/Haskell is a bad idea. Before moving on to examples, here are some facts:

The language that GHC supports is incredibly complex. The GHC 8.6.3 man page lists 115 language pragmas.

You just can’t have a good understanding of all of these features, let alone know how a proposed syntax interacts with every combination of them.

GHC is a complex and old compiler with parts that today no active contributor knows well. The compiler (ignoring all the libraries, the RTS, tools etc.) currently has 189,699 lines of code (ignoring comments and whitespace). That’s a lot of complexity to deal with.

When you propose a new syntax, what you’re actually proposing is:

- At least one more pragma
- More user manual sections
- MVP implementation of your syntax (which is usually not bug-free)
- A few common-case tests (which are usually not enough)
- More headache for tool developers
- Scaring more potential new Haskell developers away
- Adding to the frustration of existing Haskell developers
- Adding maintenance burden to GHC devs

Because you can’t predict all the interactions of your new syntax (conceptually, or in the implementation) your syntax will cause a ton of problems.

Those problems will sit there unfixed for months/years.

GHC maintainers barely have enough time and manpower to provide stable releases. 8.6.1 and 8.6.2 are completely broken (#15544, #15696, #15892), and 8.6.3 doesn’t work well on Windows.

You might not accept some of these; however, in my experience these are facts. If you disagree with any of them, let me know and I can elaborate.

I have only two examples for now: because I don’t normally work on the front-end parts of the compiler, I don’t notice most of the problems.

Example 1: Tiny addition to GHCi syntax

#7253 proposed a tiny new syntax in GHCi. A few years later a new contributor picked it up and submitted a patch. This trivial new syntax later caused #11606, #12091, and #15721. That’s 3 too many tickets for a trivial syntax that buys us so little. It also generated at least one SO question, and invalidated an answer to another SO question by making things more complicated.

The implementation was finally fixed by a frustrated maintainer, but the additional complexity it added (both in the implementation, and in the GHCi syntax that has to be explained to users) won’t go away.

Example 2: -XBlockArguments

This was proposed as a GHC proposal. It’s a trivial syntax change that in the best case can save 3 characters (including spaces). So far it has generated two tickets: #16137 and #16097. Even worse than in the previous example, neither of these tickets mentions -XBlockArguments; they don’t even use it! Yet the error messages got significantly worse because of it.

Just to be clear

I think some of the extensions are quite useful. However, I also think that at this point new syntax extensions are doing more harm than good. The problems from a maintainer’s point of view are listed above (arguably maintainers’ problems are also users’ problems because they lead to a poor product, but let’s ignore this aspect for now). Now I want to add one more problem, this time from a software developer/engineer’s point of view:

Adding a different way of doing things, especially when the difference is so small, does more harm than good.

Here’s why. Now that we have two ways of using do syntax:

    -- (1)
    atomically $ do
      ...

    -- (2) with -XBlockArguments
    atomically do
      ...

with my team I have to do one of these:

(1) Decide which one to use, and somehow manually make sure to use it consistently (this can’t be done automatically as we lack the tooling)
(2) Let everyone use whatever they want.

(1) means wasting the team’s time and energy on endless bikeshedding. (2) means being inconsistent in the source code. Either way we lose.
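To make concrete how small the difference is, here is a minimal sketch of the same function written both ways. It is only an illustration: the names sumClassic and sumBlockArg are made up for this example, and runST from base stands in for atomically so that the snippet compiles without extra packages (BlockArguments needs GHC 8.6 or later).

```haskell
{-# LANGUAGE BlockArguments #-}
module Main where

import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- (1) Classic spelling: `$` separates the function from the do block.
-- (sumClassic is a hypothetical name used only for this illustration.)
sumClassic :: [Int] -> Int
sumClassic xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- (2) With -XBlockArguments: the do block is passed directly as an argument.
-- (sumBlockArg is likewise a hypothetical name.)
sumBlockArg :: [Int] -> Int
sumBlockArg xs = runST do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- Both spellings denote the same program; only the surface syntax differs.
main :: IO ()
main = print (sumClassic [1, 2, 3] == sumBlockArg [1, 2, 3])
```

The two definitions differ only in the two characters "$ ", yet a code base that mixes them now has two dialects to read, review, and format.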

You might argue that with good tooling (1) is not a problem, and I’d agree. However as we add new syntax the tooling story will only get worse. GHC Haskell syntax is already so complex we don’t even have a good formatter. We should first stop making it even more complex if we want the tooling story to get better.

What we need

In my opinion what we need is principles to guide the language and the compiler. Currently we don’t have this (last paragraph), and the result is 100+ pragmas, a buggy compiler, and frustrated users and maintainers.

My advice to users

If you’re proposing a new syntax: don’t! If you know someone who will, point them to this blog post.