I came across a Haskell question which highlighted a common confusion. I'm going to explain how to get around that! The person who asked the question wants to create some sort of command-line interface in Haskell, and was wondering how to structure their code. In particular, they were curious about how to represent, in the code, the commands the interface should support.

Static Set of Commands: Sum Type

Let's get the obvious out of the way first. You could create a sum type along the lines of

data Command = Echo String | Sort String | Help

which makes it easy to write functions that parse and execute commands. A parse function would take two strings (command name and arguments) and return a value of this Command type, depending on which command name was written.

parse :: String -> String -> Command parse cmdName args = case cmdName of "echo" -> Echo args "sort" -> Sort args "help" -> Help

The exec function works essentially in the reverse. For the sake of this toy example, commands are going to result in simple String s; we could imagine that they instead return an IO Result which is a custom type that is allowed to have side effects. The exec function would then return a String when given a command. Like so:

exec :: Command -> String exec cmd = case cmd of Echo args -> args Sort args -> unwords (sort (words args)) Help -> "Try typing an actual command, dummy!"

This is a perfectly okay approach, but it has one drawback, which the person who asked the question also realised: when a programmer wants to extend your CLI by adding another command to it, they have to change the existing parsing and execution functions! It's a big effort and many places to remember to change just because you add a simple command.

Faulty Approach: Typeclasses

At this point, a beginner may faintly remember something about typeclasses being used when you have several different types which share some trait. So the beginner – and indeed the person who asked this question – will start writing the following skeleton code.

A command is something you can execute and get back a String .

class Command something where exec :: something -> String

Then we have a few different commands:

data Echo = Echo String instance Command Echo where exec (Echo arguments) = arguments data Sort = Sort String instance Command Sort where exec (Sort arguments) = unwords (sort (words arguments)) data Help = Help instance Command Help where exec Help = "Try typing an actual command, dummy!"

So far, this looks great! We have split up the exec function such that the user can define their own commands in separate modules. These user-defined exec functions will then automatically be called from our CLI main loop.

But... what about the parse function? Trying to figure out its type signature is hard enough already. If we're used to object-oriented programming, we might want to say,

parse :: Command cmd => String -> String -> Maybe cmd

We might think this means, "Given two strings, the parse function will maybe return something that is a command. We don't know what specific command type it will be, because that's determined by the strings passed into the function."

That is false.

In Haskell, you must know what type something has statically. You can't say "its type depends on some run-time value". (You can with various GHC extensions, but not in standard Haskell.)

What that type signature actually means is, "If this function is given two strings and asked for any command type, it must obey!"

In other words, the function does not get to decide what type its return value should have. The caller of the function decides that, and the function must work for all possible command types.

Remember that typeclass constraints on functions mean "for all", not "there exists".

Another way to phrase this is: typeclasses will not let you do dynamic dispatch. When the program is finally compiled, all types should be known. Typeclasses are an entirely static business.

Dynamic Dispatch in Haskell

The good news is that we don't need any fancy advanced language features to get what we want in Haskell. In fact, we need to go back to basics. We need to start programming with values, functions and laziness.

A Command is something that has a name and the odd property that when you supply arguments, you get an Invocation of the command. A Command can exist without arguments, though; ifconfig exists as an abstract entity outside of a specific invocation of it. You can talk about "the ifconfig command" without having to talk about a specific time you called the ifconfig command with a specific set of arguments. However, once you have arguments, you are also able to talk about the specific invocation.)

Phrased in Haskell, we say that a Command value has two properties, or fields:

The name property, which is a plain String ; and

property, which is a plain ; and That odd property that, once supplied with some arguments, you can talk about a specific invocation of the command. This property has type String -> Invocation .

data Command = Command { name :: String, , with :: String -> Invocation }

So what is an Invocation ? It is an invocation of a command, it has some arguments, and it results in a String when executed (remember that the result type could be anything more powerful if you want, like IO Result ).

data Invocation = Invocation { of :: Command, , args :: String, , exec :: String }

Assuming commands are of these types, we can write a parser that takes a set of legal commands, and maybe returns an invocation of one of these commands. This parse function simply searches through the list of legal commands and looks for one where the name matches the input name. Then it runs that command with arguments to get an invocation.

parse :: [Command] -> String -> String -> Maybe Invocation parse legalCommands cmdName arguments = fmap (`with` arguments) $ find (\cmd -> name cmd == cmdName) legalCommands

This is all well and good, but how are these commands defined? As regular Haskell values! Here's an example:

sortingCommand = Command "sort" $ \args -> Invocation sortingCommand args $ unwords (sort (words args))

The sortingCommand variable contains a Command with the name "sort" . Its with field contains a function that, when given some arguments, returns an Invocation of the sortingCommand with the supplied arguments, and when you run exec on that invocation, you get the arguments sorted.

It might look to you almost like we're "hard-coding" the result of the function at "invocation time", and not at execution time, because the exec function has become a plain String field in the Invocation value. However, because of laziness and purity, that's not the case. We could put anything in that field (including an IO action!) and it would not be executed until we actually call the exec field on the value.

Note also that the Invocation of the sorting command refers to the sorting command itself. Laziness. No problem. And if you think about it, why should that be a problem? Of course the invocation of a command should be able to refer to the command it's an invocation of!

However, we still have one problem. We said the parse function takes a list of legal commands, which means that if we use the parse function in our main loop, we will have hard-coded a list of legal commands somewhere, and the user will have to modify our code to add a command anyway. Bummer.

But of course, what do you do when you want to avoid hard-coding something? Change it to a function parameter! Throw out your main loop. Let your CLI processing function take a list of legal commands as a parameter, and let the user write the main function, which calls out to your processing function with the commands the user wants. In other words, your CLI module could have a configuration type that looks something like

data Configuration = Configuration { prompt :: String , legalCommands :: [Command] -- ... }

and then you also write a "process user input" function that takes a configuration, an input string, and returns an output string:

process :: Configuration -> String -> String process config input = let cmdName : arguments = words input parsed = parse (commands config) cmdName arguments in case parsed of Just cmd -> concat [ "Ran '", name (of cmd), "' " , "with arguments '", args cmd, "'. " , "Result: ", exec cmd ] Nothing -> "Unrecognised command."

Given this (and other convenience functions you may export from your CLI module), the user can easily write their own main loop as

import CLI (userInput, process, Configuration{..}) import CLI.BaseCommands (echo, sortCommand) import CLI.Calculator (calculator) import MyFunCommands (allcaps) settings = Configuration { prompt = "myCLI> " , commands = [echo, sortCommand, calculator, allcaps] -- ... } main = forever $ do input <- userInput settings putStrLn (process settings input)

As you see, the user barely has to do anything besides import some of your convenience functions for getting input and processing it, writing down the desired settings and putting it all together in an infinite loop.

(To give you an idea of how flexible this approach is for extensions: since commands are now regular Haskell values, the user of your code can also modify them. For example, if they wish to rename your sorting command from sort to old_sort , they can simply import the value that contains your sorting command, and change the name field before including it in the list of legal commands in the settings. We could also construct commands at run-time. There's nothing stopping you from updating the list of legal commands dynamically, by simply including newly created values of the Command type that weren't defined at compile-time.)

Summary

This was a bit of work to follow, so what did we learn from it? Five things: