
This post is part 1 in the series: “Gain confidence with Haskell!”, where I’ll be going over some reasons why I really enjoy coding in this up-and-coming language. In this post, I’ll be going over what type systems are, what makes them really powerful tools for developers, and how Haskell’s type system gives me more confidence in my code than if I were using a different language.

Numbers? Letters? It’s all the same!

When a computer runs a program, it’s ultimately just processing a bunch of 1’s and 0’s. Some 1’s and 0’s represent numbers, some represent words, and some represent lists. How do 1’s and 0’s represent our numbers, typically written with digits 0-9? Computers use the binary system, where each digit represents a power of 2, instead of a power of 10.
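To make the power-of-2 idea concrete, here is a minimal sketch in Haskell (the language this series is about). The names toBinary and fromBinary are made up for illustration and are not library functions, and the sketch only handles non-negative numbers.

```haskell
-- Hypothetical helpers for illustration; only non-negative numbers are handled.

-- Convert a number to its binary digits, most significant digit first.
toBinary :: Int -> String
toBinary 0 = "0"
toBinary n = reverse (go n)
  where
    -- Peel off the lowest binary digit, then continue with the rest.
    go 0 = ""
    go k = (if k `mod` 2 == 1 then '1' else '0') : go (k `div` 2)

-- Read binary digits back into a number: each step doubles the running
-- total (one more power of 2) and adds the next digit.
fromBinary :: String -> Int
fromBinary = foldl step 0
  where
    step total digit = total * 2 + (if digit == '1' then 1 else 0)
```

For example, toBinary 13 gives "1101", since 13 = 8 + 4 + 0 + 1, and fromBinary "1101" gives 13 back.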

Words are also represented by these binary numbers: in a simple encoding such as ASCII, each letter is stored as an 8-digit binary number.
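As a rough sketch of this in Haskell, the standard functions ord (from Data.Char) and showIntAtBase (from Numeric) can show the number behind each letter. The helper names letterBits and wordBits are invented here, and the sketch assumes each letter's code fits in 8 binary digits, which holds for plain English letters.

```haskell
import Data.Char (intToDigit, ord)
import Numeric (showIntAtBase)

-- Hypothetical helper: the 8-digit binary form of a character's code.
-- Assumes the code fits in 8 digits (true for ASCII letters).
letterBits :: Char -> String
letterBits c = replicate (8 - length bits) '0' ++ bits
  where
    bits = showIntAtBase 2 intToDigit (ord c) ""

-- A word is then just the codes of its letters, one after another.
wordBits :: String -> [String]
wordBits = map letterBits
```

Here letterBits 'A' gives "01000001" (the number 65), and wordBits "Hi" gives ["01001000", "01101001"].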

Even instructions can be represented by numbers! Imagine that the “ADD” instruction is represented by a 1, or the “MULTIPLY” instruction is represented by a 2. This is a really powerful concept that Alan Turing described in the 1930s: since everything can be represented by numbers, if a computer just knows how to manipulate numbers, it can manipulate any kind of data!
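A toy version of this idea can be sketched in Haskell. The numbering (1 for ADD, 2 for MULTIPLY) comes from the paragraph above; everything else is an assumption made for illustration, including the names, the layout of three numbers per instruction, and the use of Haskell's built-in Maybe type to flag an unknown instruction number.

```haskell
-- Hypothetical encoding, following the text: 1 means ADD, 2 means MULTIPLY.
runInstruction :: Int -> Int -> Int -> Maybe Int
runInstruction 1 a b = Just (a + b)
runInstruction 2 a b = Just (a * b)
runInstruction _ _ _ = Nothing -- unknown instruction number

-- A "program" is a flat list of numbers, read three at a time:
-- instruction number, first operand, second operand.
-- Leftover numbers that do not form a full instruction are ignored.
runProgram :: [Int] -> [Maybe Int]
runProgram (op : a : b : rest) = runInstruction op a b : runProgram rest
runProgram _ = []
```

The list [1, 2, 5, 2, 3, 4] is nothing but numbers, yet runProgram reads it as "add 2 and 5, then multiply 3 and 4" and produces [Just 7, Just 12].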

However, this revolutionary aspect of a computer is also its downfall: if everything is a number, how can programmers ensure that their functions are run with the correct kind of data? Let’s say a programmer writes an “add” function:

    function add(a, b) {
      return a + b
    }

    // outputs 7
    add(2, 5)

What’s to prevent another programmer from coming along and calling the “add” function with a letter?

    add('A', 1)

The computer certainly doesn’t know any better; it might take the letter ‘A’ represented as 65, add 1, and output 66. It might even convert 66 back into the corresponding letter, ‘B’. But any human would look at that program and recognize that it doesn’t make sense to add a letter and a number. So when we write languages that are able to interpret programs like the one written above, we dictate rules about what types of data are allowed with a given operator.
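The "65 plus 1 is 66, which is 'B'" arithmetic can be reproduced as a small Haskell sketch using the standard functions ord and chr from Data.Char. The helper name addToLetter is hypothetical.

```haskell
import Data.Char (chr, ord)

-- Hypothetical helper: treat a letter as its numeric code, add to it,
-- and convert the result back into a letter.
addToLetter :: Char -> Int -> Char
addToLetter letter n = chr (ord letter + n)
```

Here ord 'A' is 65 and addToLetter 'A' 1 is 'B'. Note that the conversions have to be spelled out: writing 'A' + 1 directly is rejected by Haskell's type checker.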

Handling nonsensical programs

Each language defines (1) what operators it uses and (2) what types of data are allowed with each operator. For example, + might be valid for two numbers (1 + 2 = 3) or two letters ('A' + 'B' = "AB"), but not for any other types. What happens if a program tries to use the operator (e.g. +) on invalid types (e.g. 1 + 'A')?

Some languages, like Python, will start running the code and throw an error when the computer tries to execute an invalid operation. In this example, the Python program will attempt to execute this code and fail when the code tries to add a number (int, or integer) and a word (str, or a “string” of characters).

    # outputs 7
    2 + 5

    # TypeError: unsupported operand type(s) for +:
    # 'int' and 'str'
    1 + 'A'

Some languages, like JavaScript, will run the code and try their very hardest to do whatever you tell them to. This example shows how JavaScript, unlike Python, supports adding letters and numbers: it simply converts the number into text before concatenating the two pieces of text.

    // outputs 7
    2 + 5

    // outputs "1A"
    1 + 'A'

    // outputs "31"
    '3' + 1

In this next example, JavaScript would run the following code just fine, even though we’re trying to get the name property of textual data, which doesn’t make sense.

    function getName(person) {
      return person.name
    }

    let personAlice = { name: "Alice", age: 20 }

    // sets `aliceName` to: "Alice"
    let aliceName = getName(personAlice)

    // sets `badName` to: undefined
    let badName = getName("Not a person!")

JavaScript tries to avoid throwing an error as much as possible, so it reports that the name property of the text "Not a person!" is undefined, a special value in JavaScript indicating a lack of value. But JavaScript still has its limits, and will throw an error as a last resort. Watch what happens when trying to get the length of the names we got:

    // outputs 5
    aliceName.length

    // TypeError: Cannot read property 'length' of undefined
    badName.length

In both the first Python case and the latter JavaScript case, the program throws an error (that is, it will crash) when it tries to do something nonsensical. Think about every time a program on your computer has crashed. It’s a terrible user experience, right? A better solution would be to have the program intercept the error and do something with it (e.g. display it in a popup) instead of crashing and losing all of your work.
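Intercepting an error rather than crashing looks roughly the same in most languages. As one hedged sketch in Haskell, the standard try and evaluate functions from Control.Exception can turn a division-by-zero crash into an ordinary value. The names safeDivide and describeResult are hypothetical, and the message string merely stands in for the popup mentioned above.

```haskell
import Control.Exception (ArithException (..), evaluate, try)

-- Run a division that may fail, turning a crash into an ordinary value:
-- Left holds the intercepted error, Right holds the result.
safeDivide :: Int -> Int -> IO (Either ArithException Int)
safeDivide a b = try (evaluate (a `div` b))

-- Hypothetical stand-in for "display it in a popup": build a message
-- for the user instead of letting the program die.
describeResult :: Either ArithException Int -> String
describeResult (Left err) = "Something went wrong: " ++ show err
describeResult (Right n) = "Result: " ++ show n
```

Calling safeDivide 10 2 yields Right 5, while safeDivide 1 0 yields Left DivideByZero instead of ending the program.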

How does this situation happen in the first place? Errors happen when programmers introduce bugs into the code. Maybe some function expected data in a certain format, and was executed with data in an invalid format. In scenarios like these, the bug comes down to a programmer forgetting to check that certain conditions are met before calling a function that requires them. Some languages try to solve this by checking these conditions for the programmer.

How to prove that your program works

Languages like JavaScript or Python are called “dynamically typed” languages, because the types of values are only checked while the program is running. Languages like Java, C++, TypeScript, and Haskell are known as “statically typed” languages, because they allow the programmer to annotate the data in a program with their types, and those types are checked before the program runs. The developer runs a separate process (called “compiling”) that converts the code into 1’s and 0’s and checks for invalid operations along the way. The more effectively the language checks for invalid operations during the compiling step, the less likely errors are to happen when the user actually runs the program.

Note: previous versions of this post used the term “untyped” code, which is a technically inaccurate description. “Untyped” code here refers to “dynamically typed” code as mentioned above.

Using the above JavaScript example, TypeScript would check all of the types during the compile step, reporting an error to the developer if something doesn’t look right.

    // A `Person` is an object with a `name` property
    // containing text and an `age` property containing
    // a number
    type Person = {
      name: string
      age: number
    }

    // The function `getName` takes in a `Person` and
    // returns a `string`, ostensibly representing the
    // Person's name.
    function getName(person: Person): string {
      return person.name
    }

    let personAlice: Person = { name: "Alice", age: 20 }
    let aliceName: string = getName(personAlice)

    // Compile-time error:
    // Argument of type 'string' is not assignable to
    // parameter of type 'Person'.
    let badName: string = getName("Not a person!")

Again, the TypeScript compiler would catch this error for the developer and, when configured to stop on type errors (for example with the noEmitOnError option), would refuse to produce a program. Since the compiler halts, the developer wouldn’t have anything to give to the user, forcing the developer to fix the error first. This workflow might seem more frustrating for the developer, but it’s so much easier to resolve problems from the start than to have something crash on users and then try to figure out what they did to make the program break.

TypeScript definitely helps a lot, but it’s built on top of JavaScript, which means you’ll often come across situations where it’s difficult to specify the types. Because of this, TypeScript has an any type that allows a variable to be any type. Consequently, you can always get around the type system, and in some cases, you might be forced to work around it. Especially when you want to use JavaScript libraries in your TypeScript code, it can be difficult to add types to code that wasn’t initially written with types in mind.

What about a statically typed language that doesn’t have the baggage of a dynamically typed language, like Java or C++? There aren’t any obvious escape hatches like TypeScript’s any type that let you write untyped and unchecked code. But both Java and C++ have a funny thing called null, a value that can stand in for any reference type (in Java) or any pointer type (in C++)!

    String x = null;
    x.toUpperCase();

The first line is valid, because null can be assigned to a variable of any reference type. The second line is valid, because toUpperCase is a valid method on String values. However, this will error when the program is run, because null does not have a toUpperCase method. This error is called a “NullPointerException”, and it is notorious for cropping up quite frequently, so much so that Tony Hoare, who introduced null references, has called them his “billion-dollar mistake”.

You might say that these are very contrived examples and could be avoided with best coding practices, and you’d be right. However, everyone has days they just want to finish their work, even if it means being a little sloppy. Using a language that has an easy way out of the type system makes it that much easier for you to do so and call it a day.

Strict rules beget confident code

On the other hand, Haskell is like the strict teacher who nitpicks at every small mistake you make, but your work comes out much better than when you first started. Even if you work primarily in other languages, after learning Haskell, you will naturally become much more disciplined about your code and start to think about ways to prevent bugs from entering your code.

If I could choose one feature in Haskell that immediately caught my attention, I would choose the fact that it exhaustively checks values. Whenever you use a value, Haskell checks to make sure you’ve handled all the possible values that could occur. For example, you might write a function that takes in a command and a number and does some operation to the number:

    applyOperation :: String -> Int -> Int
    applyOperation "plus_one" x = x + 1
    applyOperation "double" x = x * 2
    applyOperation "always_zero" _ = 0

But Haskell will notice that you haven’t handled all the cases: what if you intended to handle "minus_10" but forgot to implement it? Or what if the function is called with some nonsense string, like "turtle"? If this were written in another language, you might move on and something unexpected would happen when trying to run the function with an unexpected input. But Haskell’s compiler, with the incomplete-patterns warning turned on and treated as an error (the flags named in the message itself), will actually display an error like the following:

    MyModule.hs:2:40: error: [-Wincomplete-patterns, -Werror=incomplete-patterns]
        Pattern match(es) are non-exhaustive
        In an equation for ‘applyOperation’:
            Patterns not matched:
                [] _
                (p:_) _ where p is not one of {'a', 'd'}
                ['d'] _
                ('d':p:_) _ where p is not one of {'o'}
                ...

and force you to resolve it before working on other parts of your code. So how might this example be better written, taking advantage of Haskell? Allow me to introduce another great feature in Haskell: sum types. With sum types, you might rewrite the above as:

    data Operation
      = Plus Int
      | Double
      | Always Int

    applyOperation :: Operation -> Int -> Int
    applyOperation (Plus y) x = x + y
    applyOperation Double x = x * 2
    applyOperation (Always y) _ = y

In this example, applyOperation takes in an Operation and a number, where Operation is a type whose values take exactly one of three forms: Plus <some number>, Double, or Always <some number>. Note that some Operations have an argument (Plus and Always) and some don’t (Double). Now, if we handle all three of these cases in applyOperation, we guarantee that no one can call applyOperation with a value we didn’t expect; e.g. applyOperation "double" 4 won’t work because "double" is a String, not an Operation.

    applyOperation (Plus 10) 20 == 30
    applyOperation Double 100 == 200
    applyOperation (Always 42) 1 == 42

    -- does not compile:
    applyOperation "double" 100
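If commands still arrive as text (say, typed in by a user), one possible way to connect the two versions is to convert the text into an Operation once, at the edge of the program. The sketch below is an assumption about how that could look, not something from the original design: parseOperation is a hypothetical name, the deriving clause is added only so values can be compared and printed, and the result uses Haskell's standard Maybe type, which is either Nothing or Just a value.

```haskell
data Operation
  = Plus Int
  | Double
  | Always Int
  deriving (Eq, Show) -- added for this sketch so values can be compared and shown

applyOperation :: Operation -> Int -> Int
applyOperation (Plus y) x = x + y
applyOperation Double x = x * 2
applyOperation (Always y) _ = y

-- Hypothetical parser for the command strings used earlier. Unknown
-- commands such as "turtle" become Nothing instead of a runtime surprise.
parseOperation :: String -> Maybe Operation
parseOperation "plus_one" = Just (Plus 1)
parseOperation "double" = Just Double
parseOperation "always_zero" = Just (Always 0)
parseOperation _ = Nothing
```

The catch-all last line makes parseOperation handle every possible string, so the nonsense input is dealt with in exactly one place and applyOperation only ever sees valid operations.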

Haskell also provides excellent sum types out of the box. One of these useful types is Maybe a, whose values are either Nothing or Just x, where x is a value of type a and a stands for any type you want. This means that, in contrast with languages like Java where a function returning Person could still return null, Haskell makes it explicit that the function returns Maybe Person, where you can either have Just <a person> or Nothing, and (because of exhaustiveness checking) you have to handle the Nothing case!

    -- | Return 'Just <person>' if the person with the
    -- given name exists in the given database,
    -- otherwise return 'Nothing'
    getPersonWithName :: DB -> String -> Maybe Person
    getPersonWithName db name = ...

    -- | Get the preferred greeting for the given
    -- person.
    getGreeting :: Person -> String
    getGreeting person = "Hello " ++ getName person

    main = do
      let db = ...
      let greeting = case getPersonWithName db "Alice" of
            Just person -> getGreeting person
            Nothing -> "Could not find person named Alice"
      putStrLn greeting

You might wonder if the pain of being annoyingly explicit is worth it. I will mention that there are nice shortcuts I haven’t illustrated here that make it a little less painful, but it is still frustrating at times. And yet, it’s definitely worth it. Many a time, I would be frustrated by being forced to handle a Nothing and want to yell at my computer, “Trust me! That’s never going to happen!” But that’s exactly how bugs work. They occur in situations where the programmer didn’t expect something to happen, yet it did. It might not even be the programmer’s fault; maybe it was a valid assumption at the time, but requirements change and assumptions become invalid. When this situation comes up, so often I would look at that Nothing and realize that my team didn’t even consider that situation. This would lead to a reevaluation of the problem and we would come out with a stronger solution and a better product.
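The post doesn't say which shortcuts are meant, so the following is only a guess at the kind of thing: the standard functions maybe (from the Prelude) and fromMaybe (from Data.Maybe) handle both cases of a Maybe without writing out a full case expression. The ages list is a hypothetical stand-in for a database, and greetingFor and ageOrZero are invented names.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for the database: a list of (name, age) pairs.
ages :: [(String, Int)]
ages = [("Alice", 20), ("Bob", 31)]

-- `maybe` takes a result for Nothing, a function for Just, and the Maybe value.
-- `lookup` is a Prelude function that returns Nothing when the name is absent.
greetingFor :: String -> String
greetingFor name =
  maybe
    ("Could not find person named " ++ name)
    (\age -> "Hello " ++ name ++ ", age " ++ show age)
    (lookup name ages)

-- `fromMaybe` supplies a default value for the Nothing case.
ageOrZero :: String -> Int
ageOrZero name = fromMaybe 0 (lookup name ages)
```

Both helpers still force a decision about the missing case; they just make that decision shorter to write.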

One more thing I want to point out about Haskell is how easily it handles polymorphism. For example, if you want to write a function to get the first element in a list, you shouldn’t have to say whether it’s a list of numbers or a list of strings. And Haskell lets you do that:

    -- | Return the first element in the list, or
    -- Nothing for an empty list.
    firstElement :: [a] -> Maybe a
    firstElement [] = Nothing
    firstElement (x:_) = Just x

First, notice that we couldn’t just write the last line as firstElement (x:_) = x because Haskell would tell you that you haven’t handled the case where the list is empty. But if a list is empty, there’s no first element to get. In comes our handy Maybe type, which allows us to have something (or, in this case, Nothing) to give back when we get an empty list. But also notice how we never need to know the type of x. We simply take it out of the first position of the list and wrap it in Just; this function does not care whether x is an Int or String or MySecretType.

    firstElement [] == Nothing
    firstElement [24, 31, 9] == Just 24
    firstElement ["hello", "there"] == Just "hello"

But since firstElement claims it works for any type a, Haskell checks to make sure you’re not violating that. If you tried to write a function that doesn’t work for every a:

    describeRelationship :: a -> a -> String
    describeRelationship a b =
      if a == b
        then "Equal"
        else "Not equal"

Haskell will error saying that not every type defines the == operator, that is, has the ability to check equality between two values. You’ll have to add a constraint that a is a type whose values can be equal or not equal to each other:

    describeRelationship :: Eq a => a -> a -> String

This simple example shows that Haskell lets you use polymorphism to write flexible code, but it requires you to list your assumptions up front: in this case, that a is a type with an Eq instance. This is a general principle that follows Haskell learners the entire time: Haskell requires explicitness over implicitness, which is frustrating at times, but leads to code you can feel more confident in.
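Putting the corrected signature together with the function body gives the complete version below. The Color type is a hypothetical addition to show how a type of your own can satisfy the Eq constraint.

```haskell
describeRelationship :: Eq a => a -> a -> String
describeRelationship a b =
  if a == b
    then "Equal"
    else "Not equal"

-- Hypothetical type for illustration. Deriving Eq asks the compiler to
-- generate ==, which makes Color usable with describeRelationship.
data Color = Red | Green | Blue
  deriving (Eq)
```

With this in place, describeRelationship "a" "b" gives "Not equal" and describeRelationship Green Green gives "Equal". Passing two functions, which have no Eq instance, is still rejected at compile time.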

Is Haskell your type now?

To be clear, bad coding practices are bad coding practices. No language will solve that problem. Haskell still has an error function, which can crash your program unexpectedly. And testing is still absolutely necessary; Haskell’s compiler will check that all assumptions required of a function are satisfied, but it won’t check that it’s doing the right thing. And you can definitely write programs in Haskell that will never halt.
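Each of these caveats fits in a few lines. In the sketch below, all three functions are hypothetical examples that the compiler accepts without complaint.

```haskell
-- Type-checks, but crashes at run time when given an empty list,
-- because `error` can stand in for a value of any type.
unsafeFirst :: [a] -> a
unsafeFirst (x : _) = x
unsafeFirst [] = error "unsafeFirst: empty list"

-- Also type-checks, yet the logic is wrong: it adds 2 instead of
-- doubling. Only a test would reveal this.
double :: Int -> Int
double x = x + 2

-- Type-checks as well, but never returns a result.
loopForever :: Int -> Int
loopForever n = loopForever (n + 1)
```

Note that double 2 happens to give the right answer, 4, while double 3 gives 5. The types are satisfied either way, which is why tests remain necessary.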

However, Haskell’s type system is so restrictive that it forces you to think about the edge cases. You might not care now, but the you four months from now trying to figure out why something unexpected happened will definitely care. And sometimes, it might even make you reconsider how you’re solving the problem and point you towards a stronger solution. I didn’t even scratch the surface of everything Haskell’s type system lets you do, like type classes, monads, or type-level programming. But hopefully this sparked a bit of curiosity about the kinds of problems Haskell’s type system can solve for you. If so, stay tuned for part 2!


Update (2.5.20): Removed misleading references to “untyped”/“typed”