2019-09-14 19:15

A lot of friends and colleagues ask me why I talk about Haskell so much. Before I learned Haskell I always used mainstream languages like Java, C and C++ - and I still like them. So how does an imperative-style developer turn into a Haskell fan? In this article I want to explain it - especially for developers with little experience in functional programming.

First I will compare functional with object-oriented programming across a few topics, because OOP is the most popular paradigm. Along with the first code example I will give you a short introduction to the syntax of Haskell, because I will use it throughout this article.

Control flow

The control flow describes how you can tell your program what to do - formulate algorithms. There are three basic control elements:

Sequence - execute code in sequential order

Repetition - execute code repeatedly

Selection - split code into branches by a condition

Object oriented programming

Sequences are statements, one by one, line by line

Repetitions are loops, like for or while statements, or recursions

Selections are if ... else or switch statements

Here is a simple example using Java that centers a text. The text is passed as an array of strings. Every line is an element of that array:

```java
void alignCenter(String[] text) {
    int maxLength = 0;
    for (String line : text) {
        if (line.length() > maxLength) {
            maxLength = line.length();
        }
    }
    for (int i = 0; i < text.length; ++i) {
        int spaceCount = (maxLength - text[i].length()) / 2;
        StringBuilder builder = new StringBuilder();
        for (int j = 0; j < spaceCount; ++j) {
            builder.append(' ');
        }
        builder.append(text[i]);
        text[i] = builder.toString();
    }
}
```

Functional programming

Sequences are chained calls

Repetitions are recursions

Selections are either pattern matching, case ... of or if ... else expressions

Here is the same example in Haskell, showing the use of pattern matching and recursion:

```haskell
alignCenter :: [String] -> [String]
alignCenter xs = alignCenter' maxLength xs
  where
    maxLength = maximum (map length xs)

alignCenter' :: Int -> [String] -> [String]
alignCenter' _ []     = []
alignCenter' n (x:xs) = (replicate spaceCount ' ' ++ x) : alignCenter' n xs
  where
    spaceCount = div (n - length x) 2
```

And here is a short version that avoids recursion by using map and a lambda function:

```haskell
alignCenter :: [String] -> [String]
alignCenter xs = map (\x -> replicate (div (n - length x) 2) ' ' ++ x) xs
  where
    n = maximum (map length xs)
```
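To see it in action, here is a quick check (the short version is repeated so the snippet is self-contained; the input strings are just my own example):

```haskell
-- The short map-based version again, so this snippet stands alone.
alignCenter :: [String] -> [String]
alignCenter xs = map (\x -> replicate (div (n - length x) 2) ' ' ++ x) xs
  where
    n = maximum (map length xs)

-- alignCenter ["Haskell", "is", "fun"]
--   == ["Haskell", "  is", "  fun"]
```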

A brief introduction to Haskell

The first line of a function definition is its type signature. The signature alignCenter :: [String] -> [String] tells us that we have a function named alignCenter which takes a list of strings as input and returns a new list of strings as output (read from left to right).

The first function determines the length of the longest line in the list of strings and calls the second function. We replaced the first loop of our OOP code with the simple expression maximum (map length xs). So how does it work? Let's take a look at the signatures of all involved functions:

```haskell
length  :: [a] -> Int
map     :: (a -> b) -> [a] -> [b]
maximum :: [a] -> a
```

The length function takes a list of whatever type and returns an Int. All lowercase types in a type signature are type variables, similar to the type T in List<T> in Java. I think it's obvious what the function does.

The map function takes two parameters, a function of type a -> b as the first and [a] as the second, and returns [b]. So what does it mean that "it takes a function as parameter"? Yes, it's true! You can pass functions as parameters - not function pointers as in C or method references as in Java, but real functions as first-class values. Functions that take functions as parameters or return new functions as results are called higher-order functions. So what does this function do? It passes every element of [a] to the a -> b function, which turns each a into a b, and collects the results into a new list [b].
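A tiny illustration of a higher-order function (applyTwice is my own name, not a library function):

```haskell
-- A higher-order function: it takes a function f as a parameter
-- and applies it twice to its second argument.
applyTwice :: (a -> a) -> a -> a
applyTwice f x = f (f x)

-- applyTwice (+3) 10        == 16
-- applyTwice (map (*2)) [1] == [4]
```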

Now let's resolve the type variables of map length xs, where xs is of type [String]:

```haskell
map :: (String -> Int) -> [String] -> [Int]
```

You need to know that String is a type synonym for [Char], which denotes a list of characters. That's the reason why it's compatible with the length function. The expression map length ["Hello", "World!"] evaluates to [5, 6]. We are interested in the length of the longest string in the list, so we pass the resulting list to maximum, which returns the biggest element of the list - here, 6.

Let's take a look at the second function:

```haskell
alignCenter' :: Int -> [String] -> [String]
```

You may have noticed the ' at the end of its name. There's nothing special about it - it's just another valid character for identifiers in Haskell, because it's a common notation in mathematics for names that are related to a prior identifier. The function works recursively: we go through every line of the text, do the transformation and prepend the transformed line to the recursive call with all remaining elements.

The line alignCenter' _ [] = [] is the base case of the recursion. It means: if the second parameter is an empty list, then return an empty list, because there's nothing to do. In this case we are not interested in the value of the first parameter, so we don't need to name it and mark it with _.

The following lines do the whole work:

```haskell
alignCenter' n (x:xs) = (replicate spaceCount ' ' ++ x) : alignCenter' n xs
  where
    spaceCount = div (n - length x) 2
```

We bind the first parameter to n and pattern match the second parameter, which is a list, against the pattern (x:xs), which means: bind the first element of the list to x and all remaining elements to xs. We replicate as many spaces as we need, concatenate them with the current element x and prepend the result with : to the result of the recursive call on the remaining elements xs. That's all.

It's important to declare the base case of the recursion before the reduction step, because patterns are tried from top to bottom and the first one that matches wins.
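A sketch of why the order matters, with a hypothetical describe function: the wildcard _ matches everything, including the empty list, so putting it first would shadow the empty-list case.

```haskell
describe :: [Int] -> String
describe [] = "empty"
describe _  = "non-empty"

-- With the two equations swapped, describe [] would return "non-empty",
-- because _ also matches the empty list.
```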

Conclusion

We saved a lot of code by using pattern matching and generic functions compared to the OOP version of the same code above. Okay, now you may complain: "Mhh, you are just hiding the whole code behind library functions like replicate, map and maximum" - and I tell you: "Yes, sure! Because I don't need to write the same for loops hundreds of thousands of times!". To be honest, the Java code could use something like leftPad to replicate the spaces, but that's a very concrete function, built specifically to pad strings, nothing else. In functional programming you are able to abstract common loop patterns in a simple way for tasks like mapping, filtering, folding and unfolding. In OOP you won't achieve a comparably elegant solution without a lot of boilerplate like interfaces behind the scenes and built-in syntactic sugar.

Concepts

The concepts describe the basic idea of how to build up an application. How are code, data and their interaction represented in the respective paradigm?

Object oriented programming

OOP introduced the concepts of interfaces, classes, inheritance and objects, which consist of fields as data and methods as code. The methods operate on the fields to mutate the object's state.

Functional programming

The core concept is the function. You can do a lot more with functions than with methods in OOP, for example:

pass functions to other functions

return new functions as result of function evaluation

compose two functions to build a new function

partially apply a function to build a new function
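The last two points can be sketched in a few lines - composition with the . operator and partial application (longestLine is my own name for this example):

```haskell
-- Partial application: `map length` is itself a new function
-- of type [String] -> [Int].
-- Composition: `maximum . map length` glues two functions into one.
longestLine :: [String] -> Int
longestLine = maximum . map length

-- longestLine ["Hello", "World!"] == 6
```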

The output of a function evaluation depends only on its inputs. This means that there are no hidden mutable variables that can influence the outcome of a function, which greatly increases testability.

Data is represented by algebraic data types. In functional programming you don't put data and code together in a container like a class. You build a set of data types and a separate set of functions that operate on these types. The data types don't know which functions use them, because they know nothing about functions, and each function doesn't know that there are other functions that also operate on the same data types.

Here are a few examples of data types in Haskell, just to give you a taste of how it looks:

```haskell
data Bool = False | True

data Customer = Customer Int String

data Customer' = Customer' { customerId :: Int, customerName :: String }
```

There's always a data type name and a |-separated list of constructor names with optional parameters. The first example is straightforward. The second example has one constructor with the same name as the type, and two parameters. The last example is equivalent to the previous one, but with named parameters, which is called record syntax.

Data in Haskell is immutable, which means you can't change the name of a Customer; instead you need to create a new one with the changed name.
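A sketch of what "create a new one" looks like, using record update syntax on the Customer' type from above (rename is my own name; the deriving clause is added so the snippet is testable):

```haskell
data Customer' = Customer' { customerId :: Int, customerName :: String }
  deriving (Show, Eq)

-- Record update syntax builds a *new* value with one field changed;
-- the original value is untouched.
rename :: String -> Customer' -> Customer'
rename newName c = c { customerName = newName }

-- rename "Bob" (Customer' 1 "Alice") == Customer' 1 "Bob"
```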

Conclusion

Let's say you have a real-world problem that you need to solve. What are your first steps? You try to decompose the problem into smaller problems, and smaller problems, and so on. Then you need to describe your problem, which means that you express it in the vocabulary of the programming language of your choice.

In the case of OOP you need to discover classes together with their fields and methods, find similarities, pull them up into abstract classes and in the end derive from them to build concrete classes you can work with.

In FP you start with functions. One function for one very small problem, which works with very small types. In the best case the types contain exactly the information that your function needs, no more and no less. That ensures that both the type and the function almost never need to change, unless the problem itself changes, even when you completely refactor the rest of your application. It turns out that you also decompose your logical types or business entities into small technical types, which enables painless and safe refactorings.

Coupling

Coupling describes how strongly components depend on each other and how much they are affected by changes to one another.

Object oriented programming

Objects that communicate with each other in a direct way are tightly coupled. One way to limit the coupling is to apply principles like dependency inversion, which says you should communicate over abstractions, e.g. interfaces, instead of implementations, e.g. classes.

You define interfaces for the implementations you want to be able to exchange. To prevent big generalized interfaces you should choose only a few highly cohesive methods for one interface - that's called interface segregation. If you don't get it right, you will likely end up with dummy interface implementations in the long run - methods that throw UnsupportedOperationException or return dummy values from otherwise empty bodies.

When it comes to interface implementations, you often add abstract classes that implement parts of your interface and leave the remaining interface methods to be implemented by concrete subclasses - that's how inheritance works, which is the tightest form of coupling in OOP.

The idea of object orientation and inheritance is to bring programming closer to the real world. You all know examples like: "There are two base classes Vehicle and Ship, with derived classes like Car and Truck. But what do you do with Amphibian?" It shares characteristics with both base classes - so you would need multiple inheritance, which is a bad idea because of the diamond problem. To solve those problems, developers introduced principles like composition over inheritance, which says you should compose objects out of exchangeable components. Obviously, composition over inheritance works a bit against one of the original key concepts of OOP - inheritance itself.

As you can see, it's all about the right class and interface structure - and there are a lot more principles, patterns and anti-patterns you need to care about to come up with a good software design.

Last but not least, here is a simple example showing the use of the dependency inversion principle to loosely couple a sort algorithm to the comparison logic, using an interface as abstraction:

```java
interface Comparator<T> {
    int compare(T o1, T o2);
}

class ArcaneComparator<T> implements Comparator<T> {
    public int compare(T o1, T o2) {
        // Insert arcane compare implementation here
    }
}

class Arrays {
    static <T> void sort(T[] a, Comparator<? super T> c) {
        // Uses comparator c,
        // without knowledge of concrete implementation
    }
}
```

Functional programming

In FP you compose components instead of coupling them. Loose coupling in FP means abstracting functions by recognizing similarities, factoring out details, building higher-order functions and parameterizing them with those details.

Let's take a look at the following situation:

```haskell
sortById   :: [Customer] -> [Customer]
sortByName :: [Customer] -> [Customer]
```

There are two functions that do almost the same thing - they sort, by some criterion. So why don't we put the similarities into a new function to prevent duplication?

```haskell
data Ordering = LT | EQ | GT
...
sort :: (Customer -> Customer -> Ordering) -> [Customer] -> [Customer]

compareId   :: Customer -> Customer -> Ordering
compareName :: Customer -> Customer -> Ordering
```

or with a type synonym:

```haskell
type Compare = Customer -> Customer -> Ordering

sort :: Compare -> [Customer] -> [Customer]

compareId   :: Compare
compareName :: Compare
```

The first parameter of sort is a function of type Customer -> Customer -> Ordering, which means it takes two customers and returns LT, EQ or GT for less than, equal or greater than. So what's the difference? We factored out the criterion by which the list gets sorted. Instead of sortById we can now write sort compareId. If you still want to call it sortById you can easily do it:

```haskell
sortById :: [Customer] -> [Customer]
sortById customers = sort compareId customers
```

or

```haskell
sortById :: [Customer] -> [Customer]
sortById = sort compareId
```

The second version may seem a bit unclear to your new functional brain, so I would suggest looking at the first one. If you are interested in how the second one works, you can read about it - it's called eta conversion.

The sort function still depends on the Customer type, although that's no longer necessary, because those details were factored out - only the compare function is interested in the details of the type. So we can replace it with a type variable:

```haskell
sort :: (a -> a -> Ordering) -> [a] -> [a]
```
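This is exactly the shape of sortBy from Data.List, so a sketch of the whole idea can be built on the standard library (the Customer record and compareId are illustrative names, not from any library):

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

data Customer = Customer { customerId :: Int, customerName :: String }
  deriving (Show, Eq)

-- The factored-out detail: how to compare two customers.
compareId :: Customer -> Customer -> Ordering
compareId = comparing customerId

-- The generic sort, parameterized with the comparison function.
sortById :: [Customer] -> [Customer]
sortById = sortBy compareId

-- sortById [Customer 2 "Bob", Customer 1 "Alice"]
--   == [Customer 1 "Alice", Customer 2 "Bob"]
```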

Conclusion

We were able to express the same functionality either way. In OOP we used language features like an interface and a class that implements this interface. In FP we had, well, functions. The type a -> a -> Ordering expresses the interface, and every function that matches this type is a potential implementation of that interface.

What I've left open

Monads - because I don't want to fall into the monad tutorial fallacy

Type classes - for no reason

A few facts

Haskell is an implementation of the typed lambda calculus. Almost everything in Haskell is backed by mathematics, like category theory and type theory.

Haskell is strongly and statically typed, but you rarely have to write down types yourself, because the compiler can infer them in most cases.
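For example, the compiler infers a type even when you leave the signature out entirely (in GHCi you can ask for it with :t; addPair is my own example name):

```haskell
-- No signature written; GHC infers: addPair :: Num a => (a, a) -> a
addPair (x, y) = x + y

-- addPair (3, 4) == 7
```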

Haskell is faster than you think, even if you can't imagine that because of its abstractness and immutability.

Final words

For me personally, functional programming feels much cleaner than object-oriented programming.

You can write the same functionality

more abstractly

in less code

with less boilerplate

and you get

higher maintainability

higher testability

more fun

Thank you for not TLDR-quitting!