As part of my dive into the Collective Intelligence series, I’ve found myself many times taking code that was written in an imperative coding style and moving it towards a pure functional style with immutable data structures instead of ones, the application of functions instead of for loops (which is an upcoming post), and the elimination of state. This was also a major part of my talk this past weekend at the Philly.NET code camp on Functional Programming in .NET.

Recently, Chris Smith, of the F# team fame, wrote a post on F# and the PFX Round 1 in which he tackles the problem of solving the Shortest Path Problem. I had an earlier post on the subject when talking about transposing maps so that we can change a person centric map to an item one, and to also fill in the matrix of cities and distances. But, back to the problem at hand, Chris first tackles the problem using C# and its overly verbose syntax for the problem at hand, and then proceeds to solve it using F# and the ConcurrentDictionary which is part of the upcoming Parallel Extensions for .NET in the 4.0 CLR release.

Stepping back for a second, I thought to myself, the ConcurrentDictionary being a mutable collection, I decided to see if I could rewrite this using nothing but immutable collections and the write in a pure way so that IO does not take place, and instead returns the answer.

The Rewrite

One thing I like to do is sometimes go to Haskell as the language of choice in order to solve the problem functionally. Languages such as F# and Scala are more pragmatic in terms of using imperative and object oriented constructs, but my goal here is to solve in a completely functional manner without any “cheating”.

The first task was to transpose and combine the map so that I fill out the matrix of connections between cities and distances between them. In order to do this, we must fold the outer map with a seed of the incoming map, then fold the inner map, and then transpose the values onto the new outgoing map. Below is the code implementation of that:

import Prelude hiding (lookup) import Data.Map(Map(..), ( ! ), empty, findWithDefault, foldWithKey, fromList, insert, lookup) data City = Boise | LosAngeles | NewYork | Seattle | StLouis | Phoenix | Boston | Chicago | Denver deriving (Show, Eq, Ord) type CityMatrix = Map City (Map City Int) transposeCombine :: (Ord k) => Map k (Map k a) -> Map k (Map k a) transposeCombine m = foldWithKey transpose m m where transpose k1 m ' acc = let transposeInner k2 v acc ' = insert k2 (insert k1 v $ findWithDefault empty k2 acc ' ) acc ' in foldWithKey transposeInner acc m '

Now that we have defined the transposeCombine, we can now take our matrix and fill it out completely with our given cities and distances between the waypoints.

distanceBetweenCities :: CityMatrix distanceBetweenCities = transposeCombine $ fromList [ (Boise, fromList [(Seattle, 496 ),(Denver, 830 ),(Chicago, 1702 )]), (Seattle, fromList [(LosAngeles, 1141 ),(Denver, 1321 )]), (LosAngeles, fromList [(Denver, 1022 ),(Phoenix, 371 )]), (Phoenix, fromList [(Denver, 809 ),(StLouis, 1504 )]), (Denver, fromList [(StLouis, 588 ),(Chicago, 1009 )]), (Chicago, fromList [(NewYork, 811 ),(Boston, 986 )]), (StLouis, fromList [(Chicago, 300 )]), (Boston, fromList [(StLouis, 986 )]), (NewYork, fromList [(Boston, 211 )]) ]

When we call the transpose inside of our distanceBetweenCities function which for example will ensure that LosAngeles will now be linked to Denver, Phoenix and Seattle. Now that we have our complete matrix, we can now define our shortestPathBetween function which takes two cities and will return a tuple of the total distance traveled and a list of cities visited on our journey. As we visit each city, we determine if we’ve been there before, and if we haven’t, we’ll add it to our list, and if not, we’ll determine whether it is the optimal distance. If the distance is the most optimal, we’ll add, else we’ll continue on with the folding through the collection. Below is a representation in Haskell on how to do this. You’ll notice the bang notation for indexes on a map. For lists, that notation shouldn’t usually be used, but for maps is quite acceptable.

shortestPathBetween :: City -> City -> (Int, [City]) shortestPathBetween startCity endCity = shortestPaths ! endCity where shortestPaths = searchForShortestPath startCity 0 [] empty searchForShortestPath currentCity distanceSoFar citiesVisitedSoFar accMap = -- Go through each destination let visitDestinations m = foldWithKey accSearch m $ distanceBetweenCities ! currentCity where accSearch city distance = searchForShortestPath city (distance + distanceSoFar) (citiesVisitedSoFar ++ [city]) -- Insert a destination insertDestination = visitDestinations $ insert currentCity (distanceSoFar, citiesVisitedSoFar) accMap -- Is this an optimal route? in case M.lookup currentCity accMap of Nothing -> insertDestination Just (shortestKnownPath, _) -> case distanceSoFar < shortestKnownPath of True -> insertDestination _ -> accMap

Given this function, we can now calculate the distance between Seattle and New York such as the following in GHCi:

* Main > shortestPathBetween Seattle NewYork ( 3009 ,[Boise,Chicago,NewYork])

But I don’t want to leave the original language out of a possible solution as well, so let’s revisit the solution using F# as well. As always, you can find the complete code here.

The F# Solution

As above, we need to define the the transposeCombine function. In Haskell, I made judicious use of the let in and where as to not need any lambda expressions to keep my code concise and highly readable. When implementing in F#, it’s important to use operators such as the forward or the tupled forward to help us make the code a bit more concise.

#light [ < AutoOpen > ] module Operators = let ( ||> ) (x, y) f = f x y module Map = let transposeCombine m = (m, m) ||> Map.fold_left ( fun acc k1 m ' -> (acc, m ' ) ||> Map.fold_left (fun acc ' k2 v -> acc ' |> Map.add k2 (Map.add k1 v (defaultArg (acc ' |> Map.tryfind k2) Map.empty)) )) type City = | Boise | LosAngeles | NewYork | Seattle | StLouis | Phoenix | Boston | Chicago | Denver

Now that we have this in place, we can now concentrate on the implementation of our shortestPathBetween function which once again, takes a start city and end city and returns a tuple of total distance and a list of all cities visited. The only real differences between the F# version and Haskell is the elimination of the lambda expressions in the Haskell version and the reordering using the let/in and where statements. The rest should stay the same in terms of name and intent:

let shortestPathBetween startCity endCity = let rec searchForShortestPath currentCity distanceSoFar citiesVisitedSoFar accMap = let visitDestinations m = (m, distanceBetweenCities.[currentCity]) ||> Map.fold_left ( fun acc city distance -> searchForShortestPath city (distance + distanceSoFar) (citiesVisitedSoFar @ [city]) acc) match Map.tryfind currentCity accMap with | None -> accMap |> Map.add currentCity (distanceSoFar, citiesVisitedSoFar) |> visitDestinations | Some x -> let (shortestKnownPath, _) = x if distanceSoFar < shortestKnownPath then accMap |> Map.add currentCity (distanceSoFar, citiesVisitedSoFar) |> visitDestinations else accMap let shortestPaths = searchForShortestPath startCity 0 [] Map.empty shortestPaths.[endCity]

So, as you can see, we can create rather concise algorithms for finding the shortest distance between the two cities using immutable data structures, the application of functions, and the lack of state. You can find the full source for the F# solution here.

Conclusion

Where does this leave us? Well, because we’re using immutable data structures, it can lead to any number of optimizations. This is not to say that automatically because we’re having no side effects and immutable data structures, that we’ll automatically be able to parallelize our functions, as some algorithms may be harder than others to accomplish that task of mass parallelization. But, it does give us a path forward for thinking about these problems in new ways. Of course there are other algorithms to explore in this area as well which includes Dijkstra’s algorithm, A*, Floyd-Warshall among others.