TL;DR

How to solve the Advent of Code 2018, Day 2 puzzles in the Clojure REPL and have some fun with exploratory programming. If you’ve never done this, it’s a great opportunity to try it: it’s a liberating experience ;-) (BTW, here’s an online REPL if you would like to follow along.)

Part 1 — the checksum

Puzzle description

Like most software developers I don’t have free time and didn’t plan to take part in Advent of Code this year. But my daughter came to me crying, telling me that Santa is in trouble and the Elves can’t make it… (I didn’t believe in Santa for a long time, but now that my daughter knows he’s real, I do too, and he’s in trouble!)

If you haven’t read the description of the puzzle, then just stop and do that right now…

How about that! This is a user story! I wish all my stories at work were defined like that! ;-)

If you still haven’t read the puzzle description, here’s a distilled version:

Calculate the checksum of the input strings by multiplying the number of strings that contain a tuple by the number of strings that contain a triple. A string contains a tuple if at least one letter occurs in it exactly twice. A string contains a triple if at least one letter occurs in it exactly three times.

In short: checksum = count(strings with a tuple) * count(strings with a triple).

Building the solution in the Clojure REPL

Let’s start with some basics and find out how many pairs and triples there are in the sample data from the puzzle description. To do this, we can start by going over all the strings and calculating the frequencies of the chars:

(def part1-data
  ["abcdef" "bababc" "abbcde" "abcccd" "aabcdd" "abcdee" "ababab"])

(map frequencies part1-data)

;; ({\a 1, \b 1, \c 1, \d 1, \e 1, \f 1} {\b 3, \a 2, \c 1}...

Nice. We’ve turned each string into frequencies of their chars. We don’t need the chars, only their frequencies per string, so let’s extract them using the vals function:

(map vals (map frequencies part1-data))

;; ((1 1 1 1 1 1) (3 2 1) (1 2 1 1 1) (1 1 3 1) (2 1 1 2) (1 1 1 1 2) (3 3))

Now we’ve got the numbers. Next, it would be good to get rid of the 1s and of the duplicated counts per string, because we only care whether a string has a tuple and/or a triple, not how many.

I’ll start by removing the duplicates using the distinct function:

(map distinct (map vals (map frequencies part1-data)))

;; ((1) (3 2 1) (1 2) (1 3) (2 1) (1 2) (3))

Works fine, but with all this nesting the code loses readability. Let’s fix that by using the thread-last macro (->>), which passes the result of each expression as the last argument of the next one:

(->> (map frequencies part1-data)
     (map vals)
     (map distinct))

The result is the same, but with nesting turned into a pipeline.
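To see what the macro does on a smaller scale, here is a minimal sketch that writes one computation both nested and threaded. The names nested-result and threaded-result are made up for this illustration only.

```clojure
;; Nested: has to be read inside-out (filter, then map, then reduce).
(def nested-result
  (reduce + (map inc (filter odd? [1 2 3 4 5]))))

;; Threaded: each result becomes the LAST argument of the next form.
(def threaded-result
  (->> [1 2 3 4 5]
       (filter odd?)
       (map inc)
       (reduce +)))

;; Both go (1 3 5) -> (2 4 6) -> 12
(= nested-result threaded-result)
;; true
```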

Before going to the next step a short note on the syntax of anonymous functions in Clojure. The following two expressions are equivalent and return a function that can be called:

(fn [x y] (+ x y))
;; is the same as:
#(+ %1 %2)

In the short form of an anonymous function, #(… %1 %2), the percent sign refers to the positional arguments of the function without giving them names, as the longer form does.
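A quick way to convince yourself is to call both forms in the REPL. This is only a sketch; add-long and add-short are hypothetical names introduced for the comparison.

```clojure
;; Hypothetical names, used only for this illustration.
(def add-long (fn [x y] (+ x y)))
(def add-short #(+ %1 %2))

(add-long 1 2)
;; 3
(add-short 1 2)
;; 3

;; With a single argument, % and %1 are interchangeable.
(map #(* % %) [1 2 3])
;; (1 4 9)
(map #(* %1 %1) [1 2 3])
;; (1 4 9)
```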

Ok, back to the puzzle… We’ve got rid of the duplicates, but we still have numbers other than 2 and 3. Therefore, each group of numbers needs some filtering:

(->> (map frequencies part1-data)
     (map vals)
     (map distinct)
     ;; (map #(... %) data-from-previous-step)
     (map #(filter (fn [x] (or (= 2 x) (= 3 x))) %1)))

;; (() (3 2) (2) (3) (2) (2) (3))

The filtering did the job. The first result is empty because the first string’s counts were all 1s. In this step we applied an anonymous function #(… %) to filter each group from the previous step, using another anonymous function (fn [x] …) as the predicate; the inner one uses the long form because #() literals can’t be nested. Here we pass only one argument, so we use %1, and in our case it’s a sequence of numbers.

Although the filtering isn’t very complicated, I’ll extract it to a separate function to make the pipeline more readable:

(defn only-twos-and-threes [numbers]
  (filter #(or (= 2 %1) (= 3 %1)) numbers))
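A Clojure set can be called as a function: it returns the element when it is a member and nil otherwise. That means the same filter can be written with a set literal as the predicate. The sketch below shows that alternative; the name only-twos-and-threes-alt is made up here so it doesn’t clash with the function above.

```clojure
(defn only-twos-and-threes-alt [numbers]
  ;; #{2 3} acts as the predicate: (#{2 3} 2) => 2, (#{2 3} 1) => nil
  (filter #{2 3} numbers))

(only-twos-and-threes-alt [3 2 1])
;; (3 2)
(only-twos-and-threes-alt [1])
;; ()
```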

At this stage we’re very close to the final solution. We’ve marked if a string has 2s and/or 3s, but the structure is a bit inconvenient for counting. It would be easier to process just a flat sequence of numbers:

(->> (map frequencies part1-data)
     (map vals)
     (map distinct)
     (map only-twos-and-threes)
     (flatten))

;; (3 2 2 3 2 2 3)

Simple as that!

There are many ways to do the counting of twos and threes. To avoid reinventing the wheel we’ll reuse the frequencies function:

(->> (map frequencies part1-data)
     (map vals)
     (map distinct)
     (map only-twos-and-threes)
     (flatten)
     (frequencies))

;; {3 3, 2 4}

Full solution

We already know how to get only the values of a map, so let’s do that and pass them to the multiplication function:

(->> (map frequencies part1-data)
     (map vals)
     (map distinct)
     (map only-twos-and-threes)
     (flatten)
     (frequencies)
     (vals)
     (apply *))

;; 12

It seems to work! The whole transformation pipeline to calculate the checksum was created in the Clojure REPL.

Of course, it’s not the most efficient solution (if you came up with some nice solution, share it in the comments!), but it was a nice experience to work it out interactively in the REPL.
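As one possible alternative, here is a sketch that follows the checksum formula more literally: count the strings with a tuple, count the strings with a triple, and multiply. The helper names has-count? and checksum are my own, not part of the walkthrough above. A side effect of counting separately is that an input with no triples at all yields 0, whereas the pipeline would multiply only the counts that happen to be present.

```clojure
(defn has-count? [n s]
  ;; true when some letter occurs exactly n times in s
  (boolean (some #{n} (vals (frequencies s)))))

(defn checksum [ids]
  (* (count (filter #(has-count? 2 %) ids))
     (count (filter #(has-count? 3 %) ids))))

(checksum ["abcdef" "bababc" "abbcde" "abcccd" "aabcdd" "abcdee" "ababab"])
;; 12

;; One tuple, no triples: 1 * 0
(checksum ["aab"])
;; 0
```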

In the second part of the puzzle, we’ll also work in the REPL, but we’ll write some functions to make the code more readable and apply different concepts.

Part 2 — fabric boxes

Puzzle description

After calibrating “the device” (Day 1 Puzzle) and calculating the checksum for my list of box IDs, I had to help find the correct boxes, namely those whose IDs differ by exactly one letter. The puzzle boils down to the following problem:

Given a list of equal length strings, find two strings that differ by only one letter and return the common letters.

So for the sample data:

["abcde" "fghij" "klmno" "pqrst" "fguij" "axcye" "wvxyz"]

The correct answer is: fgij

As in the first part, we’ll implement the solution in Clojure!

String matching algorithms

Well, my first thought was to find the boxes using a simple brute-force solution, but I decided to do a quick search for string matching algorithms…

Searching for phrases like “string comparison algorithm” and “string matching algorithm” turned up a few candidates, of which Levenshtein distance is probably the best known. However, Levenshtein distance counts the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into another, which is more than I needed here. The best fit in this case is Hamming distance: “The number of characters that are different in two equal length strings.” Bingo! Again, the most important skill is to come up with a good search query. ;-)

Hamming distance

The examples section at Wikipedia assured me that it’s the algorithm I was looking for. I went down to the algorithm section and… I couldn’t believe it! The Python implementation was dead simple. To calculate the distance (difference) between two same-length strings, just compare the chars at each position (equal is zero, not equal is one) and sum the results. Two strings differ by exactly one char when that sum is 1.

It seems that every loop out there has its own name ;-)

I started the REPL and implemented the algorithm in Clojure:

(defn hamming-distance [s1 s2]
  (apply + (map #(if (= %1 %2) 0 1) s1 s2)))

;; #'user/hamming-distance
;; user> (hamming-distance "abc" "abc")
;; 0
;; user> (hamming-distance "abc" "def")
;; 3
;; user> (hamming-distance "abc" "acc")
;; 1

Basically the core of the AOC puzzle is a one-liner!
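One thing to keep in mind: map over two collections stops at the end of the shorter one, so the one-liner quietly ignores any extra chars when the strings have different lengths. The puzzle’s box IDs are all the same length, so this doesn’t matter here. If you wanted a guard anyway, a precondition is one option. The sketch below is my own variant (hamming-distance-checked is a made-up name), and it assumes assertions are enabled, which is Clojure’s default.

```clojure
(defn hamming-distance-checked [s1 s2]
  ;; :pre throws an AssertionError when the lengths differ
  {:pre [(= (count s1) (count s2))]}
  (apply + (map #(if (= %1 %2) 0 1) s1 s2)))

(hamming-distance-checked "abc" "acc")
;; 1

;; (hamming-distance-checked "abc" "abcd") throws an AssertionError
```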

Solution walkthrough

For sure, a bare implementation of Hamming distance is not enough to solve the puzzle. The data needs to be traversed somehow and things need to be glued together…

We know that boxes contain prototype fabric if they differ (Hamming distance) by exactly one character. Let’s write a predicate for this:

;; By convention in Clojure predicates end with ?
(defn fabric-boxes? [id1 id2]
  (= (hamming-distance id1 id2) 1))

A function to extract the common letters of the two correct box IDs:

(defn common-chars [s1 s2]
  ;; Map over chars in strings s1 and s2 at the same time.
  ;; If the chars in pair are the same return the char else nil.
  ;; The map will return a list of common chars with one nil.
  ;; (apply str chars) will pass the chars to str, which will
  ;; create a string, discarding all nils.
  (apply str (map #(if (= %1 %2) %1 nil) s1 s2)))
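To make the two steps inside common-chars visible, here they are evaluated separately on the two matching IDs from the sample data. This is just an illustration; the var paired is a name I made up for the intermediate result.

```clojure
;; Step 1: pair up the chars; mismatches become nil.
(def paired
  (map #(if (= %1 %2) %1 nil) "fghij" "fguij"))

paired
;; (\f \g nil \i \j)

;; Step 2: str contributes nothing for nil, so the gap simply disappears.
(apply str paired)
;; "fgij"

(str nil)
;; ""
```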

With these tiny functions in place, we can take an ID, try to find the matching box, and extract the common chars when it’s found. Going through a whole sequence just to get at most one result sounds like… reduction!

;; Reduction function: takes two ids and returns their common
;; chars if they are fabric boxes, else nil.
(defn find-common-chars [id1 id2]
  (when (fabric-boxes? id1 id2)
    ;; reduced function stops further reduction with wrapped result
    (reduced (common-chars id1 id2))))

(defn try-match [id boxes]
  ;; Id as initial value to scan all boxes one by one.
  ;; The accumulator passed to anonymous function #()
  ;; is discarded to return nil or common chars.
  (reduce #(find-common-chars id %2) id boxes))
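If reduced is new to you, the same pattern can be tried on plain numbers first. The sketch below uses a made-up function, first-over, which ignores the accumulator and returns either nil or a reduced value, just like the reducing function in try-match. Running it on the infinite (range) shows that the reduction really does stop at the first hit.

```clojure
(defn first-over [limit numbers]
  ;; The accumulator (_) is ignored; each step yields nil
  ;; or a reduced value that ends the reduction immediately.
  (reduce (fn [_ n] (when (> n limit) (reduced n)))
          nil
          numbers))

(first-over 3 [1 2 5 7 9])
;; 5

;; No hit: the last step returned nil, so the result is nil.
(first-over 10 [1 2 5])
;; nil

;; Terminates even on an infinite sequence.
(first-over 3 (range))
;; 4
```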

The last part is to take the ID of each box and try to find a similar ID in the rest of the boxes:

;; Destructuring of input sequence into: first id and sequence of others.
(defn scan-boxes [[id & boxes]]
  (when (seq boxes)
    (or (try-match id boxes) (recur boxes))))

;; Let's try that on sample data from the puzzle:
(def data ["abcde" "fghij" "klmno" "pqrst" "fguij" "axcye" "wvxyz"])

;; user> (scan-boxes data)
;; "fgij"
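For comparison, the simple brute-force idea mentioned earlier can be sketched with a for comprehension over every pair of IDs. This is not the approach used in this post, only an alternative under the same assumptions. scan-boxes-for is a made-up name, and the two helpers are repeated so the snippet runs on its own. It compares each pair twice (and each ID with itself), which is wasteful but harmless for the result.

```clojure
;; Helpers repeated from above so this snippet is self-contained.
(defn hamming-distance [s1 s2]
  (apply + (map #(if (= %1 %2) 0 1) s1 s2)))

(defn common-chars [s1 s2]
  (apply str (map #(if (= %1 %2) %1 nil) s1 s2)))

(defn scan-boxes-for [ids]
  ;; for is lazy, so first only realizes results until the first match.
  (first (for [a ids
               b ids
               :when (= 1 (hamming-distance a b))]
           (common-chars a b))))

(scan-boxes-for ["abcde" "fghij" "klmno" "pqrst" "fguij" "axcye" "wvxyz"])
;; "fgij"
```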

Full solution

(defn hamming-distance [s1 s2]
  (apply + (map #(if (= %1 %2) 0 1) s1 s2)))

(defn fabric-boxes? [id1 id2]
  (= (hamming-distance id1 id2) 1))

(defn common-chars [s1 s2]
  (apply str (map #(if (= %1 %2) %1 nil) s1 s2)))

(defn find-common-chars [id1 id2]
  (when (fabric-boxes? id1 id2)
    (reduced (common-chars id1 id2))))

(defn try-match [id boxes]
  (reduce #(find-common-chars id %2) id boxes))

(defn scan-boxes [[id & boxes]]
  (when (seq boxes)
    (or (try-match id boxes) (recur boxes))))

Outro

It took me more time to write this post than to implement the actual solution :-) It was a nice exercise, though, and Clojure made it really fun.

If you followed along in the REPL then you could probably see how convenient it is to experiment and explore different options.

Anyway, I hope the code is simple and was easy to follow, even if you don’t know Clojure (yet!). If you solved it in some other way, please share in the comments! The first part especially could be simplified in many ways.

Kudos to Jan Paw and Konrad Jakubiec for all the corrections and suggestions!

Resources