To make these rough ideas more precise, I’m going to show you how we can build a little relational language inside Ruby. Of course Ruby isn’t actually relational, so we can’t just make a relation and have it work natively. Instead we’ll design some Ruby objects which allow us to solve problems in a more relational style. That’ll give us some of the flavour of relational programming without having to learn (or invent) a whole new language.

Here’s the big-picture view of how the little relational language works. The programmer’s job is to build a structure called a goal out of Ruby primitives that we provide. A goal contains a rule that places constraints on the values of variables; in this simple language these will just be equality constraints like “x = y” or “z = 4”.

Once a goal has been built, the programmer can ask the computer to pursue the goal in a particular input state. (A state is just a collection of variables and their values.) When a goal is pursued in a particular state, the programmer gets back a stream of output states that satisfy that goal. There might be zero states if the goal can’t be satisfied.

Each variable will be a unique instance of this Variable class, and variable equality will be decided by object identity. The name of a variable only affects what it looks like when inspected in IRB.

We can get all this behaviour by introducing a Variable class that doesn’t have any interesting operations at all:
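A minimal sketch of such a class might look like the following. The constructor argument and the body of `#inspect` are assumptions made for illustration; the only real requirements are that variables compare by identity (which they inherit from `Object`) and show their name when inspected.

```ruby
# A variable stores a name purely for display. It defines no equality
# of its own, so it falls back on Object's identity-based comparison.
class Variable
  def initialize(name)
    @name = name
  end

  # Show just the name when the variable is inspected, e.g. in IRB.
  def inspect
    @name.to_s
  end
end

x = Variable.new(:x)
y = Variable.new(:y)
another_x = Variable.new(:x)

p x == x          # true: a variable is equal to itself
p x == y          # false: two different variables
p x == another_x  # false: same name, but a different identity
```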

This highlights the fact that variable names have no real meaning, and are just there for convenience and readability. A variable’s name is not guaranteed to be unique, and isn’t part of the identity of the variable.

We want this behaviour even if two variables happen to have the same name. If we compare a variable called x to another variable called x, we should still be able to tell that they’re different variables:

…and we want for two different variables, x and y, to not be equal:

…we want the variable x to be equal to itself…

The only real requirement is that each variable has a unique identity. If we make two separate variables called x and y …

The first thing we need to implement, then, is a concrete representation of variables. How should they work?

Our little relational language has first-class variables, just like functional languages have first-class functions. In most languages a variable is only a convenient name for a value, but in relational languages a variable is something concrete that you can explicitly pass around a program like any other value.

This implementation recursively looks up a value in a state. If a value has an assignment in the values hash, #value_of retrieves its assigned value, and then tries to look that up. That process repeats until it reaches a value with no assignment and returns it.
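As a sketch, that recursive lookup could be written like this. The `State` constructor taking a variables array and a values hash is an assumption for illustration, as is the stripped-down `Variable` class.

```ruby
class Variable
  def initialize(name)
    @name = name
  end

  def inspect
    @name.to_s
  end
end

class State
  attr_reader :variables, :values

  def initialize(variables = [], values = {})
    @variables, @values = variables, values
  end

  # Follow assignments until we reach something with no assignment.
  def value_of(key)
    if values.has_key?(key)
      value_of values.fetch(key)
    else
      key
    end
  end
end

x, y, z, a = [:x, :y, :z, :a].map { |name| Variable.new(name) }
state = State.new([x, y, z, a], { x => y, y => z, z => 5 })

p state.value_of(x)  # 5, found by following x to y to z
p state.value_of(7)  # 7, because 7 is not an assigned variable
p state.value_of(a)  # a, because a has nothing assigned to it
```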

The value of a is just a again, because a hasn’t had anything assigned to it.

That’s even the case if we ask for the value of a variable that hasn’t been assigned yet:

Asking for the value of anything that’s not an assigned variable just returns that value:

#value_of lets us ask for the value of x and get the answer 5 by repeatedly traversing variable assignments.

Since variables can be assigned to other variables, it’s helpful to have an operation called #value_of that can find the ultimate value of a variable:

So, those are the two ways that a state develops over time: #create_variables grows the collection of declared variables, and #assign_values grows the collection of values assigned to them.

#create_variables instantiates some new variables and adds them to the array of declared variables, and #assign_values merges the new assignments with the existing assignments. There’s nothing interesting going on.
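Here is a rough sketch of a state with those two operations. It assumes that `#create_variables` takes an array of names and that the constructor accepts a variables array and a values hash; neither signature is confirmed by the text.

```ruby
class Variable
  def initialize(name)
    @name = name
  end

  def inspect
    @name.to_s
  end
end

# An immutable state: an array of declared variables plus a hash of
# the values assigned to some of them.
class State
  attr_reader :variables, :values

  def initialize(variables = [], values = {})
    @variables, @values = variables, values
  end

  # Returns a new state with extra variables declared, together with
  # the new variables themselves.
  def create_variables(names)
    new_variables = names.map { |name| Variable.new(name) }
    [State.new(variables + new_variables, values), new_variables]
  end

  # Returns a new state with the extra assignments merged in.
  def assign_values(new_values)
    State.new(variables, values.merge(new_values))
  end
end

empty = State.new
state, (x, y, z) = empty.create_variables([:x, :y, :z])
state = state.assign_values(x => y, y => z, z => 5)

p empty.variables  # [] (the original state is unchanged)
p state.variables  # [x, y, z]
p state.values     # a hash recording x as y, y as z and z as 5
```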

The new state knows that x’s value is whatever y is, and y’s value is whatever z is, and z’s value is 5. That kind of means that x’s value is 5, but the state only indirectly records that information.

The above code uses #assign_values to assign y to x, and z to y, and 5 to z, which returns a new state that remembers all those assignments.

#create_variables returns two things: a new state where those variables have been declared, and the new variables themselves. (We can assign each new variable to a separate Ruby variable by using destructuring assignment.) Note that #create_variables doesn’t change the original state; it just returns a new one.

States are immutable, so they don’t need setters, but they do need two basic operations: we must be able to declare new variables inside a state, and we must be able to assign values to those variables.

We’ll give our states some getters so that we can easily look inside and see what variables have been declared and what values, if any, they have:

A state contains an array of all the variables that exist and a hash of what values some of those variables have been assigned. (A “value” here means anything that can be compared for equality — in our case, any Ruby value. Our little language doesn’t care about the structure of values, only that they can be compared for equality.)

Once we have variables, we can package them up into states, which associate variables with possible values. This is what states look like:

To make its two arguments equal, #unify first uses #value_of to find their final values in the current state. If those values are already equal then it returns the current state; otherwise, if either value is a variable, #unify adds an assignment to the state to make that variable equal to the other value. If neither value is a variable, it falls off the end of the if - elsif expression and returns nil to indicate failure.
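A sketch of that procedure, using the same clauses that the pair-aware version later extends, might look like this. The surrounding `Variable` and `State` scaffolding is an assumption added so the example runs on its own.

```ruby
class Variable
  def initialize(name)
    @name = name
  end
end

class State
  attr_reader :variables, :values

  def initialize(variables = [], values = {})
    @variables, @values = variables, values
  end

  def assign_values(new_values)
    State.new(variables, values.merge(new_values))
  end

  def value_of(key)
    values.has_key?(key) ? value_of(values.fetch(key)) : key
  end

  # Returns a state in which a and b are equal, or nil if that is
  # impossible without undoing an existing assignment.
  def unify(a, b)
    a, b = value_of(a), value_of(b)

    if a == b
      self
    elsif a.is_a?(Variable)
      assign_values a => b
    elsif b.is_a?(Variable)
      assign_values b => a
    end
  end
end

x, y = Variable.new(:x), Variable.new(:y)
original = State.new([x, y])

p original.unify(x, x).equal?(original)  # true: nothing needed to change
state = original.unify(x, y)             # adds the assignment x => y
state = state.unify(x, 5)                # adds the assignment y => 5
p state.value_of(x)                      # 5
p state.unify(y, 6)                      # nil: y is already 5
```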

This happens because x and y are already both equal to 5, so there’s no way to make y be 6 without undoing some of the existing assignments.

If we try to make either variable equal to a different value (for example, if we try to make y equal to 6) then unification will return nil to indicate failure:

If we ask again for the value of x, we’ll see that it’s now 5 as requested:

x had already been assigned the value y, so unification has made x equal to 5 by adding an assignment that says y is 5.

If we then try to unify x with some other value, like 5, here’s what happens:

Unification has made x and y equal by adding an assignment to the state: x now has the value y.

If we unify two different variables — for example, if we try to make x equal to y — then we should get back a state that makes them equal:

No changes are required to make x and x equal, because they’re already equal, so unification does nothing and we get the original state back.

Here’s an example. If we start with an empty state, give it two variables x and y , and then try to unify x with itself in that state, then unification should succeed and return the same state:

Now that we’ve built variables and states, we can move on to implementing unification, which is the workhorse of this little language. Unification is the process of making two values equal in a particular state, and it works by adding assignments to that state.

μKanren is part of the miniKanren family of languages. The full miniKanren language has more complex primitives that can be built out of the μKanren ones I’ve shown you.

Surprisingly, these six pieces are enough to make a simple relational language called μKanren. It was presented in a paper published only two years ago, in 2013, by Jason Hemann and Daniel Friedman.

It may feel like we’ve seen a lot of code, but really there’s not much there; it could all fit on one piece of paper, and most of the work is just data structure administration. We’ve made six simple building blocks: variables, states, and the four kinds of goal — make two values equal, provide local variables to an existing goal, pursue two existing goals separately, and pursue two existing goals together.

This pursues the goal in the first state in the stream, then interleaves the results of that with the results of pursuing the goal in each remaining state.
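One way to sketch `both` and `#pursue_in_each` is shown below, with enough scaffolding to run on its own. The explicit `rescue StopIteration` is an assumption added here so that an empty input stream simply produces an empty output stream; the rest follows the description of pursuing the goal in the first state and interleaving with the results for the remaining states.

```ruby
class Variable
  def initialize(name)
    @name = name
  end
end

class State
  attr_reader :variables, :values

  def initialize(variables = [], values = {})
    @variables, @values = variables, values
  end

  def assign_values(new_values)
    State.new(variables, values.merge(new_values))
  end

  def value_of(key)
    values.has_key?(key) ? value_of(values.fetch(key)) : key
  end

  def unify(a, b)
    a, b = value_of(a), value_of(b)
    if a == b
      self
    elsif a.is_a?(Variable)
      assign_values a => b
    elsif b.is_a?(Variable)
      assign_values b => a
    end
  end
end

class Enumerator
  # Alternates between two streams until both are exhausted.
  def interleave_with(other)
    enumerators = [self, other]
    Enumerator.new do |yielder|
      until enumerators.empty?
        loop do # Kernel#loop rescues StopIteration when a stream ends
          enumerator = enumerators.shift
          yielder.yield enumerator.next
          enumerators.push enumerator
        end
      end
    end
  end
end

class Goal
  def initialize(&block)
    @block = block
  end

  def pursue_in(state)
    @block.call(state)
  end

  def self.equal(a, b)
    Goal.new do |state|
      unified = state.unify(a, b)
      Enumerator.new { |yielder| yielder.yield unified if unified }
    end
  end

  # Pursue the first goal, then the second goal in each resulting state.
  def self.both(first_goal, second_goal)
    Goal.new do |state|
      second_goal.pursue_in_each(first_goal.pursue_in(state))
    end
  end

  def pursue_in_each(states)
    Enumerator.new do |yielder|
      begin
        state = states.next
      rescue StopIteration
        next # assumption: no more input states means no more results
      end
      first = pursue_in(state)
      rest = pursue_in_each(states) # lazy: nothing runs until consumed
      first.interleave_with(rest).each { |result| yielder.yield result }
    end
  end
end

x, y = Variable.new(:x), Variable.new(:y)
state = State.new([x, y])
compatible = Goal.both(Goal.equal(x, 5), Goal.equal(y, 7)).pursue_in(state).to_a
incompatible = Goal.both(Goal.equal(x, 1), Goal.equal(x, 2)).pursue_in(state).to_a
p compatible.map { |result| [result.value_of(x), result.value_of(y)] } # [[5, 7]]
p incompatible                                                         # []
```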

It pursues the first goal in the initial state to get a stream of results, and then pursues the second goal in each of those. But again we’re relying on an unimplemented method, this time Goal#pursue_in_each , so here it is:

Pursuing that goal doesn’t produce any results, because there’s no way to make x be both 1 and 2.

And if we try to pursue two incompatible goals simultaneously, we should get an empty stream back:

If we pursue that in an empty state, we should get back one state where a is 7 and b is 5, and another where a is 7 and b is 6:

We can use both to build up more complex goals as well. Here’s a goal that says we want a to equal 7 and b to equal either 5 or 6:

Sure enough, we get one state back that says x is 5 and y is 7:

Here’s a simple example. We should be able to combine two goals that each produce one state to get one final state; let’s say we want both x to equal 5 and y to equal 7:

It combines two goals by pursuing the first goal to get a stream of states, then pursuing the second goal in each of those states and combining the many resulting streams together. So each state in the final stream satisfies both goals, because it was produced by incrementally satisfying the first goal and then the second one.

The other combining goal is called both. It’s slightly more complicated.

The inner loop of this implementation bounces back and forth between the two enumerators until they’ve both finished. It relies on Enumerator#next raising a StopIteration exception when each stream runs out, which breaks out of the inner loop and discards the finished stream. If either enumerator is infinite it just keeps going forever.
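A sketch of an implementation along those lines is shown below. It reopens Ruby’s built-in `Enumerator` class to add the method and leans on `Kernel#loop` rescuing `StopIteration`; the example streams are made up for illustration.

```ruby
class Enumerator
  # Returns a stream that alternates between this stream and the other
  # one, carrying on with whichever remains once one of them runs out.
  def interleave_with(other)
    enumerators = [self, other]

    Enumerator.new do |yielder|
      until enumerators.empty?
        # Kernel#loop rescues the StopIteration raised by #next, so a
        # finished stream drops out instead of being pushed back.
        loop do
          enumerator = enumerators.shift
          yielder.yield enumerator.next
          enumerators.push enumerator
        end
      end
    end
  end
end

letters = ['a', 'b', 'c'].each
numbers = [1, 2].each
p letters.interleave_with(numbers).to_a
# ["a", 1, "b", 2, "c"]

# Infinite streams work too, as long as we only take finitely many values.
endless_letters = ['a', 'b', 'c'].cycle
endless_numbers = (1..Float::INFINITY).each
p endless_letters.interleave_with(endless_numbers).first(6)
# ["a", 1, "b", 2, "c", 3]
```

Note that the returned enumerator consumes its two sources as it goes, so it is only meant to be iterated once.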

It would be helpful if this worked even when the streams are infinite. So if we have an infinite stream of the letters 'a', 'b' and 'c' repeating over and over, and an infinite stream of the numbers from 1 up to infinity, we should still be able to interleave them:

Notice that we get a letter, then a number, then another letter, then another number and so on, and when the numbers run out we just get the rest of the letters.

We should be able to interleave, say, a stream of letters and a stream of numbers to get a stream of both letters and numbers:

It pursues each goal separately, then combines the resulting streams by interleaving them. Unfortunately, there is no Enumerator#interleave_with method in Ruby, so we have to implement that too.
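Put together, `either` can be sketched as follows. The supporting classes are repeated in compact form so that the example runs on its own; the factory method names come from the text, while their exact bodies are assumptions.

```ruby
class Variable
  def initialize(name)
    @name = name
  end
end

class State
  attr_reader :variables, :values

  def initialize(variables = [], values = {})
    @variables, @values = variables, values
  end

  def assign_values(new_values)
    State.new(variables, values.merge(new_values))
  end

  def value_of(key)
    values.has_key?(key) ? value_of(values.fetch(key)) : key
  end

  def unify(a, b)
    a, b = value_of(a), value_of(b)
    if a == b
      self
    elsif a.is_a?(Variable)
      assign_values a => b
    elsif b.is_a?(Variable)
      assign_values b => a
    end
  end
end

class Enumerator
  # Alternates between two streams until both are exhausted.
  def interleave_with(other)
    enumerators = [self, other]
    Enumerator.new do |yielder|
      until enumerators.empty?
        loop do # Kernel#loop rescues StopIteration when a stream ends
          enumerator = enumerators.shift
          yielder.yield enumerator.next
          enumerators.push enumerator
        end
      end
    end
  end
end

class Goal
  def initialize(&block)
    @block = block
  end

  def pursue_in(state)
    @block.call(state)
  end

  def self.equal(a, b)
    Goal.new do |state|
      unified = state.unify(a, b)
      Enumerator.new { |yielder| yielder.yield unified if unified }
    end
  end

  # Pursue both goals in the same input state and interleave the results.
  def self.either(first_goal, second_goal)
    Goal.new do |state|
      first_stream = first_goal.pursue_in(state)
      second_stream = second_goal.pursue_in(state)
      first_stream.interleave_with(second_stream)
    end
  end
end

x = Variable.new(:x)
goal = Goal.either(Goal.equal(x, 5), Goal.equal(x, 6))
results = goal.pursue_in(State.new([x])).to_a
p results.map { |state| state.value_of(x) }  # [5, 6]
```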

One of them says that x is 5, and the other one says x is 6.

When we pursue that goal in an empty state, the stream we get back has two result states in it:

Here’s an example of that. We want to be able to say either x equals 5, or x equals 6:

The first combining goal is called either. It combines two goals by pursuing each of them independently and combining their output streams. Every output state from an either goal satisfies either its first or its second subgoal.

So those are the two basic goals: making values equal, and introducing local variables automatically. There are two other kinds of combining goal that we can use to plug together the basic goals to make larger and more interesting ones.

We can call Goal.with_variables with a block and get back a goal. When that goal is pursued, it automatically creates the right number of variables for our block, calls it with them, and pursues whatever goal is returned.

This is pretty easy to implement in Ruby because we can examine a block with Proc#parameters :
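A sketch of how that might look is below. `Proc#parameters` returns pairs such as `[:opt, :x]`, so mapping over it recovers the block’s parameter names. Passing those names to `#create_variables` as an array is an assumption about its signature.

```ruby
class Variable
  def initialize(name)
    @name = name
  end

  def inspect
    @name.to_s
  end
end

class State
  attr_reader :variables, :values

  def initialize(variables = [], values = {})
    @variables, @values = variables, values
  end

  def create_variables(names)
    new_variables = names.map { |name| Variable.new(name) }
    [State.new(variables + new_variables, values), new_variables]
  end

  def assign_values(new_values)
    State.new(variables, values.merge(new_values))
  end

  def value_of(key)
    values.has_key?(key) ? value_of(values.fetch(key)) : key
  end

  def unify(a, b)
    a, b = value_of(a), value_of(b)
    if a == b
      self
    elsif a.is_a?(Variable)
      assign_values a => b
    elsif b.is_a?(Variable)
      assign_values b => a
    end
  end
end

class Goal
  def initialize(&block)
    @block = block
  end

  def pursue_in(state)
    @block.call(state)
  end

  def self.equal(a, b)
    Goal.new do |state|
      unified = state.unify(a, b)
      Enumerator.new { |yielder| yielder.yield unified if unified }
    end
  end

  # Create one variable per block parameter, call the block with them
  # to get a goal, then pursue that goal in the extended state.
  def self.with_variables(&block)
    names = block.parameters.map { |_type, name| name }

    Goal.new do |state|
      new_state, variables = state.create_variables(names)
      block.call(*variables).pursue_in(new_state)
    end
  end
end

goal = Goal.with_variables { |x| Goal.equal(x, 5) }
result = goal.pursue_in(State.new).first

p result.variables                         # [x]
p result.value_of(result.variables.first)  # 5
```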

This time we didn’t have to create the variable x ourselves — when we pursue the goal and inspect the resulting states, we can see that it’s been created for us.

For example, we can take an equal goal that expects to use a variable called x , and wrap it up into a with_variables goal so that the local variable x is automatically created when we need it:

Explicitly creating variables just to pass them to a goal is a bit inconvenient, which is why we have the other kind of basic goal, called with_variables. The job of a with_variables goal is to run an existing goal and automatically provide it with as many local variables as it needs.

The code inside the Enumerator.new block yields an output state if unification was successful, otherwise it does nothing, so the resulting stream either contains a single state or is empty.
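The following sketch shows the `Goal` class together with the `equal` factory method, plus just enough of `Variable` and `State` to run it. The `@block` instance variable matches what the inspected goals show elsewhere in this article; the other details are assumptions.

```ruby
class Variable
  def initialize(name)
    @name = name
  end
end

class State
  attr_reader :variables, :values

  def initialize(variables = [], values = {})
    @variables, @values = variables, values
  end

  def assign_values(new_values)
    State.new(variables, values.merge(new_values))
  end

  def value_of(key)
    values.has_key?(key) ? value_of(values.fetch(key)) : key
  end

  def unify(a, b)
    a, b = value_of(a), value_of(b)
    if a == b
      self
    elsif a.is_a?(Variable)
      assign_values a => b
    elsif b.is_a?(Variable)
      assign_values b => a
    end
  end
end

# A goal wraps a block that takes a state and returns a stream of states.
class Goal
  def initialize(&block)
    @block = block
  end

  def pursue_in(state)
    @block.call(state)
  end

  # Unify a and b; the stream holds one state on success, none on failure.
  def self.equal(a, b)
    Goal.new do |state|
      unified = state.unify(a, b)

      Enumerator.new do |yielder|
        yielder.yield unified if unified
      end
    end
  end
end

x = Variable.new(:x)
states = Goal.equal(x, 5).pursue_in(State.new([x]))

p states.next.value_of(x)                          # 5
p Goal.equal(1, 2).pursue_in(State.new([x])).to_a  # []
```

Calling `states.next` a second time would raise `StopIteration`, because the stream only ever contains that one state.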

To construct an equal goal we have to provide the two values, a and b. When the goal is pursued in a particular state, it unifies a and b in that state and produces a stream of states (an enumerator) as its result.

If we try to retrieve another result from the stream we get a StopIteration exception, because the goal only produced one state:

So the goal has succeeded in making x equal to 5.

The first result in the stream is a state that says the value of x is 5:

(We’re using an enumerator to represent a stream here. An enumerator is just an object we can call #next on and keep getting more values out.)

If we pursue that goal in our original empty state, we get back a stream of results:

Here’s an example. We’ll make a new state that has some variables in it, and then use a Goal.equal factory method to make a goal that says x is equal to 5:

The first kind, called equal, is the only kind of goal that isn’t made out of other goals. An equal goal contains two values, and when it’s pursued in a particular state, it tries to make its two values equal by unifying them in that state.

There are only four kinds of goal in the language we’re building, and two of them are very basic.

A goal simply wraps up a block of code, and to pursue that goal in a particular state, the block is called with the state as an argument. So “goal” is really just a nice name for a Proc, but having a dedicated Goal class gives us somewhere convenient to put methods for building different kinds of goal.

Now we’re ready to introduce goals and the operations that build them. Every goal will be an instance of this class:

What can it do?

So far we haven’t seen any convincing examples of our little language doing anything good — we’ve just been making variables equal to things, which is a bit underwhelming. To do something actually useful we need the goals to be able to work with data structures instead of just opaque values.

Fortunately, instead of trying to support every possible data structure, we can just support one: pairs. If a user of this language wants to use a more sophisticated data structure, they can build it out of pairs.

Pairs

Here’s how a pair should work:

    >> pair = Pair.new(5, 9)
    => (5, 9)
    >> pair.left
    => 5
    >> pair.right
    => 9

We should be able to make a pair of two values, and then later retrieve the left or the right value from the pair. Fortunately Ruby makes that very easy for us to implement. We can just use a Struct:

    Pair = Struct.new(:left, :right) do
      def inspect
        "(#{left.inspect}, #{right.inspect})"
      end
    end

The implementation of #inspect lets us see nicely-formatted values in IRB.

Getting our language to “support pairs” means teaching unification to look inside them. For example:

    >> goal = Goal.with_variables { |x, y|
         Goal.equal(
           Pair.new(3, x),
           Pair.new(y, Pair.new(5, y))
         )
       }
    => #<Goal @block=#<Proc>>

This goal is trying to make a pair of 3 and x equal to a pair of y and a nested pair of 5 and y. It might not be obvious how to make those things equal, but our unification procedure needs to be able to do it:

    >> states = goal.pursue_in(State.new)
    => #<Enumerator: #<Enumerator::Generator>:each>
    >> state = states.first
    => #<State @variables=[x, y], @values={y=>3, x=>(5, 3)}>
    >> state.values
    => {y=>3, x=>(5, 3)}

The answer is that y has to be 3 and x has to be the pair (5, 3).

This can only work if unification knows how to look inside a pair and make it equal to another pair. That’s actually pretty easy to do. We can take our existing definition of #unify and add an extra clause to it:

    class State
      def unify(a, b)
        a, b = value_of(a), value_of(b)

        if a == b
          self
        elsif a.is_a?(Variable)
          assign_values a => b
        elsif b.is_a?(Variable)
          assign_values b => a
        elsif a.is_a?(Pair) && b.is_a?(Pair)
          state = unify(a.left, b.left)
          state.unify(a.right, b.right) if state
        end
      end
    end

If #unify finds that it’s trying to make two pairs equal, it just unifies the left elements of the pairs and then unifies the right elements in the resulting state.

We need to teach #value_of to go inside pairs as well: if we ask it for the value of a pair like (x, y), it should be able to extract x and y from the pair, find both their values, and return the pair of them as a result. It’s easy enough to do that by adding a clause to #value_of:

    class State
      def value_of(key)
        if values.has_key?(key)
          value_of values.fetch(key)
        elsif key.is_a?(Pair)
          Pair.new(
            value_of(key.left),
            value_of(key.right)
          )
        else
          key
        end
      end
    end

Another little convenience is that the state already keeps track of all the variables that have been declared within it, so instead of looking up variables explicitly by asking “what’s the value of the variable called x?”, we can just ask the state to give us the values of its first few variables:

    class State
      def results(n)
        variables.first(n).map { |variable| value_of(variable) }
      end

      def result
        results(1).first
      end
    end

The #results method finds the first n variables in the state and returns their values, and the shorthand #result method just returns the value of the state’s first variable.

That’s nice because now, whenever we have a state, we can just ask “what’s the value of your first two variables?”, without needing to handle the variables ourselves:

    >> state.values
    => {y=>3, x=>(5, 3)}
    >> state.variables
    => [x, y]
    >> state.results 2
    => [(5, 3), 3]

And it’s even nicer to just be able to ask, “what’s your result?”:

    >> state.result
    => (5, 3)

This is just a little convention: when we receive a state containing lots of variables, we can ask it for its result and get the value of its first variable.

Pairs are a little bit boring, but now that we’ve supported them in #unify and #value_of we can build other data structures out of them, and then things start to get interesting.

Lists

For example, as you may know, we can use pairs to encode an array by using a representation called a list. If we want to represent an array of three values, we can do that by making three nested pairs containing those values along with a magic value that means “empty list”. Here’s how we could do that in Ruby:

    EMPTY_LIST = :empty

    def to_list(array)
      if array.empty?
        EMPTY_LIST
      else
        first, *rest = array
        Pair.new(first, to_list(rest))
      end
    end

First we pick some special value to represent the empty list (we’re just using the symbol :empty here) and then we write a helper method to recursively convert an array into a list by building nested pairs. If the array is empty then the helper method returns the special empty list value, otherwise it extracts the first element of the array and pairs it up with the result of turning the rest of the array into a list.

If we call #to_list with an array of strings, we get back a list made from nested pairs:

    >> to_list ['a', 'b', 'c']
    => ("a", ("b", ("c", :empty)))

We’d like to be able to convert a list back into an array as well. Here’s how to do that:

    def from_list(list)
      if list == EMPTY_LIST
        []
      else
        first, rest = list.left, list.right
        [first, *from_list(rest)]
      end
    end

If #from_list is called with the empty list then it returns an empty array, otherwise it recursively converts the tail of the list to an array and prepends the head of the list. So we can turn a list back into an array too:

    >> from_list \
         Pair.new('a',
           Pair.new('b',
             Pair.new('c', EMPTY_LIST)
           )
         )
    => ["a", "b", "c"]

Now that we can turn Ruby’s native arrays into lists, and turn those back into arrays again, we have a way of talking about arrays in our goals. That means we have the power to write goals that make arrays equal. For example:

    >> goal = Goal.with_variables { |x, y, z|
         Goal.equal(
           to_list([x, 2, z]),
           to_list([1, y, 3])
         )
       }
    => #<Goal @block=#<Proc>>

This goal tries to make the array [x, 2, z] equal to the array [1, y, 3].
Here’s what happens when we pursue it:

    >> states = goal.pursue_in(State.new)
    => #<Enumerator: #<Enumerator::Generator>:each>
    >> states.next.values
    => {x=>1, y=>2, z=>3}

Pursuing the goal produces a state where x is equal to 1, y is equal to 2 and z is equal to 3. So the goal has been able to compare the two arrays and match up their elements even though we haven’t built explicit support for arrays into the language, because the arrays have been encoded using pairs.

This is particularly interesting because we can also use goals to define operations on the structure of lists, not just individual opaque values inside them. Here’s a method called #append:

    def append(a, b, c)
      Goal.either(
        Goal.both(
          Goal.equal(a, EMPTY_LIST),
          Goal.equal(b, c)
        ),
        Goal.with_variables { |first, rest_of_a, rest_of_c|
          Goal.both(
            Goal.both(
              Goal.equal(a, Pair.new(first, rest_of_a)),
              Goal.equal(c, Pair.new(first, rest_of_c))
            ),
            append(rest_of_a, b, rest_of_c)
          )
        }
      )
    end

#append builds a goal which says: the lists a and b joined together are equal to the list c. It’s not worth dwelling on the implementation details, but notice that it’s just built out of our four basic goals.

Briefly, it works by saying: either a is equal to the empty list, in which case b and c are the same, because appending something to the empty list doesn’t change it; or a and c both have the same first element, and if you append the rest of a’s elements with b, you get the rest of c. (There’s a recursive call at the bottom, which talks about appending some smaller lists.)

Don’t get too hung up on the details, but the important thing is that this is essentially a declarative definition of what it means for two lists to be appended to make a third one. It’s written in a verbose way to make it easier to read, but it’s conceptually very simple.

What’s really interesting about this definition is that there’s no real “input” or “output”; it just constrains a, b and c to have a particular relationship. This definition is really more “relational” than it is “functional”: it defines a specific relation called “append”, and any given values of a, b and c may or may not be related in this specific way.

We can use goals to query this append relation. For example, we can make a goal which says that ['h', 'e'] appended with ['l', 'l', 'o'] is equal to the variable x:

    >> goal = Goal.with_variables { |x|
         append(
           to_list(['h', 'e']),
           to_list(['l', 'l', 'o']),
           x
         )
       }
    => #<Goal @block=#<Proc>>

If we pursue that goal in an empty state…

    >> states = goal.pursue_in(State.new)
    => #<Enumerator: #<Enumerator::Generator>:each>
    >> states.next.result
    => ("h", ("e", ("l", ("l", ("o", :empty)))))
    >> from_list _
    => ["h", "e", "l", "l", "o"]

…we find that the value of x that satisfies this relation is the list encoding of the array ['h', 'e', 'l', 'l', 'o'].

But because our append goal expresses a relation between values instead of a one-way function, we can put that variable in other places. We can ask: if I append x and ['l', 'o'], and the result is ['h', 'e', 'l', 'l', 'o'], what is x?
    >> goal = Goal.with_variables { |x|
         append(
           x,
           to_list(['l', 'o']),
           to_list(['h', 'e', 'l', 'l', 'o'])
         )
       }
    => #<Goal @block=#<Proc>>
    >> states = goal.pursue_in(State.new)
    => #<Enumerator: #<Enumerator::Generator>:each>
    >> from_list states.next.result
    => ["h", "e", "l"]

The answer is that x is ['h', 'e', 'l']. This is like running a conventional append function backwards, and we get that for free. That’s pretty surprising!

Here’s something even more surprising. We can ask: if appending x and y produces ['h', 'e', 'l', 'l', 'o'], then what are x and y?

    >> goal = Goal.with_variables { |x, y|
         append(
           x,
           y,
           to_list(['h', 'e', 'l', 'l', 'o'])
         )
       }
    => #<Goal @block=#<Proc>>
    >> states = goal.pursue_in(State.new)
    => #<Enumerator: #<Enumerator::Generator>:each>
    >> states.next.results(2)
    => [:empty, ("h", ("e", ("l", ("l", ("o", :empty)))))]
    >> _.map { |list| from_list(list) }
    => [[], ["h", "e", "l", "l", "o"]]

The result state says that x is [] and y is ['h', 'e', 'l', 'l', 'o'], so it’s found values of x and y that satisfy the constraint. But there are more states in the stream! Let’s get the next one:

    >> states.next.results(2).map { |list| from_list(list) }
    => [["h"], ["e", "l", "l", "o"]]

It says that x is ['h'], and y is ['e', 'l', 'l', 'o'].

The next state says x is ['h', 'e'] and y is ['l', 'l', 'o']:

    >> states.next.results(2).map { |list| from_list(list) }
    => [["h", "e"], ["l", "l", "o"]]

In fact, if we just start the stream again and iterate over it, we get all possible combinations of values that make ['h', 'e', 'l', 'l', 'o'] when appended together:

    >> states = goal.pursue_in(State.new)
    => #<Enumerator: #<Enumerator::Generator>:each>
    >> states.each do |state|
         p state.results(2).map { |list| from_list(list) }
       end
    [[], ["h", "e", "l", "l", "o"]]
    [["h"], ["e", "l", "l", "o"]]
    [["h", "e"], ["l", "l", "o"]]
    [["h", "e", "l"], ["l", "o"]]
    [["h", "e", "l", "l"], ["o"]]
    [["h", "e", "l", "l", "o"], []]
    => nil

So not only can we run functions backwards, we can also discover multiple “inputs” that produce the desired “output”. It’s pretty interesting that we can get that behaviour just by expressing what it means to append two lists, and also interesting that this functionality emerges from the simple primitives we built.

Under the hood this is happening through a fairly mundane search strategy encoded in the primitives we built, but the useful thing is the way we’ve expressed the computation we’re interested in and the flexibility of how we’ve been able to interact with it.