Rodney Brooks’s paper “Intelligence Without Representation” was published in 1991. More than 5,000 citations speak for the impact this paper still has on robotics and the field of Artificial Intelligence (AI). I highly recommend reading that paper, since our recently published paper, “The Evolution of Representation in Simple Cognitive Networks,” is a (somewhat belated) direct response to the opinions offered in that piece.

The problem Brooks addresses deals with this question: “How can we make intelligent robots?” In the 1990s, the field of AI struggled with giving robots internal models of the world. Imagine a robotic chicken looking for food. It will look around until it sees a grain and walk towards it. If a fence separates the two, the chicken has to turn away to circumvent the fence. This is the crucial moment! If the chicken has the ability to memorize something, it will be able to change its state from “search food” to “circumvent fence”. Once the fence is circumvented, the chicken can change its state back and continue towards the grain. One could say that the chicken has at least some simple internal model of the world that allows it to solve the task. A robotic chicken that only reacts to sensor inputs, without memory and without internal states, will struggle. As long as the chicken doesn’t see the grain, it will search for it. Once it encounters the fence, the chicken needs to turn away, but that means that the chicken will lose the grain from view. This, on the other hand, triggers the search behavior, which will make the chicken look around and rediscover the food and the fence, which in turn lets the chicken try to circumvent the fence, which will cause it to lose sight of the grain, and so forth. There is a really good chance that a robot chicken without internal states and memory cannot solve this task. While this chicken example is actually fairly simple to solve (and one could easily program the chicken to deal with this world), more complex worlds require more complex internal models. And this is where AI research was stuck in the 90s: How do you program internal models into machines to make them function in complex environments? As it turns out, people aren’t very good at designing such models, which may explain the lack of progress in the field.
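To make the distinction concrete, here is a minimal sketch of the two kinds of controllers. The state names and sensor predicates (like `sees_grain`) are hypothetical illustrations of the idea, not taken from Brooks’s paper or from ours:

```python
# Sketch: a stateful controller (simple internal model) versus a purely
# reactive one. All names are illustrative, not from the papers discussed.

def stateful_chicken(sensors, state):
    """One control step; returns (action, new_state)."""
    if state == "search_food":
        if sensors["sees_fence"]:
            return "turn_away", "circumvent_fence"   # remember the goal
        if sensors["sees_grain"]:
            return "walk_to_grain", "search_food"
        return "look_around", "search_food"
    if state == "circumvent_fence":
        if sensors["fence_cleared"]:
            return "walk_to_grain", "search_food"    # resume the original goal
        return "follow_fence", "circumvent_fence"

def reactive_chicken(sensors):
    """Memoryless controller: the action depends on current sensors only."""
    if sensors["sees_fence"]:
        return "turn_away"      # ...which loses sight of the grain,
    if sensors["sees_grain"]:
        return "walk_to_grain"
    return "look_around"        # ...which rediscovers grain and fence, ad infinitum
```

The stateful version carries the “circumvent fence” intention across time steps, so losing sight of the grain does not reset it; the reactive version is doomed to loop through the cycle described above.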

To cut this Gordian knot, Brooks suggested that it is possible to do away with all these internal models (representations) by using a “subsumption architecture” that simply reacts to the world (while being hierarchically layered). In the chicken example, you could imagine the chicken moving sideways, never losing track of the grain, and thereby solving the task. Brooks showed that a number of interesting tasks could be solved using the subsumption architecture, and stated in his paper that since these machines were not using “representations”, one could get on with life perfectly well without them. However, it is clear that this research does not imply that people don’t use representations (or that they don’t play a role in human cognition). It implies that with the right computational architecture (one that exploits the features of the world optimally, that is, one that uses the world as its own best representation), you could do just fine.
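The subsumption idea can be sketched as a stack of purely reactive behaviors in which higher-priority layers override lower ones. This is a toy illustration with hypothetical layer names; Brooks’s actual robots wired layers together at the signal level, with suppression and inhibition mechanisms:

```python
# Sketch of a subsumption-style controller: layered reactive behaviors,
# no internal state carried between steps. Layer names are hypothetical.

def layer_wander(sensors):
    return "look_around"                      # layer 0: default behavior

def layer_approach(sensors):
    return "walk_to_grain" if sensors["sees_grain"] else None

def layer_sidestep(sensors):
    # Slide around the obstacle while keeping the grain in view.
    if sensors["sees_fence"] and sensors["sees_grain"]:
        return "step_sideways"
    return None

def subsumption_step(sensors, layers):
    """The highest-priority layer that fires subsumes the rest."""
    for layer in layers:                      # ordered highest priority first
        action = layer(sensors)
        if action is not None:
            return action

layers = [layer_sidestep, layer_approach, layer_wander]
```

Note that each step depends only on the current sensor readings: the world itself serves as the controller’s only “memory”.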

This is where our research begins. In philosophy, you can find discussion after discussion about the nature of representations, going all the way back to Aristotle. Reality is perceived through representations, and it is not clear whether the representations in our brain are symbolic or sub-symbolic, how representations are grounded, how they can be manipulated, or how they are used in cognition. But all these questions hinge on the nature of representations. To bypass this messy discussion, we tried something else. Instead of trying to figure out what representations really are, we defined a measure to quantify the information content of representations. This measure works regardless of the nature of the representations (be they symbolic or sub-symbolic). We quantify how much information the computational units of the brain have about the environment (the “world”) that is not at the same time available from looking directly at the sensors. Representation, defined information-theoretically in this way, is what you know about the world without having to look at it. Technically, this means that from the information the brain has about the world, you have to subtract the information the sensors provide. Looking at the thing your brain is supposed to model is cheating: sensory information does not count as representation.
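One way to make this subtraction precise (my gloss here; the paper gives the exact definition) is as a conditional mutual information, R = I(W; B | S): the information the brain states B share with the world states W once the sensor states S are given. For discrete states, a minimal frequency-count estimator could look like this; the variable names and toy data are illustrative, not from the paper’s experiments:

```python
# Sketch: estimating R = I(W; B | S) from observed (world, brain, sensor)
# triples, in bits. Frequency counts stand in for the true probabilities.
from collections import Counter
from math import log2

def cond_mutual_info(samples):
    """I(W;B|S) from a list of (w, b, s) tuples."""
    n = len(samples)
    p_wbs = Counter(samples)
    p_ws = Counter((w, s) for w, b, s in samples)
    p_bs = Counter((b, s) for w, b, s in samples)
    p_s = Counter(s for w, b, s in samples)
    total = 0.0
    for (w, b, s), c in p_wbs.items():
        # p(w,b,s) * log2[ p(w,b,s) p(s) / (p(w,s) p(b,s)) ]
        total += (c / n) * log2((c * p_s[s]) / (p_ws[(w, s)] * p_bs[(b, s)]))
    return total
```

If the brain states merely copy the sensors, R is zero: everything the brain knows about the world is already visible at the sensors. R becomes positive only when the brain holds information about the world that the sensors do not currently provide, which is exactly the “knowing without looking” in the definition above.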

Once we had defined how to quantify representations, we decided to evolve a brain to solve a task that could only be solved using representations. This task is a temporal-spatial integration task, in which a virtual agent has to catch small blocks thrown at it while avoiding the large ones. However, the sensors of this agent are set up in such a way that the size and direction of the blocks can only be ascertained by observing them over multiple time steps.

We then used artificial neural networks (ANNs) and Markov networks to evolve controllers to solve this task. While there are no marked differences between those two implementations, we found that agents evolve to solve this task quite well, and that we can indeed see a clear correlation between the agents’ ability to solve the task and the mathematical measure of representation we defined. I can’t possibly explain all the details and neat specific findings here, and in any case you should read the paper and not just this summary. But I am going to elaborate on the implications of this work in the next paragraph.

What does this all mean? Instead of sitting down and programming a machine that solves a task that requires representations, we sat down and described the world we wanted the agent to function in. Evolution, in the end, took care of the rest. There was no need to define and code an internal model, the very thing that got AI research stuck in the 90s. The agents we used evolved this internal model and the necessary representations without us asking for it. Instead of avoiding representations, these agents simply evolved the computational infrastructure necessary to form representations, and then to use them: intelligence WITH representations was the natural outcome of this process.

But I would like to go even further. We had a serious discussion with one of the paper’s referees, who insisted that subsumption architecture machines could solve the temporal-spatial integration task we used in our paper just as well, and that, “clearly”, those subsumption architecture machines would have zero representation, so that as a consequence our measure must be wrong. We showed in one of our figures that this claim must be bogus. A purely reactive machine could be lucky and have a strategy that performs better than average, but it could never solve the task perfectly. To do that, you have to look at a block in the light of how you remember it. And from this joining of memory and experience, you can plan.

This test has a profound implication: if a machine that uses the subsumption architecture were to solve the task, we would have to register non-zero representation, indicating that the machine does in fact model processes internally. Just because you do not intend to use an internal model, and just because you claim not to use representations, doesn’t mean that they are not there. Subsumption architecture machines do have internal states and send information back and forth between computational subunits, which could in turn serve as representations. What I suggest is that Brooks’s machines may not have been intended to use representations, but when you apply our mathematical measure of representation, you may find them lurking there after all. The claim that you can have “Intelligence without Representation” really requires that you quantify representation to back it up. Historically, this never happened, of course. Tentatively, it is becoming possible now, thanks to our work. What remains is this: we aren’t very good at programming internal models into machines, and even when you try your best to avoid them (as in the subsumption architecture), machines seem to find a way to represent anyway, just as evolution does.

Guest Blogger: Arend Hintze