An interdisciplinary team of researchers at IBM has presented a paper at the SC09 supercomputing conference describing a milestone in cognitive computing: the group's massively parallel cortical simulator, C2, can now simulate a brain with about 4.5 percent of the cerebral cortex capacity of a human brain, and significantly more capacity than a cat's brain.

No, this isn't yet another example of Kurzweil-style guesstimating about how many "terabytes" of storage a human brain has. Rather, the authors quantify brain capacity in terms of numbers of neurons and synapses. The simulator, which runs on the Dawn Blue Gene/P supercomputer with 147,456 CPUs and 144TB of main memory, simulates the activity of 1.617 billion neurons connected in a network of 8.87 trillion synapses. The model doesn't yet run in real time, but it does simulate a number of aspects of real-world neuronal interactions, and the neurons are organized with the same kinds of groupings and specializations as a mammalian cortex. In other words, this is a virtual mammalian brain (or at least part of one) inside a computer, and the simulation is good enough that the team is already starting to bump up against some of the philosophical issues that cognitive scientists have raised about such models over the past few decades.
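A quick back-of-envelope check puts these figures in perspective. The neuron, synapse, and memory numbers below come straight from the paper as reported above; the per-synapse memory budget and the implied full-human-cortex neuron count are simple arithmetic consequences, not figures the authors state:

```python
# Figures reported for the C2 run on Dawn Blue Gene/P.
neurons = 1.617e9        # simulated neurons
synapses = 8.87e12       # simulated synapses
memory_bytes = 144e12    # 144 TB of main memory

# Rough memory budget per synapse (synaptic state dominates the footprint).
bytes_per_synapse = memory_bytes / synapses
print(f"{bytes_per_synapse:.1f} bytes per synapse")   # roughly 16 bytes

# Neuron count implied by the "about 4.5 percent of a human cortex" claim.
implied_human_neurons = neurons / 0.045
print(f"{implied_human_neurons:.2e} neurons at full human scale")
```

The roughly 16 bytes per synapse is only an upper bound on the actual data structure size, since the same 144TB also holds neuron state, operating system overhead, and working buffers.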

In a nutshell, when a simulation of a complex phenomenon (brains, weather systems) reaches a certain level of fidelity, it becomes just as difficult to figure out what's actually going on in the model—how it's organized, or how it will respond to a set of inputs—as it is to answer the same questions about a live version of the phenomenon being modeled. So building a highly accurate simulation of a complex, nondeterministic system doesn't mean that you'll immediately understand how that system works. It just means that instead of having one thing you don't understand (at whatever level of abstraction), you now have two things you don't understand: the real system, and a simulation of it that has all of the complexities of the original plus an additional layer of complexity associated with the model's implementation in hardware and software.

The more faithful the simulation gets, the bigger an issue this becomes. The researchers allude to it in section 3.2.2 of the paper, when they describe a measurement tool they call the "BrainCam."

"When combined with the mammalian-scale models now possible with C2," they write, "the flood of data can be overwhelming from a computational (for example, the total amount of data can be many terabytes) and human perspective (the visualization of the data can be too large or too detailed)."

The problem described above doesn't mean that accurate simulations are worthless, however. You can poke, prod, and dissect a brain simulation without any of the ethical or logistical challenges that arise from doing similar work on a real brain. The IBM researchers endowed the model with checkpoint-based state-saving capabilities, so that the simulation can be rewound to saved states and then moved forward again under different conditions. They also have a facility for generating MPEG movies of different aspects of the virtual brain in operation—movies that could in principle be produced by measuring a living animal's brain, but only at much lower resolution. There's even a virtual EEG, which lets the researchers validate the model by comparing its output to EEGs from real brains.
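The checkpoint-and-rewind workflow described above can be sketched in miniature. Everything here is illustrative—the class, its state layout, and the method names are inventions for this sketch, not C2's actual interface:

```python
import copy

class ToySimulation:
    """Minimal stand-in for a stateful simulator with checkpointing."""
    def __init__(self):
        self.state = {"step": 0, "potentials": [0.0, 0.0]}
        self._checkpoints = {}

    def step(self, stimulus):
        # Advance one timestep under a given input condition.
        self.state["step"] += 1
        self.state["potentials"] = [v + stimulus for v in self.state["potentials"]]

    def save_checkpoint(self, name):
        self._checkpoints[name] = copy.deepcopy(self.state)

    def rewind(self, name):
        self.state = copy.deepcopy(self._checkpoints[name])

sim = ToySimulation()
sim.step(0.5)
sim.save_checkpoint("t1")
sim.step(1.0)      # explore one condition...
sim.rewind("t1")
sim.step(2.0)      # ...then rewind and branch from the same saved state
```

The payoff is exactly what the article describes: the same starting state can be replayed under different inputs, something impossible with a living brain.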

In the end, C2 is like having a (sorta) real cortex that you don't fully understand, but that you can rewind, snap pictures of, and generally measure under different conditions so that you can do experiments on it that wouldn't be possible (or ethical) with real brains.

Scaling and the singularity

One of the major results from the paper is that C2 exhibits "weak scaling": as the total amount of memory in the system grows, the number of neurons and synapses that can be simulated grows roughly linearly with it. This is important, because it means that a future version of Blue Gene with substantially more memory (and the associated bandwidth and processing power) should be able to simulate an entire human brain.
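Weak scaling amounts to a simple proportionality, which makes the full-human-cortex extrapolation easy to run. The 144TB and 4.5 percent figures are from the article; the petabyte-scale answer is just the linear extrapolation, and says nothing about the extra compute real-time operation would also require:

```python
current_memory_tb = 144       # Dawn Blue Gene/P main memory
fraction_of_human = 0.045     # reported share of human cortex capacity

# Under weak scaling, required memory grows linearly with model size,
# so a full human cortex needs roughly 1/0.045 times the memory.
full_scale_memory_tb = current_memory_tb / fraction_of_human
print(f"~{full_scale_memory_tb:.0f} TB")   # a few petabytes
```

By this estimate, memory capacity alone needs to grow by a factor of about 22; the larger gains the article mentions would go toward closing the speed gap as well.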

The model also exhibits "strong scaling": for a model of fixed size, adding more CPUs (with their attached memory and bandwidth) speeds up the simulation, which means a sufficiently large future system should eventually be able to simulate a cortex in real time.