Sensation, perception and computation

There’s often seen to be a fight between symbolic AI and artificial neural networks (ANNs). The difference is between modeling within the grammar of a language and modeling through the training of a network of connections between cells. Both approaches have pros and cons, and you generally pick the one that you think will serve you best. If you’re writing a database-backed website you’ll probably use symbolic computation in general, although you might use an ANN in something like a recommendation system.

There is a third approach though, one I’ve fallen in love with and which unifies the other two. It’s really simple, too — it’s geometry. Of course people use geometry in their software all the time, but the point is that if you see geometry as a way of modeling things, distinct from symbols and networks, then everything becomes beautiful and simple and unified. Well, maybe a little.

Here’s an example. I’m eating my lunch, and take a bite. Thousands of sensors on my tongue, in my mouth and in my nose measure various specialised properties of the food. Each sensor contributes its own dimension to the data sent towards the brain. This is mixed in with information from other modalities; sight and sound, for example, are also known to influence taste. The brain ends up having to process tens of thousands of measurements, producing datapoints that exist in tens of thousands of dimensions. Ouch.

Somehow all these dimensions are boiled down into just a few, e.g. bitterness, saltiness, sourness, sweetness and umami. This is where models such as artificial neural networks thrive: constructing low dimensional perception out of a high dimensional mess.
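
To make that boiling-down concrete, here is a minimal sketch in Python. It is not a neural network: the projection is a hand-written average standing in for whatever a trained network would learn. The sensor layout, the noise level and the olive's taste values are all made up for illustration, and real sensors are nowhere near this tidy.

```python
import random

TASTES = ["bitter", "salty", "sour", "sweet", "umami"]

def make_sensors(n, seed=0):
    """Assign each of n hypothetical sensors to one taste quality."""
    rng = random.Random(seed)
    return [rng.choice(TASTES) for _ in range(n)]

def sense(food, sensors, noise=0.05, seed=1):
    """Simulate one noisy reading per sensor; food maps taste -> intensity."""
    rng = random.Random(seed)
    return [food[t] + rng.gauss(0, noise) for t in sensors]

def perceive(readings, sensors):
    """Collapse the high dimensional readings to one mean value per taste."""
    totals = {t: 0.0 for t in TASTES}
    counts = {t: 0 for t in TASTES}
    for reading, taste in zip(readings, sensors):
        totals[taste] += reading
        counts[taste] += 1
    return {t: totals[t] / counts[t] for t in TASTES if counts[t]}

# Invented intensities for an olive, on an arbitrary 0..1 scale.
olive = {"bitter": 0.4, "salty": 0.8, "sour": 0.2, "sweet": 0.1, "umami": 0.3}

sensors = make_sensors(10_000)     # 10,000 input dimensions
readings = sense(olive, sensors)   # one datapoint in that space
percept = perceive(readings, sensors)  # five output dimensions
```

Ten thousand numbers go in and five come out, with the noise of individual sensors mostly averaged away.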

The boiled-down dimensions of bitterness and saltiness exist in low dimensional geometry, where distance has meaning as dissimilarity. For example it’s easy to imagine placing a bunch of foods along a saltiness scale, and comparing them accordingly. This makes perfect sense — we know olives are saltier than satsumas not because we’ve learned and stored that as a symbolic relation, but because we’ve experienced their taste in the geometrical space of perception, and can compare our memories of the foods within that space (percepts as concepts, aha!).
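
Here is a rough sketch of what comparing within that space could look like. The coordinates are invented for illustration, and Euclidean distance is just one assumption about how dissimilarity might be measured.

```python
import math

DIMENSIONS = ("bitter", "salty", "sour", "sweet", "umami")
SALTY = DIMENSIONS.index("salty")

# Hypothetical remembered tastes, one point per food, on a 0..1 scale.
foods = {
    "olive":   (0.4, 0.8, 0.2, 0.1, 0.3),
    "satsuma": (0.1, 0.0, 0.5, 0.7, 0.0),
    "orange":  (0.1, 0.0, 0.4, 0.8, 0.0),
}

def dissimilarity(a, b):
    """Distance between two foods in taste space (Euclidean, by assumption)."""
    return math.dist(foods[a], foods[b])

def saltier(a, b):
    """Compare two foods along the saltiness dimension alone."""
    return foods[a][SALTY] > foods[b][SALTY]

# Place the foods along a saltiness scale, least salty first.
by_saltiness = sorted(foods, key=lambda name: foods[name][SALTY])
```

Nothing here stores "olives are saltier than satsumas" as a fact. The relation falls out of where the two points sit.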

So that’s the jump from the high dimensional jumble of a neural network to a low dimensional, meaningful space of geometry. The next jump is via shape. We can say a particular kind of taste exists as a shape in low dimensional space. For example the archetypal taste of apple is the combination of particular sweetness, sourness, saltiness etc. Some apples are sharper than others, and so you get a range of values along each such dimension accordingly, forming a shape in that geometry.
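
As a sketch of the shape idea, the simplest shape I can think of is a box: the range of values seen along each dimension. The sample apples below are invented, only three dimensions are used to keep it short, and a real region need not be a box at all.

```python
def bounding_box(samples):
    """Return the (min, max) range along each dimension of the samples."""
    return [(min(values), max(values)) for values in zip(*samples)]

def contains(box, point):
    """True if the point lies within the box on every dimension."""
    return all(lo <= x <= hi for (lo, hi), x in zip(box, point))

# Hypothetical apples as (sweet, sour, salty) points, on a 0..1 scale.
apples = [
    (0.6, 0.3, 0.00),
    (0.5, 0.5, 0.00),   # a sharper apple
    (0.7, 0.2, 0.05),
]

apple_shape = bounding_box(apples)

tastes_like_apple = contains(apple_shape, (0.55, 0.4, 0.02))
tastes_like_olive = contains(apple_shape, (0.1, 0.2, 0.8))
```

The first test point lands inside the apple shape and the second, which is far too salty, does not.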

So there we have it, three ways of representing an apple: symbolically with the word “apple”, as a taste within the geometry of perception, or in the high dimensional jumble of sensory input. These are complementary levels of representation. If we want to remember to buy an apple we’ll just write down the word, and if we want to compare two apples we’ll do it using a geometrical dimension: “this apple is a bit sweeter than that one”.

Well, I think I’m treading a tightrope here between stating the obvious and being completely nonsensical, and I’d be interested in hearing which way you think I’m falling. But I think this stuff is somehow really important for programmers to think about: how does your symbolic computation relate to the geometry of perception? I’ll try to relate this to computer music in a later blog post…

If you want to read more about this way of representing things, then please read Conceptual Spaces by Peter Gärdenfors, an excellent book which has much more detail than the summary here…