Archetypes and the maturation of AI

Pictured above is Sir Richard Owen’s conception of an archetypical vertebra. Not being a doctor myself, I’m not sure if it’s really the archetype, but that’s beside the point. What it represents is an ideal, a perfect version of a real thing from which aspects of that thing may be derived.

I’d like to illustrate just how an archetype differs from the instantiation of a thing. Let’s think of an apple — what color was it in your mind? Probably red, green or yellow. What color is an archetypical apple?

If we say red, then we risk opening arguments for why green apples aren’t actually apples at all. But if we say the archetype is all three of red, green and yellow, then we create a kind of absurdity, because this would imply that the instantiation of an apple should be all three colors.

What if the property of color is not part of the archetype? Does this imply that an apple would have no color at all? I think instead, it implies that the color is only a relevant property of instantiation, something that may help us living creatures decide whether a specific object is an apple, but not something so important that it must be abstracted away into the definitional archetype.

Then what is part of the definition of the archetypical apple? The shape? The taste? That it grows on a certain kind of tree? If we get scientific about it, we can talk about molecular structure, but this knowledge wasn’t necessary for ancient conceptions of an apple. We can usually distinguish between apple and pear by inspecting with our senses and brains alone, no need to dive into the microscopic. I think it makes more sense to classify an apple by multiple features; to say that an apple is red or green or yellow, and that the archetypical apple includes this “or” statement without preference for any of the values in particular.

What an archetype is, then, is a potential. A guess at the type of an object from its features, a guess subject to change. Just like AI, humans can over-fit or under-fit our data and make incorrect guesses — we’re just still generally better at identifying objects (not all objects, but many), and we can make guesses about arbitrary objects. Where an AI trained to recognize hammers is handed a screwdriver and can only say “not a hammer”, a human can say “a screwdriver”. So we still have a leg up, for now, despite common pitfalls.

To explain over-fitting and under-fitting, think of a bunch of points on a graph, scattered and disordered. Some of them are red, some are blue, in roughly equal proportion; most of the blue are above most of the red. To “fit” the data, a line or curve must be drawn to separate the points. In the above image, the green line represents over-fitting: it perfectly separates the red and blue…until a red or blue dot appears on the wrong side of the line! Then a whole new function has to be derived. On the other hand, if the black curve were simply a straight line, it would under-fit the data because too many reds or blues would end up grouped with the wrong color.
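The same effect can be sketched in a few lines of code. This is a minimal toy illustration, not the data from the image above: the point distributions, the flat threshold at 5.0, and the nearest-neighbour memorizer are all invented stand-ins. The under-fit model draws one flat line through the points; the over-fit model memorizes the training set exactly, so it scores perfectly on data it has seen, with no such guarantee on data it hasn’t.

```python
import random

random.seed(0)

def sample(n):
    """Toy data: blue points tend to sit above red ones, with overlap."""
    data = []
    for _ in range(n):
        color = random.choice(["red", "blue"])
        base = 6.0 if color == "blue" else 4.0
        point = (random.uniform(0, 10), base + random.gauss(0, 1.5))
        data.append((point, color))
    return data

train, test = sample(40), sample(40)

def underfit(point):
    """Under-fitting: one flat line at y = 5 — 'blue above, red below'."""
    return "blue" if point[1] > 5.0 else "red"

def overfit(point):
    """Over-fitting: memorize training points (1-nearest-neighbour)."""
    nearest = min(
        train,
        key=lambda d: (d[0][0] - point[0]) ** 2 + (d[0][1] - point[1]) ** 2,
    )
    return nearest[1]

def accuracy(model, data):
    return sum(model(p) == c for p, c in data) / len(data)

print("train:  under-fit", accuracy(underfit, train), " over-fit", accuracy(overfit, train))
print("test:   under-fit", accuracy(underfit, test), " over-fit", accuracy(overfit, test))
```

The memorizer gets every training point right by construction, since each point is its own nearest neighbour at distance zero; whether it also beats the flat line on the fresh test points is not guaranteed, which is precisely the over-fitting worry.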

What does this mean for archetypes? In the case of over-fitting, it can mean that we have a very specific idea of an object. Like the green curve, a psychological over-fitting sets a clear, if complex, boundary between ideas or objects. This complexity can turn into a sort of rigidity, which can show up in unexpected ways. Take a psychological example, functional fixedness, where a person can’t think of uses for an object beyond the ways they have already seen it used. This inhibits some kinds of problem solving and creativity; or, if we want to get socially aware, over-fitting our models of people can lead to rigid gender roles. In the case of under-fitting, think of a child who refers to all animals as “doggy”. She sees a cat and says “hi doggy”; sees a tiger: “look at the doggy, mama!” — in this case the broken clock is right less than twice a day. Fortunately, people naturally grow out of under-fitting; unfortunately, we can get stuck over-fitting.

When training AI, humans have to be very aware of under- and over-fitting, not just from the perspective of getting the models right, but also because these archetypical models will form a deep store of knowledge that the AI will have to pull from. I’m not even making a claim about a self-aware, conscious AI, or a super-intelligence. Think about a lawyer-AI that under-fits cases, treating all property as the same when legally it might not be; or a doctor-AI that over-fits for certain bacterial infections and misses a new mutation. These kinds of errors will only compound as AI gets more capable and builds off of the models people are making now. If a general AI is ever to come about, it will need many models to reference simultaneously, and those models need to be accurate not just against the data of the moment, but against new incoming data as well.