Feldman et al.'s Bayesian model of categorical perception (ref. 22) has been extended to account for differential perceptual distortion across multiple cues, and the enhanced model confirms that localized perceptual tension can indeed arise from differences in the distributions associated with such cues. In particular, the model reveals that cue conflicts can arise from variations in the means and/or variances of the associated distributions or, more interestingly, from unequal levels of uncertainty associated with observing the different perceptual cues. The latter is a particularly compelling result, since it indicates that perceptual tension can arise when the reliability of the information derived from alternative cues to category membership is not balanced across the different observation dimensions. For example, a humanoid robot might appear to be fully human from the cues provided by its overall facial features, but small anomalous movements in the eyes might be sufficient to increase the uncertainty associated with the category membership indicated by that particular cue, thereby giving rise to perceptual tension (and feelings of discomfort) in the viewer.
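
The effect of unequal cue reliability can be sketched with the single-category posterior mean from Feldman et al.'s model, in which the percept is a precision-weighted blend of the observed stimulus and the category mean. The function name, the two cue labels and all numerical values below are hypothetical and chosen only for illustration; the sketch is not the extended model itself.

```python
def shrunk_percept(stimulus, category_mean, category_var, noise_var):
    """Posterior mean of the intended target for ONE Gaussian category.

    The noisier the observation (larger noise_var), the more the percept
    is pulled away from the stimulus and toward the category mean.
    """
    weight = category_var / (category_var + noise_var)
    return weight * stimulus + (1.0 - weight) * category_mean


# Hypothetical values: both cues signal the same stimulus value (8.0) for a
# 'human' category with mean 10.0 and variance 1.0, but the eye cue is
# observed with more uncertainty than the face cue.
face = shrunk_percept(8.0, 10.0, 1.0, noise_var=0.25)  # reliable cue
eyes = shrunk_percept(8.0, 10.0, 1.0, noise_var=1.0)   # less reliable cue
gap = eyes - face  # the two cues are distorted by different amounts
```

Under these assumed values the two cues end up at different perceived positions even though they started from the same stimulus value, which is the kind of mismatch the text describes as perceptual tension.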

The model shows that, in order to obtain Mori's basic response curve (as illustrated in Fig. 1), it is necessary to posit a category representing a ‘target’ perception (e.g. human) with the mean of its distribution at one end of the stimulus continuum. Then, in order for categorical perception (and the associated distortion of perceptual space) to occur, it is necessary to posit a second category representing a ‘background’ perception (e.g. non-human) whose distribution overlaps that of the target. In order to preserve the more or less monotonic property of the basic response curve (i.e. a rising function that depicts low familiarity at low humanlikeness and high familiarity at high humanlikeness), the distribution for the background needs to be broader than that for the target, which is an intuitively satisfactory outcome (see Fig. 2a). If the overlap between the target and background categories is reduced, a dip in ‘familiarity’ can be observed at the category boundary (see Fig. 2b). This dip reflects a degree of unfamiliarity (and hence unpredictability) associated with the stimuli around the category boundary. However, such a dip cannot go negative (since the curve represents a probability) and does not in itself represent uncanniness. This intermediate result therefore captures the concept of ‘familiarity’ but, crucially, not Mori's notion of ‘affinity’.

Figure 2: Probability of occurrence of different stimuli given a broad ‘background’ category and a narrower ‘target’ category. a, A large overlap between target and background categories gives rise to a monotonic relationship between the value of a stimulus (horizontal axis) and the probability of occurrence of that stimulus (vertical axis). b, A smaller overlap between categories gives rise to a non-monotonic relationship.
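
The relationship between category overlap and the dip in familiarity can be sketched as a two-component Gaussian mixture. The 0 to 10 stimulus axis, the equal category priors, the means and standard deviations, and the helper names are all assumptions made for this illustration; they are not the parameter values behind Fig. 2.

```python
import numpy as np


def gaussian_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def familiarity(x, background, target, target_prior=0.5):
    """Probability density of stimulus x under a background/target mixture."""
    bg_mean, bg_sd = background
    tg_mean, tg_sd = target
    return ((1.0 - target_prior) * gaussian_pdf(x, bg_mean, bg_sd)
            + target_prior * gaussian_pdf(x, tg_mean, tg_sd))


def count_interior_dips(curve):
    """Count local minima: points where the slope turns from falling to rising."""
    slope_sign = np.sign(np.diff(curve))
    return int(np.sum((slope_sign[:-1] < 0) & (slope_sign[1:] > 0)))


x = np.linspace(0.0, 10.0, 1001)  # hypothetical humanlikeness axis

# Broad background overlapping heavily with a narrower target at the far end.
large_overlap = familiarity(x, background=(5.0, 3.0), target=(10.0, 2.0))

# Same layout with less overlap: background shifted away, target narrowed.
small_overlap = familiarity(x, background=(3.0, 2.0), target=(10.0, 1.0))

dips_large = count_interior_dips(large_overlap)
dips_small = count_interior_dips(small_overlap)
```

With these assumed settings the first curve has no interior dip, while the second shows a single dip between the two categories. In both cases the curve is a probability density and therefore stays non-negative.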

Hence, the model reveals that there are two key variables that relate to Mori's vertical ‘affinity’ axis: (i) the overall probability of occurrence of a particular stimulus and (ii) any perceptual tension that might arise from conflicting perceptual cues. Not only does this approach lead to the successful prediction of the uncanny valley response curves, it also provides an explanation for the confusion over the nomenclature for Mori's vertical axis (as described above). In the model presented here, ‘familiarity’ is defined mathematically as the probability of occurrence of a stimulus, whereas ‘affinity’ (i.e. Mori's vertical axis) is defined as a function of both ‘familiarity’ and ‘perceptual tension’. In particular, it has been found that simply subtracting a weighted measure of perceptual tension from the probability of occurrence of a stimulus predicts the appropriate behaviors rather well. Interestingly, the weighting factor effectively corresponds to the sensitivity of an observer to any perceived perceptual conflict. If the weighting factor is small or zero, the implication is that the observer does not notice (or does not care) if perceptual cues are in conflict. If the weighting factor is large, it indicates a strong sensitivity to differential cues on the part of the observer. The weighting is thus a key property of an observer, not of a stimulus.
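
The combination rule described above (affinity as familiarity minus a weighted measure of perceptual tension) can be written down directly. The three-point familiarity and tension arrays below are made-up numbers used only to show the arithmetic; they are not model output, and the function name is hypothetical.

```python
import numpy as np


def affinity(familiarity, tension, k):
    """Affinity = familiarity - k * tension, with k the observer's sensitivity."""
    return np.asarray(familiarity, dtype=float) - k * np.asarray(tension, dtype=float)


# Made-up values at three points along the stimulus axis; tension is
# concentrated at the middle point (standing in for the category boundary).
familiarity = np.array([0.2, 0.5, 0.9])
tension = np.array([0.0, 0.4, 0.0])

indifferent = affinity(familiarity, tension, k=0.0)  # observer ignores conflict
sensitive = affinity(familiarity, tension, k=2.0)    # observer strongly sensitive
```

For the indifferent observer the affinity curve is identical to familiarity, whereas for the sensitive observer the middle value drops to -0.3, showing how a non-negative familiarity curve can yield a negative affinity once tension is weighted in.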

As an illustration of the output of the model, Fig. 3 shows how varying the differential uncertainty associated with cues along two perceptual dimensions (for the distributions illustrated in Fig. 2a) gives rise to different levels of localized perceptual tension (Fig. 3a) and hence to different curves for affinity/eeriness (Fig. 3b). As can be seen, increasing the differential uncertainty between the two cues leads to an increase in perceptual tension and a decrease in the affinity function near the category boundary, with the highest level of differential uncertainty leading to negative affinity. The shapes of these curves are remarkably similar to those illustrated in Fig. 1, and the affinity measure does indeed appear to correspond to the notion of uncanniness as originally proposed by Mori.

Figure 3: Differential distortions arising from conflicting perceptual cues. a, Perceptual ‘tension’ increases at the category boundary as a function of differences in the uncertainty associated with different perceptual cues. The degree of tension is proportional to the amount of differential distortion. b, Peaks in perceptual tension give rise to dips in ‘uncanniness’. The depth of the dip is determined by the degree of perceptual tension and by the sensitivity, k, of an observer to any perceived perceptual conflict. In this illustration, k is fixed at a non-zero value.
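
The dependence of tension and affinity on differential cue uncertainty can be sketched as follows. The posterior mean over two Gaussian categories follows Feldman et al.'s formulation, but the tension measure used here (the absolute difference between the perceptual displacements of two cues that share the same categories and differ only in observation noise) is an assumption standing in for the extended model's definition, which is not reproduced in this section. All parameter values are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def displacement(s, means, variances, priors, noise_var):
    """E[T | S] - S for one cue observed with the given noise variance."""
    col = s[:, None]
    # p(c | S): under category c the stimulus is N(mean_c, var_c + noise_var).
    posterior = priors * gaussian_pdf(col, means, variances + noise_var)
    posterior /= posterior.sum(axis=1, keepdims=True)
    # E[T | S, c]: precision-weighted blend of stimulus and category mean.
    per_category = (variances * col + noise_var * means) / (variances + noise_var)
    return (posterior * per_category).sum(axis=1) - s


# Hypothetical parameters: [background, target] on a 0-10 humanlikeness axis.
means = np.array([5.0, 10.0])
variances = np.array([25.0, 1.0])
priors = np.array([0.5, 0.5])
s = np.linspace(0.0, 10.0, 501)

familiarity = (priors * gaussian_pdf(s[:, None], means, variances)).sum(axis=1)

k = 1.0         # observer sensitivity (assumed value)
noise_a = 0.25  # noise variance of the first cue (assumed value)
results = {}
for noise_b in (0.25, 1.0, 4.0):  # noise variance of the second cue
    # Assumed tension measure: mismatch between the two cues' displacements.
    tension = np.abs(displacement(s, means, variances, priors, noise_a)
                     - displacement(s, means, variances, priors, noise_b))
    results[noise_b] = (tension.max(), (familiarity - k * tension).min())
```

Under these assumptions, equal noise on the two cues gives zero tension everywhere, and widening the gap between the two noise variances raises the peak tension and lowers the minimum of the affinity curve, eventually below zero.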

As mentioned above, the other key aspect of Mori's original uncanny valley hypothesis was that a moving humanlike artifact could be perceived as being more uncanny than the corresponding still humanlike artifact. Such a difference may be modeled in a number of different ways, but perhaps the simplest method is to regard a moving artifact as providing clearer information about its category membership, so that the distributions associated with a moving target category are sharper (i.e. have lower variance) than those for a still target category. The output of the model for such a situation is shown in Fig. 4. With all of the other parameters held constant, a decrease in the variance for the target category leads to higher values of affinity on either side of the category boundary and to a deeper negative-going dip, precisely as predicted by Mori.