From museum visitors feeling compelled to touch statues that they can see, to the biblical account of the incredulous Thomas who would not accept that Jesus was alive unless he could touch him, tactile ‘fact-checking’ is frequent. Similarly, in the clinical domain, the empirical literature shows that individuals with obsessive-compulsive disorder are prone to check things by touch rather than sight1,2. Among other factors underlying these complex behaviours, we suggest that the privilege of touch might come from it carrying more evidential weight than seeing, particularly when there is ambiguity3. To test this hypothesis, we compared the confidence that observers put in their perceptual decisions after either seeing or touching stimuli that gave rise to a geometric illusion known as the Vertical-Horizontal (VH) illusion (Figs 1, 2a,b). This illusion is known to produce similar perceptual effects in the visual and in the tactile domains4,5,6.

Figure 1 The Vertical-Horizontal illusion. (a–c) Inverted-T stimuli were explored by touch and by vision. The vertical bar ranged from 18 to 34 mm and was compared with a horizontal bar of fixed length (30 mm). Stimuli could be clear-cut (a,c) or ambiguous (b) in the two modalities. Perceptual responses were verbal reports as to whether the vertical bar was ‘shorter’ or ‘longer’ than the horizontal bar. Confidence judgements were given on a scale from very uncertain (1) to very certain (7). (a) Probability of judging the vertical bar to be longer than the horizontal bar as a function of the size of the vertical bar. Error bars show within-participant standard errors. (b) Point of subjective equality (PSE) for each participant in each modality.

Figure 2 Stimuli and hypotheses (a) Testing conditions. (b) Stimulus designed to be clearly seen and felt. Hypothetical confidence profiles as a function of the difference from the PSE. (c) Possible confidence profiles: Common currency, global overconfidence in touch, global overconfidence in vision, greater confidence in touch under ambiguity, greater confidence in vision under ambiguity.

The belief that touch provides more certainty than other senses, especially vision, has a solid historical background, but to our knowledge has not been directly tested, except with affordances7. Descartes, a sceptic toward all sensory evidence, highlighted that “Of all our senses, touch is the one considered least deceptive and the most secure”8 while Johnson, in response to Berkeley’s immaterialism9, considered that touch demonstrated the existence of an external world in a way that no other sense would (see also de Condillac for a similar claim10). The idea is that touch, more than vision, provides evidence for the reality of external objects11,12 and conveys a higher sense of directness and certainty13,14.

When it comes to providing evidence about certain features rather than the existence of objects, however, it is implausible that touch provides more objective or more accurate information than vision, since the relative accuracy of the two modalities depends critically on the task and on the context. There is a more sensible way of understanding the superiority of touch in this context: for equal accuracy, people might place more confidence in a decision reached by touch than in one reached by vision. This hypothesis is congruent with evidence that people are more likely to purchase an item if they can touch it than if they simply look at it15,16, and that some people are anxious when interacting with graphical user interfaces that display objects that cannot be touched17,18.

The study of confidence falls within the field of metacognition, i.e. how the cognitive system assesses and monitors its own states19,20,21. Studies comparing perceptual confidence across sensory modalities have been conducted previously, but for tasks in which the modalities could not be directly compared, i.e. brightness versus pitch discrimination22 or orientation versus pitch discrimination. According to a widespread account of perceptual metacognition, a central purpose of explicit judgements of confidence is to allow the reliability of perception across different decisions to be compared, and appropriate trust to be placed in each percept accordingly.

The idea that confidence operates as a common currency at a given time across sensory modalities such as vision and audition has been directly tested by showing that people can determine which of two decisions should be trusted more, both within the same modality and across different modalities23,24. By extension, the common currency model may extend through different decisional times by assuming that performance is mapped onto confidence in an identical manner across modalities. Decisions reached through different channels or in different contexts could then be compared. An ideal observer should then be able to decide which of two independent decisions to trust more, based on these comparable confidence ratings. It is this common mapping assumption that we tested across touch and vision.

If an observer makes correct decisions about three times out of four, both in vision and touch, and if confidence follows a common mapping, we would expect the observer to report the same confidence in her decisions, no matter what modality is used to arrive at a judgement, so long as the probable correctness of her decisions remains the same. If she were correct only two times out of four when relying on touch, and three out of four when relying on vision, she should report lower confidence for touch than for vision. These two ratings would mean that she should choose to rely on vision, rather than touch. In other words, for confidence to be comparable between decisions, observers’ confidence ratings are expected to track the probability of their response being correct for a given stimulus, regardless of the modality used: they should express similar confidence when making decisions similarly likely to be correct, and different confidence when differently likely to be correct. Though individuals differ in their mappings from correctness to confidence, both in bias and sensitivity, a common mapping is indeed observed in the same individual across independent tasks and comparable sensitivities22. It is then likely to subserve the use of confidence as a common currency in direct comparisons.
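The worked example above can be sketched in a few lines of Python. This is only an illustration: it assumes a linear mapping from the probability of being correct (from chance, 0.5, up to 1.0) onto the 1 to 7 confidence scale, and the function names are hypothetical rather than taken from the study.

```python
def confidence_from_accuracy(p_correct):
    """Hypothetical linear mapping from probability correct (0.5..1.0) to a 1-7 rating."""
    if not 0.5 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie between chance (0.5) and 1.0")
    return 1.0 + 12.0 * (p_correct - 0.5)


def modality_to_trust(p_correct_by_modality):
    """Return the modality with the highest confidence, or None when ratings are tied."""
    ratings = {m: confidence_from_accuracy(p) for m, p in p_correct_by_modality.items()}
    best = max(ratings.values())
    winners = [m for m, r in ratings.items() if r == best]
    return winners[0] if len(winners) == 1 else None


# Equal accuracy in both modalities: same confidence, no modality preferred.
equal = modality_to_trust({"vision": 0.75, "touch": 0.75})

# Touch at chance (two out of four), vision at three out of four: rely on vision.
unequal = modality_to_trust({"vision": 0.75, "touch": 0.50})
```

Because the same function is applied to both modalities, equal accuracy necessarily yields equal confidence. A modality-specific offset or slope in this mapping would correspond to the kinds of departure sketched in Fig. 2c.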

There are several ways in which the behaviour of observers could depart from these assumptions (Fig. 2c). They could simply be over- or under-confident in one modality, causing them to rely on the corresponding sense more or less than they ought to. Alternatively, metacognitive sensitivity could differ between modalities as a result of differences in the mapping between perceptual accuracy (probability of being correct) and reported confidence.

Until now, few studies have compared perceptual metacognition in a task that can be performed with different sensory modalities25. While Fitzpatrick and colleagues looked at whether vision and touch would provide similar confidence in perceiving affordances for action, the task did not offer a fair comparison between the two modalities, as a tool needed to be used for haptic exploration7. Furthermore, the perception of possibilities for action did not allow for an assessment of how confidence tracked accuracy. Here we chose to focus on size estimation, which provides a more straightforward domain in which to assess possible differences between tactile and visual confidence. We frequently use both vision and touch to estimate and compare the size of objects, and both senses are well suited to the task. We focus on cases where the observer must decide between two options when sensory evidence is ambiguous (equivocal between the two options). We investigated how size estimation in both vision and touch was affected by the robust Vertical-Horizontal illusion4,26,27,28, in which a vertical bar appears to be longer than an adjoining horizontal bar of the same length (see Fig. 1a–c). Studying metacognition across modalities using this task provides an intriguing opportunity to explore how confidence relates to observers’ subjective representations of the stimuli they perceive, and whether it follows a common mapping across two distinct tasks.

Observers explored the stimuli by vision and by touch in two separate testing blocks (Fig. 2a). In each case, they reported whether the vertical bar seemed to be longer or shorter than the horizontal bar. They were then asked to report how confident they were of their choice. Some stimuli were close to the point of subjective equality (PSE), such that the two bars appeared to be of the same size. This point corresponds to the case, discussed above, where there is equivocal evidence for both responses. Across both modalities, this was the case when the horizontal bar was approximately 25% longer than the vertical29,30. Accounting for this bias, our stimulus set varied the objective size of the vertical bar to include clear-cut cases where it was easy to determine whether the vertical bar was shorter or longer than the horizontal one, even under the influence of the illusion. The stimulus set also included ambiguous cases where the illusion caused bars of different sizes to appear to have similar sizes (see Fig. 1). We expected that perceptual decisions in ambiguous cases, i.e. closer to the PSE, would be associated with lower confidence ratings for both modalities.
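The arithmetic behind the expected PSE can be made explicit with a minimal sketch. The 30 mm horizontal bar and the 18 to 34 mm range come from the Fig. 1 caption, and the 25% figure from the text; the 2 mm step between stimuli and the 2 mm tolerance used to label a stimulus as ambiguous are assumptions made only for illustration, as are the function names.

```python
HORIZONTAL_MM = 30.0   # fixed horizontal bar (Fig. 1 caption)
ILLUSION_RATIO = 1.25  # horizontal bar about 25% longer than the vertical at the PSE


def expected_pse_mm(horizontal_mm=HORIZONTAL_MM, ratio=ILLUSION_RATIO):
    """Vertical length at which the two bars should appear equal."""
    return horizontal_mm / ratio


def distance_from_pse(vertical_mm, pse_mm):
    """Signed distance of a vertical bar from the PSE (positive: looks longer)."""
    return vertical_mm - pse_mm


def is_ambiguous(vertical_mm, pse_mm, tolerance_mm=2.0):
    """Label a stimulus ambiguous when it lies within an assumed tolerance of the PSE."""
    return abs(distance_from_pse(vertical_mm, pse_mm)) <= tolerance_mm


pse = expected_pse_mm()

# Assumed stimulus set: 18 to 34 mm in 2 mm steps (the step size is hypothetical).
stimuli = list(range(18, 35, 2))
ambiguous = [v for v in stimuli if is_ambiguous(v, pse)]
clear_cut = [v for v in stimuli if not is_ambiguous(v, pse)]
```

Under these assumptions the expected PSE is a vertical bar of 24 mm, so a vertical bar physically equal to the horizontal one (30 mm) sits 6 mm above the PSE and should be judged clearly longer. Individual PSEs will of course differ from this nominal value.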

By pitting the two senses against one another in the fairest conditions possible over a range of ambiguous and clear-cut cases, this experiment could distinguish between possible types of modality-related biases, sketched in Fig. 2c. If the common mapping assumption is correct, confidence should track perceived stimulus ambiguity in the same way across modalities. Alternatively, greater confidence might be placed in one or the other modality overall or, selectively, in one or the other modality under situations of certainty or uncertainty, suggesting a need to qualify the idea of a similar mapping. This in turn has broader implications for the generalization of common currency accounts, and for whether confidence is assessed according to a similar and consistent metric across modalities, either when the decisions are compared immediately at the time of performance, or more generally for later purposes.