Positivity bias effect

In the first experiment, 420 Mechanical Turk participants were presented with 96 photographs modified in one of the following ways: (1) photographs were kept in their Original format (400 × 400 pixels); (2) photographs were Blurred by applying a Gaussian filter with a 15-pixel radius; (3) photographs had only One-third (the left side) of the faces visible; and (4) photographs were reduced to a Small size (50 × 50 pixels; see Fig. 1A).

Figure 1 Examples of stimuli used in Experiments 1 and 2. Examples of the four manipulations used in Experiment 1 (A: Original, Blurred, One-third, and Small versions) and Experiment 2 (B: Original, Incomplete, Half, and Mirror-reversed versions). To satisfy the copyright policies of the journal, in this illustration we use an artificially generated face from the website https://www.thispersondoesnotexist.com, which uses generative adversarial networks or GANs (credited to Nvidia Corporation). However, in the experiments, we used real human faces from the website https://www.facity.com.

All participants were presented with the same 96 faces, but they were randomly assigned to one of the four modifications. In each condition, for each photograph, participants were asked to judge how physically attractive, warm, and knowledgeable (always in this order) the people portrayed in the 96 photographs were. To give their answers, participants used a scale ranging from 1 (not at all) to 10 (very much). The response was self-paced and the mouse was used to indicate the corresponding number on the scale. At the end of the experiment, participants completed a mood scale, the Positive and Negative Affect Scale21.

For each of the 96 target-faces, the responses were aggregated across participants. To test whether there is a positivity bias across the different modifications (Small, Blurred, and One-third) and different judgements (attractiveness, warmness, and knowledgeableness), we conducted a repeated measures ANOVA, using the average ratings of the faces as the dependent variable, and the type of judgement (attractiveness vs. warmness vs. knowledgeableness) and the type of modification of the photograph (Original vs. One-third vs. Blurred vs. Small) as the two independent variables. The significant interaction found between the two independent variables, F(6, 90) = 55.07, p < 0.001, suggests that the three judgements were differently affected by the modification manipulation (see Table 1 for descriptive statistics).

Table 1 Descriptive statistics for all experiments.

Figure 2 illustrates the positivity bias found for the One-third condition. In this condition, incomplete faces were rated, on average, almost an entire point higher on the ten-point scale than their respective original versions (M difference = 0.92, p < 0.001). In the figure, we plotted the difference between the ratings in each of the three incomplete conditions and the ratings in the Original condition as our measure of attractiveness bias. The figure also shows that the bias is as large as two points on the scale in the strongest cases and non-existent in a handful of cases.

Figure 2 Positivity bias found in Experiment 1. The ratings for the Original faces (x axis) are plotted against the magnitude of the bias (y axis). Each dot represents one of the 96 faces.
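The bias measure described above (ratings aggregated across participants for each face, then the modified-condition mean minus the Original mean) can be sketched as follows. This is only an illustration: the data layout, the function names `face_means` and `positivity_bias`, and the toy ratings are assumptions introduced here, not material from the experiments.

```python
from statistics import mean

def face_means(ratings):
    """Average the ratings across participants for each face.

    `ratings` maps a face id to the list of ratings it received
    (a hypothetical layout chosen for this sketch).
    """
    return {face: mean(values) for face, values in ratings.items()}

def positivity_bias(modified, original):
    """Per-face bias: mean rating in a modified condition minus the
    mean rating of the same face in the Original condition."""
    mod_means = face_means(modified)
    orig_means = face_means(original)
    return {face: mod_means[face] - orig_means[face] for face in orig_means}

# Toy ratings on the 1-10 scale, invented for illustration only.
original = {"face_01": [4, 5, 6], "face_02": [7, 7, 8]}
one_third = {"face_01": [6, 6, 6], "face_02": [7, 8, 7]}

bias = positivity_bias(one_third, original)
for face, value in bias.items():
    print(face, round(value, 2))
```

A positive value means the face was rated higher in the modified condition than in its Original version; plotting these values against the Original means gives a scatter of the kind shown in Figure 2.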

Participants also rated faces as less attractive in the Original condition than in the Small condition, M difference = 0.25, p < 0.001, or in the Blurred condition, M difference = 0.46, p < 0.001. Among the three conditions, the One-third condition led to the largest positivity bias and the Small modification led to the smallest bias.

For warmness and knowledgeableness judgements, a negativity bias was found in the Small and Blurred conditions, since the ratings were larger for the Original faces than for the Small faces (warmness: M difference = −0.18, p < 0.001; knowledgeableness: M difference = −0.10, p = 0.008) or the Blurred faces (warmness: M difference = −0.10, p = 0.024; knowledgeableness: M difference = −0.18, p < 0.001). However, the ratings in the One-third condition were larger than in the Original condition (warmness: M difference = 0.38, p < 0.001; knowledgeableness: M difference = 0.41, p < 0.001), meaning that the positivity bias found for attractiveness generalizes to warmness and knowledgeableness in this case.

Although we know, from previous studies, that men and women usually agree on attractiveness evaluations12, we asked whether the positivity bias is stronger for male or female participants and whether it is affected by the gender of the person being evaluated. To answer this question, we calculated the average attractiveness ratings provided by male and female participants to faces of women and men in the Original and One-third conditions. We found a small but significant interaction between the modification of the face and the gender of the participants, F(1,94) = 4.87, p = 0.03. This interaction suggests that male participants exhibit a slightly stronger positivity bias, M difference = 0.95, p < 0.001, than female participants, M difference = 0.87, p < 0.001. No effect of the gender of the face being evaluated was found.

We also compared the scores on the mood scale for the four conditions to ensure that the differences found are not due to differences in the participants’ mood. One could argue that the effect could be a consequence of participants in the incomplete conditions enjoying the task more, which could lead to more positive evaluations of the faces. Such an argument is consistent with the literature that shows hedonic states following interruptions or uncertain situations22,23.

Two mixed effects ANOVAs were conducted, with the modification as the independent variable and the scores on the Positive and the Negative Affect Scales as the two dependent variables. For the Positive Scale, there was no significant effect of the type of modification, F(3, 413) = 0.853, p = 0.466, and the same is true for the Negative Scale, F(3, 416) = 0.691, p = 0.588. This result suggests that there is no reason to believe that the incompleteness of the photographs led to differences in participants’ mood.

The results of this first experiment support our hypothesis that people are positively biased when judging other people’s facial attractiveness under information shortage. Yet, this first experiment has limitations. The Blurred and the Small versions are likely to lead to objectively more attractive faces since facial imperfections, such as pimples or wrinkles, are less visible. In Experiment 2 we try to overcome this limitation by creating a new incomplete version of the photographs in which groups of pixels are eliminated at random.

291 Mechanical Turk workers took part in the second experiment. To create the material for the new incomplete condition, we divided each original photograph (400 × 400 pixels) into 400 squares of 20 by 20 pixels each and randomly eliminated a set of 150 squares from the total of 400 squares (this modification will be called Incomplete from now on). This process was repeated 100 times for each face. Two other versions were created for this experiment: Half-faces (as opposed to the One-third from Experiment 1) and Mirror-reversed symmetric faces. For the Half-face condition, as the name indicates, we cut the faces into two halves. This was done by using the equidistant point between the eyes, the central axis of the nose, and the upper lip as references. Additionally, for each face, we used these halves to create symmetric faces by combining one half face with its mirror-reversed version (see Fig. 1B for an example).
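The stimulus construction described above can be illustrated with a minimal sketch. It assumes an image stored as a square grid of pixel values, a fill value of 0 for eliminated squares, and a vertical cut at the image centre for the symmetric version; the text does not state how eliminated squares were rendered, and the actual midline was located from facial landmarks. The names `make_incomplete` and `mirror_symmetric` are hypothetical.

```python
import random

def make_incomplete(image, block=20, n_removed=150, fill=0, rng=None):
    """Return a copy of `image` with `n_removed` randomly chosen
    block x block squares replaced by `fill` (assumed fill value)."""
    if rng is None:
        rng = random.Random()
    per_side = len(image) // block          # 400 // 20 = 20 squares per side
    cells = [(r, c) for r in range(per_side) for c in range(per_side)]
    removed = rng.sample(cells, n_removed)  # sampling without replacement
    out = [row[:] for row in image]
    for r, c in removed:
        for y in range(r * block, (r + 1) * block):
            for x in range(c * block, (c + 1) * block):
                out[y][x] = fill
    return out

def mirror_symmetric(image):
    """Combine the left half of each row with its mirror-reversed copy.

    Assumes the face midline coincides with the image centre.
    """
    half = len(image[0]) // 2
    return [row[:half] + row[:half][::-1] for row in image]

# Placeholder 400 x 400 "photograph" with every pixel set to 255.
photo = [[255] * 400 for _ in range(400)]

# Three versions here to keep the example light; the text describes
# 100 random versions per face.
versions = [make_incomplete(photo, rng=random.Random(seed)) for seed in range(3)]
symmetric = mirror_symmetric(photo)
```

Seeding one generator per version keeps the example reproducible; drawing one stored version at random for each participant and face would then be a single `random.choice` call over the list.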

Participants were assigned to one of four conditions: Original, Half, Mirror-reversed, and Incomplete. In the Incomplete condition, for each face (and individually for each participant), an incomplete version of the face was drawn at random from the set of 100 different incomplete versions. This procedure ensures that the obtained results are not an artifact of occluding a specific facial feature in the incomplete version, because the features shown or hidden vary at random across participants. This time, participants made only attractiveness judgements and, for that, they used a scale ranging from zero (very unattractive) to 100 (very attractive).

Again, we conducted a repeated measures ANOVA to test for differences across the multiple conditions (Original vs. Incomplete vs. Half vs. Mirror-reversed). We found a main effect of modification, F(3, 93) = 243.17, p < 0.001, meaning the attractiveness ratings varied significantly across conditions. The Original faces received lower ratings than Half-faces, M difference = 2.05, p < 0.001, and Incomplete faces, M difference = 2.91, p < 0.001, meaning the positivity bias was replicated for these new incomplete conditions. Perfectly symmetric faces, on the other hand, received ratings that were significantly lower than their Original (M difference = −10.12, p < 0.001; see Fig. 3A) and their Half-face counterparts (M difference = −12.65, p < 0.001). The fact that participants rated perfectly symmetric faces and half-faces differently suggests that the process taking place in the Half-face condition is probably not based on inferring perfect symmetry (inferring the missing half from the half provided; see Table 2 for means and standard deviations).

Figure 3 Examples of the stimuli and the manipulations in Experiment 7.

Table 2 Description of the sample in each experiment.

Specific to human faces

In the third experiment we used photographs of dog faces, flowers, and landscapes to test whether the positivity bias observed in Experiments 1 and 2 is also observed in these categories or whether it is specific to human faces. Dog faces are especially relevant because they are structurally similar to human faces in the sense that they have similar elements (eyes, nose, and mouth).

We had 28 photographs for each of the three categories (dogs, flowers, and landscapes) and we also generated 100 incomplete versions for each photograph through a procedure equivalent to the one used in Experiment 2. Dog faces were collected from Google using the key words: “dog faces on white background”. The landscapes and flowers were collected from the McGill Calibrated Color Image Database24. The photos were then cropped to preserve only the area of interest (the face for the dogs and the flower for the plants). The photographs were centered and resized to 350 by 350 pixels.

207 Mechanical Turk participants were assigned to one of two conditions: Original or Incomplete photographs. For dog faces, participants were asked “how cute is the dog?”, for flowers “how beautiful is the flower?”, and for landscapes “how attractive is the scenery?”. All participants rated the dogs, the flowers, and the landscapes, in blocks. The orders of the blocks and the photographs within each block were randomized for each participant. To give their answers, participants rated the photographs on a scale from zero (not at all) to 100 (very much).

A mixed effects ANOVA revealed an interaction between the category of the stimulus and the modification, F(2, 81) = 3.73, p = 0.028, indicating that the bias was different for the three categories. For dog faces, the ratings given to the Incomplete photographs were lower than the ratings given to the Original photographs (M difference = −4.02, p < 0.001) and a similar negativity bias was detected for flowers (M difference = −3.80, p < 0.001). No bias was found for landscapes (M difference = −1.18, p = 0.151). These results show that the positivity bias found for human faces does not generalize to dog faces, landscapes, or flowers. This result also agrees with past research, including Sears’s seminal paper9 about person-positivity bias, where the author argues that stimuli are evaluated more favorably the more they resemble individual human beings.

Sensitivity to expectation

In the fourth experiment we measure whether the positivity bias is sensitive to the perceiver’s expectation regarding the target-faces that are being evaluated. If the positivity bias occurs due to positive expectations in the incomplete condition, then telling participants that other participants evaluated the target-faces as highly attractive should strengthen the positive expectations for incomplete photographs and increase the effect. Similarly, telling participants that the target-faces were previously evaluated by others as less attractive should decrease the use of positive expectations and thus disrupt the effect.

424 Mechanical Turk participants evaluated photographs either in the Original or the Incomplete condition (with the random elimination of pixels as described in Experiment 2). The expectation manipulation consisted of three levels: High-Expectation, No-Expectation, and Low-Expectation. In the No-Expectation condition, no information was given regarding the beauty of the target. In the other two conditions, participants were told that only faces rated as above average (or below average) by other workers would be presented to them. Participants were assigned to one of six conditions: Incomplete or Original faces, with high, low, or no expectations.

A repeated measures ANOVA showed a significant effect of expectation, F(2, 94) = 484.01, p < 0.001. Ratings were higher overall in the High-Expectation condition (M = 49.01, SD = 11.36) than in the No-Expectation condition (M = 44.74, SD = 10.94), M difference = 4.27, p < 0.001, and higher in the No-Expectation condition than in the Low-Expectation condition (M = 44.00, SD = 10.94), M difference = 0.74, p < 0.001. These results suggest that participants’ judgements were sensitive to the expectation manipulation. The positivity bias was also replicated in this experiment. It was strongest in the No-Expectation condition, M difference = 7.08, p < 0.001, reduced in the High-Expectation condition, M difference = 3.00, p < 0.001, and reduced even further in the Low-Expectation condition, M difference = 1.97, p < 0.001. The differences in positivity bias across conditions were also significant (M difference between No-Expectation and High-Expectation = 4.09, p < 0.001, and M difference between High-Expectation and Low-Expectation = 1.03, p = 0.013).

These results show that positive expectations, while increasing the overall evaluations of the faces, do not increase the bias; instead, they decrease it. Low expectations also did not eliminate the effect, but only reduced it. Hence, we conclude that expectations are not the main mechanism underlying the positivity bias.

This procedure of priming expectations also reduces the ambiguity that is experienced by participants in the incomplete condition, which might have contributed to the reduction of the bias. Reducing ambiguity is expected to reduce the effect (i.e., the difference between the Incomplete and the Original faces) through a recalibration of the ratings towards the expectation induced. Our rationale is that in the No-Expectation condition, no external information is given about the attractiveness of the targets and thus the magnitude of the bias can be freely expressed in participants’ evaluations. In other words, expectations restricted the amplitude within which the cognitive bias is operating.

Ruling out similarity

In the fifth experiment we test whether similarity to the self could be the mechanism underlying the positivity bias. Similarity has been shown to account for positivity biases towards others in some contexts; such is the case of the research conducted by Sears9 and Norton et al.8. When the information about a target is ambiguous or incomplete, people erroneously perceive the targets as more similar to themselves, causing an increase in liking. If a similar mechanism is happening in the condition with incomplete faces, then we should observe higher ratings of perceived similarity in the incomplete than in the original photographs.

223 Mechanical Turk participants evaluated the 96 faces after being assigned to one of two conditions: Original or Incomplete. For each photograph they were instructed to indicate how similar the person’s face was to their own. To give their answers, participants rated the photographs on a scale from zero (not similar at all) to 100 (very similar).

The similarity ratings for faces in the incomplete condition (M = 34.06, SD = 2.75) were not significantly different from the ratings of the original photographs (M = 33.80, SD = 4.14), t(95) = 1.11, p = 0.271. Although this conclusion is based on a null effect, the result suggests that the two conditions do not vary in how similar participants rate the targets to the self.
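The 95 degrees of freedom reported above are what a paired comparison across the 96 faces would give (n − 1), so the test can be sketched as a paired-samples t statistic on per-face means. That reading is an assumption, and the function name `paired_t` and the toy values below are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom.

    `x` and `y` hold one mean rating per face, in the same face order.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1

# Toy per-face similarity means (0-100 scale), invented for illustration.
incomplete = [34, 36, 31, 35]
original = [33, 35, 32, 33]

t_value, df = paired_t(incomplete, original)
print(f"t({df}) = {t_value:.2f}")
```

With 96 faces in each list the same function would return 95 degrees of freedom; the p value would then come from the t distribution with that many degrees of freedom.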

The role of typicality

When presented with incomplete information, people infer the missing pieces based on a combination of contextual inputs and knowledge from similar past experiences. When reconstructing information regarding an acquaintance, people can fill in the blanks with memories of past interactions with that person. But how do people fill in the missing information of a stranger that they meet for the first time? In such situations, the inference will rely on a more general visual representation. One possibility is that this representation is a typical face that people have stored in their memories as a result of their extensive exposure to human faces. If that is the case, since average/typical faces are perceived to be more attractive25,26, the resulting inference will reflect a positivity bias (the incomplete faces will be perceived as more attractive than the complete faces).

If typicality does play a role in the positivity bias, then the magnitude of the positivity bias (i.e., the differences in the attractiveness ratings between original and incomplete photographs) is expected to be larger for atypical faces, since they are being completed based on a more attractive typical internal representation, than for incomplete typical faces, for which the rating will be more similar to attractiveness ratings attributed to the original versions. In other words, by completing the missing information of the incomplete atypical faces based on a prototypical representation, participants are sourcing elements from a face that is known to be on average more attractive. Thus, in the sixth experiment we explore the role of typicality in the positivity bias.

145 Mechanical Turk participants were asked to rate the typicality/distinctiveness of the 96 original photographs used in the previous experiments. The photographs were paired with the question “How much does this face deviate from a typical face?” Participants provided their answer on a scale from zero (does not deviate at all) to 100 (deviates very much). Lower ratings on this scale mean the face is considered more typical.

These ratings were then used to investigate the positivity bias in typical versus atypical faces, which we did by comparing the perceived attractiveness of the original versus the incomplete faces given their typicality level.

We used the median of the distinctiveness ratings to split the faces into two groups: typical and atypical. These groups were used as an independent variable together with the modification (original versus incomplete photograph) and the experiment (Experiments 2 and 4) in a mixed effects ANOVA. The dependent variable was the attractiveness ratings of the 96 target photographs. In this analysis, we used the attractiveness ratings of the original and incomplete faces from Experiments 2 and 4. These were experiments with similar design and identical modification of the photographs (from Experiment 2 only the original and the incomplete conditions were used and from Experiment 4 only the no-expectation condition was included in the analysis).

A significant effect of modification was found, F(1,94) = 130.869, p < 0.001, with higher attractiveness ratings for the incomplete (M = 50.07, SD = 10.46) than for the original photographs (M = 46.35, SD = 11.36). This result replicates the patterns found in previous experiments. A strong interaction between the modification of the faces and the typicality variable was also observed, F(1, 94) = 23.00, p < 0.001. As expected, a larger positivity bias was found for the atypical faces (M difference = 5.416, p < 0.001) than for typical faces (M difference = 2.216, p < 0.001). We also conducted a partial correlation between typicality and the attractiveness of the incomplete faces while controlling for the attractiveness of the original faces. A significant moderate correlation was found, r(93) = −0.407, n = 96, p < 0.001. These results are indicative of the role of typicality in the positivity bias effect.
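The median split and the partial correlation used in this analysis can be sketched with the standard first-order partial correlation formula. This is an illustration only: the handling of ties at the median, the function names, and the toy values are assumptions introduced here.

```python
from math import sqrt
from statistics import mean, median

def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y after controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def median_split(distinctiveness):
    """Label each face as typical or atypical around the median rating.

    Faces exactly at the median are labelled typical (an assumption).
    """
    cut = median(distinctiveness)
    return ["typical" if d <= cut else "atypical" for d in distinctiveness]

# Toy per-face values, invented for illustration only.
distinctiveness = [10, 20, 30, 40, 50, 60]       # higher = less typical
attract_original = [50, 45, 55, 40, 48, 42]
attract_incomplete = [52, 49, 57, 46, 55, 50]

groups = median_split(distinctiveness)
r_partial = partial_corr(distinctiveness, attract_incomplete, attract_original)
print(groups, round(r_partial, 3))
```

With one control variable the partial correlation has n − 3 degrees of freedom, which matches the r(93) reported for the 96 faces.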

Disrupting the positivity effect

In the seventh and last experiment, we test whether the positivity bias can be disrupted. There is evidence in the literature that judgments of facial attractiveness rely on holistic representations of human faces27. Thus, we hypothesized that the positivity bias found for attractiveness judgements of incomplete faces will also depend on holistic processing. If this is true, then we should be able to disrupt the positivity bias by disrupting the holistic processing of faces. Inverting faces (presenting them upside-down) has been shown to disrupt holistic processing28,29 (but see30), so we created conditions with inverted faces to test this hypothesis. Moreover, disrupting the holistic processing is known to affect other types of face processing tasks such as face recognition31, race categorization32, and emotional expression recognition33, among others. One possibility is that, by disrupting the holistic processing of the target-faces, participants are less successful in using the typical face to fill in the missing information, and as such, the positivity effect will no longer be observed. In agreement with this hypothesis, judgements of distinctiveness or typicality were shown to be highly affected when the faces are inverted34. On the same note, Diamond and Carey35 proposed in 1989 that with experience, people develop fine-tuned prototypes of faces (or any other stimuli as long as a certain level of expertise is reached) that help them to encode configurational information in faces. If that is the case, then inverting the faces might disrupt the use of this prototypical spatial configuration.

422 Mechanical Turk participants took part in this experiment. The material was the same as in the previous experiment (Original and randomly generated Incomplete versions) plus four additional versions of the 96 faces: photographs rotated 90 degrees clockwise and their corresponding Incomplete versions (100 randomly incomplete photographs for each rotated face), and 96 Inverted photographs (180 degrees rotation) and their corresponding Incomplete versions (see Fig. 3B for an example).
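The two orientation manipulations can be illustrated with a minimal sketch on a grid of pixel values. The function names and the tiny grid are hypothetical, and the sketch does not reproduce the image software actually used.

```python
def rotate_90_clockwise(image):
    """Rotate a grid of pixel rows 90 degrees clockwise."""
    # Reversing the row order and transposing turns the top row
    # into the rightmost column.
    return [list(row) for row in zip(*image[::-1])]

def rotate_180(image):
    """Invert a grid of pixel rows (180-degree rotation)."""
    return [row[::-1] for row in image[::-1]]

# Tiny 2 x 3 grid standing in for a photograph (illustrative only).
upright = [[1, 2, 3],
           [4, 5, 6]]

rotated = rotate_90_clockwise(upright)
inverted = rotate_180(upright)
```

Applying the clockwise rotation twice gives the same result as the 180-degree rotation, which is a quick way to check that the two helpers agree.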

Participants judged the attractiveness of the faces on a scale from zero (not attractive at all) to 100 (very attractive). The two independent variables in this experiment were the modification with two levels (Original vs. Incomplete) and the orientation of the photographs with three levels (Upright vs. 90-degree-rotated vs. Inverted).

The interaction found between modification and rotation, F(2, 94) = 36.45, p = 0.028, reflects the presence of the positivity bias for the Upright photographs (M difference = 3.54, p < 0.001), and the lack of bias for the 90-degree rotated (M difference = 0.31, p = 0.511) and Inverted photographs (M difference = −0.02, p = 0.968; see Fig. 4).

Figure 4 Positivity bias in Experiment 7. The positivity bias in the Upright condition (A) and the absence of the bias in the Inverted condition (B). The histograms correspond to the differences between the incomplete and the original versions in Experiment 7.

This experiment shows that inverting the faces disrupts the positivity bias, which supports our hypothesis that in the inverted condition the typical face is less likely to be used to fill in the missing information.