Detection and avoidance of sick individuals have been proposed as essential components in a behavioural defence against disease, limiting the risk of contamination. However, almost no knowledge exists on whether humans can detect sick individuals, and if so by what cues. Here, we demonstrate that untrained people can identify sick individuals above chance level by looking at facial photos taken 2 h after injection with a bacterial stimulus inducing an immune response (2.0 ng kg−1 lipopolysaccharide) or placebo, the global sensitivity index being d′ = 0.405. Signal detection analysis (receiver operating characteristic curve area) showed an area of 0.62 (95% confidence intervals 0.60–0.63). Acutely sick people were rated by naive observers as having paler lips and skin, a more swollen face, droopier corners of the mouth, more hanging eyelids, redder eyes, and less glossy and less patchy skin, as well as appearing more tired. Our findings suggest that facial cues associated with the skin, mouth and eyes can aid in the detection of acutely sick and potentially contagious people.

1. Introduction

The detection of sick individuals has been proposed to serve as a first line of defence allowing individuals to avoid being contaminated by sick peers [1]. The central premise of disease avoidance is that infections pose a major threat to individuals [2], and that humankind and other animals, through evolution, have developed an ability to limit contagion by avoiding potentially sick peers [1].

While immune cells have evolved to detect molecular patterns of invading pathogens, avoidance behaviours need to be guided by an ability to detect pathogenic carriers via perceptual cues. In animals, chemosensory-guided avoidance of sick conspecifics is well established, and trained animals can detect infections like Clostridium difficile in humans with high sensitivity and specificity [3]. Recent data suggest that humans may possess similar abilities, because acute inflammation-induced sickness is associated with more aversive and less ‘healthy’ body odour [4], and altered gait patterns [5]. While it has been shown that humans exposed to people exhibiting overt sickness behaviours, such as coughing, respond with disgust, anxiety and even a more reactive immune system [6,7], less is known of how well we can detect sickness from observing human faces, and the facial cues that characterize a sick person.

The human face conveys abundant cues about individuals and is a primary source of information in human communication [8,9]. Symmetry, facial adiposity and colouration have been proposed as important factors in how we judge health in other individuals [10]. Accordingly, the faces of acutely sick people are lighter and less red [11], and judged as less healthy [12]. It has also been indicated that people can detect patients with HIV or non-symptomatic herpes on an online dating website at levels above chance [13]. This study was based on self-selected photos of subjects who had identified themselves as having/not having HIV and/or herpes on a dating website. Considering that dating sites are affected by deceptive self-presentations [14], and some of the subjects classified as healthy may actually have been sick, these results may not be generalizable. However, the study still gives an indication that we may be able to detect people with a chronic disease better than chance levels from photos alone. Taken together, there is a lack of well-controlled studies determining how well we can detect acutely sick people, and which cues guide us in this detection.

Our aim was to determine whether it is possible to identify experimentally induced sick people based on facial pictures and to specify cues that contribute to this identification.

2. Material and methods

(a) Acquisition of photos

The protocol of experimentally induced sickness has been described before [15]. Briefly, 22 healthy volunteers (mean age: 23 ± 4; 9 women, all Caucasian) were included in the protocol. All were free of any physiological or psychiatric condition and of medication (except contraceptive pills), and all were non-smokers, non-excessive alcohol consumers, non-obese and aged between 19 and 34. Participants were recruited by advertising and screened through questionnaires and a health examination by a physician. This was a double-blind, randomized, placebo-controlled study with a cross-over design. The number of subjects was based on previous lipopolysaccharide (LPS) studies for testing main effects of sickness and to provide photos (and other samples) that could be used for additional studies. Subjects received either an injection of LPS (Escherichia coli endotoxin, Lot HOK354, CAT number 1235503, United States Pharmacopeia, Rockville, MD, USA) at a dose of 2 ng kg−1 of body weight, or of placebo (0.9% NaCl) on two different occasions, separated by three to four weeks. LPS injection causes a transient and distinct systemic inflammatory response and sickness behaviour [16].

Facial photographs were taken about 2 h after injection (in both conditions), with a Nikon D90 (Nikon Corp., Tokyo, Japan) at a resolution of 4288 × 2848, using highly standardized procedures and a studio set-up. This time point coincided with a strong increase in inflammatory response, here shown in levels of circulating interleukin (IL)-6, and subjective sickness as measured by the Sickness Questionnaire [17] in the LPS condition (figure 1). Participants wore a white T-shirt and no make-up, and had their hair pulled back from their face. They were seated on a stool in front of a white background, asked to sit comfortably, to look straight into the camera with a neutral expression and to relax their face. In each condition, five to six photos were taken.

Figure 1. Illustrations of the timing of when the photos were taken (2 h and 10 min post injection) and (a) mean circulating concentrations of IL-6 and (b) development of sickness (subjective sickness rated on the sickness questionnaire, SQ), after injections with placebo and LPS. All 16 subjects participated in both conditions, and the dotted grey lines show their raw data in the LPS condition.

Of the photos taken during each condition, the three best photos of each participant were chosen based on their quality (e.g. removing photos when subjects were not sitting straight or had their eyes closed) and similarity to the other photos of the respective participant taken at the same time. In other words, the three most representative photos were chosen. Photos from six subjects were excluded due to large differences in facial hair (1) or hairdo (2) between the two time points, excessive hair falling into their face (2) or problems with the flash lighting (1). From the photos of the remaining 16 participants (8 women), a graduate student, blind to the procedure and purpose of the study, chose for each condition the one photo of the three that was the most representative, in terms of its similarity to the other two photos, following a protocol described earlier [18]. This resulted in 32 photos, two of each participant, one from each condition (i.e. photos of 16 individuals when ‘healthy’ and ‘acutely sick’). Photos of subjects from the same study have been used to analyse neural correlates when exposed to multimodal stimuli of sickness [12], and how skin colour (measured with spectrophotometry) changes during acute sickness [11].

(b) Identification of sick individuals

The first rating session included 62 naive observers (31 women, 31 men, mean age 25.5 years, s.d. = 8.8). The subjects did not fill out information about ethnicity, but the large majority were Caucasian. The subjects were recruited at two large universities in the Stockholm area. They rated the 32 photos on whether the person in the photo was sick or healthy in a forced-choice procedure with each photo being shown along with the question ‘Is this person sick or healthy?’ with the response options sick and healthy. The rating procedure was programmed with E-Prime Professional v. 2.0 (Psychology Software Tools, Inc.). Photos were shown for a maximum of 5 s in a pseudo-randomized order, with the restriction that no individual participant was shown twice in a row. Each photo was rated at least once, with a total of 48 ratings per rater. The sample size was, based on effect sizes found in earlier studies [18,19], predetermined to include 60 people or more (resulting in 2880 ratings if no data loss), giving a high power to detect even small effect sizes.

Assessment of accuracy in detection was based on signal detection theory and the receiver operating characteristic (ROC) curve analyses in Stata 12.1. The ROC curve area describes how well one can discriminate between sick and healthy, where 1 would represent a perfect discrimination and 0.5 complete randomness. We also calculated the sensitivity index (d′) from the probability of true sickness detections (t) and false alarms (f) using the inverse cumulative normal distribution.
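The sensitivity index described above can be sketched as follows. This is a minimal illustration, assuming the standard definition d′ = z(t) − z(f), where z is the inverse cumulative standard normal distribution; the function name `d_prime` and the example rates are hypothetical and are not the study's data. The text does not state whether d′ was computed from pooled ratings or per observer, so pooled rates need not reproduce the reported value.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index d' = z(t) - z(f).

    z is the inverse cumulative standard normal distribution.
    Both rates must lie strictly between 0 and 1; inv_cdf raises
    an error at exactly 0 or 1.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Illustrative rates only (assumed values, not taken from the study).
example = d_prime(hit_rate=0.60, false_alarm_rate=0.40)
print(f"d' = {example:.3f}")
```

A d′ of 0 corresponds to equal hit and false-alarm rates (no discrimination), and larger values indicate better separation of sick from healthy faces.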

(c) Relationships between facial cues and apparent sickness

In the second rating session, a separate group of 60 naive observers (38 women, 22 men, mean age 27.3 years, s.d. = 6.2) rated the photos with respect to health (scale from 1 ‘very poor’ to 7 ‘very good’), tiredness (scale from 1 ‘very alert’ to 7 ‘very tired’) and eight facial cues. As in the first rating session, the sample size was predetermined to include 60 people. No information about ethnicity was obtained, but the large majority were Caucasian. The subjects were recruited at two large universities in the Stockholm area. The rating procedure was programmed with E-Prime Professional 2.0. Furthermore, one sickness cue at a time was rated on 7-point Likert scales from 1 ‘no symptoms’ to 7 ‘very high symptoms’, in a randomized block order. The queries were phrased as follows: ‘How pale lips/pale skin/patchy skin/glossy skin/droopy corners of the mouth/red eyes does this person have?’ and ‘How hanging are the eyelids/swollen is the face of this person?’. For each cue, 48 ratings were made, at least one for each of the 32 photos, with 16 randomly selected photos being shown twice. Again, the same subject was never shown directly after himself or herself. The cues included were considered to characterize sickness by 18 researchers involved in studying sickness behaviour. The procedure was self-paced, with each photograph shown for a maximum of 5 s.
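The presentation constraint described above (48 trials made of 32 photos plus 16 random repeats, with the same photographed person never shown twice in a row) can be sketched as below. The actual E-Prime procedure is not described in the text, so the rejection-sampling approach, the function name `make_trial_order` and all parameter names are assumptions made for illustration only.

```python
import random


def make_trial_order(n_people=16, n_repeats=16, seed=None, max_tries=10_000):
    """Return a list of (person, condition) trials.

    Every photo appears at least once, n_repeats randomly chosen photos
    appear twice, and no person appears on two consecutive trials.
    Uses simple rejection sampling: reshuffle until the constraint holds.
    """
    rng = random.Random(seed)
    photos = [(p, c) for p in range(n_people) for c in ("LPS", "placebo")]
    trials = photos + rng.sample(photos, n_repeats)  # 32 + 16 = 48 trials
    for _ in range(max_tries):
        rng.shuffle(trials)
        # Accept only if neighbouring trials show different people.
        if all(a[0] != b[0] for a, b in zip(trials, trials[1:])):
            return list(trials)
    raise RuntimeError("no valid order found; increase max_tries")


order = make_trial_order(seed=1)
print(len(order), order[:3])
```

Rejection sampling is adequate here because a reasonable fraction of random shuffles already satisfies the constraint; a constructive algorithm would only be needed for much tighter restrictions.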

The first set of analyses considered how sickness affects facial cues (i.e. how they changed between LPS and the placebo conditions). The second set of analyses considered how each of the cues related to apparent sickness (i.e. what are the cues characterizing a sick appearance). Data were analysed using multilevel mixed-effects linear regression, with two crossed independent random effects accounting for random variation between observers (rating the photos) and participants (on the photos) using the xtmixed procedure in Stata 12.1. We also analysed two models describing how the included cues mediated the effects of LPS on apparent sickness, both being multiple mediation SEM models where the mediated effects were calculated using the product-of-coefficients method. In the first model, apparent sickness was regressed on the cues and on LPS condition, and the cues were regressed on LPS condition (figure 5). In the second model, apparent tiredness was included as a second-order mediator between the cues and apparent sickness (electronic supplementary material, figure S1). The size and 95% confidence intervals (CIs) of the indirect effects of LPS condition on apparent sickness via the cues were calculated with MPlus 7.3 software and these values were transformed to per cent of the total effect of LPS condition on apparent sickness. As an example, the effect of LPS treatment on pale skin was 0.301 and the effect of pale skin on apparent sickness, adjusting for the effects of LPS treatment and the other mediators, was 0.083. This gives an indirect effect of LPS treatment on apparent sickness via pale skin of 0.301 × 0.083 = 0.025 and as the total effect of LPS treatment on apparent sickness was 0.357, the degree of mediation is 100 × 0.025/0.357 = 7.00%.
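The worked example above can be reproduced with a few lines of arithmetic. This sketch only restates the product-of-coefficients calculation using the three values given in the text; the function and argument names are hypothetical, and it does not reproduce the SEM estimation or confidence intervals obtained in MPlus.

```python
def percent_mediated(effect_on_cue, cue_effect_on_outcome, total_effect):
    """Product-of-coefficients mediation.

    indirect effect = (treatment -> cue) * (cue -> outcome, adjusted)
    degree of mediation = 100 * indirect / total effect
    """
    indirect = effect_on_cue * cue_effect_on_outcome
    return indirect, 100 * indirect / total_effect


# Values from the pale-skin example in the text.
indirect, pct = percent_mediated(0.301, 0.083, 0.357)
print(f"indirect effect = {indirect:.3f}, degree of mediation = {pct:.2f}%")
# prints: indirect effect = 0.025, degree of mediation = 7.00%
```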

3. Results

(a) Detection of acutely sick individuals

The 62 raters gave 2945 ratings of sickness for the 32 different facial photos, of which 1215 (41%) were judged as being sick. Of these positive detections, 775 were true hits and 440 were false alarms, giving a sensitivity and a specificity for identifying sickness from photos of 52% and 70%, respectively; the global sensitivity index being d′ = 0.405. Signal detection analysis (ROC curve area) showed an area of 0.62 (95% CIs 0.60–0.63; 1.0 being a perfect discrimination and 0.5 being random). In addition, the raters could correctly discriminate 13 out of 16 individuals (81%) as being sick better than chance (the lower 95% CI range being above 0.5). These results demonstrate that untrained people can, above chance level, identify acutely sick individuals from merely observing a photo for a few seconds.

(b) Facial cue changes in acutely sick individuals

The second group of naive people (n = 60) rated the same facial photos, to assess the facial cues affected during acute sickness, and to what degree they were associated with apparent sickness and tiredness. The photos were therefore rated with respect to eight facial cues, and to how sick and tired the person in the photo appeared (figure 2). Multilevel regression effects and 95% CIs showed that the LPS injection, when compared with placebo, made people look more sick (b = 0.48 scale steps, CI: 0.41 to 0.55, Z = 12.7, p < 0.001) and more tired (b = 0.45, CI: 0.35 to 0.54, Z = 9.4, p < 0.001). After LPS injection, subjects were also perceived to have paler skin (b = 0.54, CI: 0.45 to 0.64, Z = 11.1, p < 0.001), a more swollen face (b = 0.18, CI: 0.09 to 0.26, Z = 4.1, p < 0.001), paler lips (b = 1.55, CI: 1.45 to 1.64, Z = 31.2, p < 0.001), droopier corners of the mouth (b = 0.42, CI: 0.34 to 0.50, Z = 10.3, p < 0.001), more hanging eyelids (b = 0.45, CI: 0.36 to 0.55, Z = 9.6, p < 0.001) and redder eyes (b = 0.33, CI: 0.26 to 0.40, Z = 8.9, p < 0.001). After LPS injection, the faces also had less glossy skin (b = −0.22, CI: −0.31 to −0.14, Z = 5.1, p < 0.001) and less patchy skin (b = −0.41, CI: −0.50 to −0.31, Z = 8.3, p < 0.001). As illustrated in figures 2 and 3, several facial cues, representing changes of the skin, the eyes and the mouth, were thus affected by acute sickness (figure 3 is a composite of the faces in each condition and is included to illustrate the differences). The results also show that paleness of the lips was especially prominent in sick faces.

Figure 2. Effects of LPS-induced acute sickness on (a) apparent sickness and tiredness, and cues relating to (b) the skin, (c) the mouth and (d) the eyes, when compared with placebo. The regression lines are estimated after the removal of variation between the observers using empirical Bayes' estimates. Thus, the regression lines represent the average change in the average observer. All effects are significant at p < 0.001. The scales for cues range from 1 ‘no symptoms’ to 7 ‘very high symptoms’. The health–sickness scale ranges from 1 ‘very poor’ to 7 ‘very good’ and is reversed for the figure; the tiredness scale ranges from 1 ‘very alert’ to 7 ‘very tired’.

Figure 3. Averaged images of 16 individuals (eight women) photographed twice in a cross-over design, during experimentally induced (a) acute sickness and (b) placebo. Images made by Audrey Henderson, MSc, St Andrews University, using Psychomorph. Here, 184 facial landmarks were placed on each image before composites displaying the average shape, colour and texture were created [20].

(c) The facial cues revealing sickness in others

To investigate by which cues we determine whether someone is sick, we analysed how apparent sickness (using the variation from all photos) related to each of the included cues. As illustrated in figure 4, the multilevel mixed-effects linear regressions show that high sickness ratings were related to paler skin (b = 0.18 units on the scale for paler skin for each unit on the sickness scale, CI: 0.14 to 0.23, Z = 7.7, p < 0.001), paler lips (b = 0.23, CI: 0.18 to 0.28, Z = 8.7, p < 0.001), a more swollen face (b = 0.09, CI: 0.05 to 0.13, Z = 4.1, p < 0.001), more hanging eyelids (b = 0.15, CI: 0.10 to 0.19, Z = 6.4, p < 0.001), redder eyes (b = 0.09, CI: 0.06 to 0.13, Z = 5.2, p < 0.001) and droopier corners of the mouth (b = 0.06, CI: 0.02 to 0.10, Z = 3.0, p = 0.003). Apparent sickness was also related to looking tired (b = 0.30, CI: 0.27 to 0.33, Z = 20.1, p < 0.001). However, apparent sickness was not significantly related to having glossy skin (b = −0.02, CI: −0.06 to 0.03, Z = 0.8, p > 0.250) or patchy skin (b = 0.00, CI: −0.05 to 0.04, Z = 0.1, p > 0.250). Thus, the regression analyses show that all the included cues, with the exception of glossy or patchy skin, were positively correlated with apparent sickness. These data indicate a high consistency between the cues affected by acute sickness and those of appearing sick. Although having pale lips was the cue most prominently displayed in sick individuals, it was not the strongest predictor of how sick the people in the photos appeared. Instead, a number of cues (including pale skin and hanging eyelids) were similarly related to appearing sick.

Figure 4. Relationships between apparent sickness and facial characteristics. All significant regressions are illustrated by dashed lines in black, and non-significant regression lines by solid light grey lines (see text for detailed statistics). The regression lines and the data points (individual data points in grey being jittered to better illustrate the distributions) are estimated after the removal of variation between the observers using empirical Bayes estimates. Thus, all observers have been adjusted (in level) to represent an average observer. The plots consist of 2856–2873 ratings each (60 observers rated the 32 photos, some photos were rated twice by each observer) on 7-point Likert scales (1 = ‘no symptoms’, 7 = ‘very high symptoms’).

The mediation analysis, in which all cues were included in the model simultaneously, found the degree of mediation to vary between −2.8% (patchy skin) and 10.1% (pale lips). A negative degree of mediation means that the indirect effect has an opposite sign compared with the total effect. Although pale lips had the highest point estimate, due to its weak adjusted effect on apparent sickness, the mediated effect was very unstable and non-significant. Instead, pale skin and hanging eyelids turned out to be the most reliable and significant mediators, with a degree of mediation of 7.0% and 5.0%, respectively. All this said, it should be noted that 75.9% of the effect of LPS treatment on apparent sickness was direct (i.e. non-mediated; figure 5). The second mediation analysis, with tiredness included as a second-order mediator, showed that apparent tiredness is a possible mediator of how LPS affects apparent sickness (electronic supplementary material, figure S1), although the degree of mediation of the cues was not affected to any great extent.

Figure 5. The effect of LPS on apparent sickness, directly and via the mediators patchy skin (Pat_S), droopy mouth (Dro_M), pale lips (Pale_Lips), glossy skin (G_S), swollen face (S_F), red eyes (R_Eye), hanging eyelids (H_Eye) and pale skin (Pale_Skin). The effects of cues on apparent sickness are β-weights (to the right, under the heading ‘effect on apparent sickness’) and the effects of LPS correspond to Cohen's d (to the left, under the heading ‘effect on cues’). The placement of the mediator along the x-axis corresponds to the degree of mediation (percentage mediation, vertical line inside box), with 95% CI (width of the box). The scales for cues range from 1 ‘no symptoms’ to 7 ‘very high symptoms’. *p < 0.001, †p < 0.05.

A correlation matrix illustrates the relationships between the cues (electronic supplementary material, table S1). The strongest positive correlation was the association between pale lips and pale skin (0.34, p < 0.001), and the strongest negative correlation was between pale skin and patchy skin (−0.28, p < 0.001). A correlation analysis showed that the rating of pale face was negatively correlated to the objective measures (with spectrophotometer) of redness (−0.56, p < 0.001).

4. Discussion

We demonstrate that a transient stimulation of the innate immune system affected the human face in a way that allows others to identify acutely sick individuals beyond chance by merely observing facial photographs. While the lion's share of the previous literature has used photos of obviously sick people to induce disgust, anxiety and even immune responses [6,7,21], the photos here were taken in an experimental setting, with neutral facial expressions, only 2 h after onset of a systemic inflammation. This supports the notion that humans have the ability to detect signs of illness in an early phase after exposure to infectious stimuli [4,5]. It would arguably be particularly beneficial to identify sick individuals at an early stage of sickness when risk for contagion is high [4].

Several facial cues were affected during acute sickness, with paleness of the lips being the most prominent. Interestingly, in the models, the most robust predictors of apparent sickness were pale skin and hanging eyelids. These findings suggest that paleness and having a tired appearance (both looking tired and having hanging eyelids) are markers of actual sickness. This is consonant with the fact that redness signals a healthy and attractive appearance in both humans and animals [10,22] and that how tired a person appears is strongly related to how healthy they appear [18,23]. This is further supported by the fact that some of the cues shown to relate to sickness in the present study (i.e. pale skin, hanging eyelids and red eyes) have previously been shown to relate to apparent fatigue [19]. The fact that the cue changing the most during acute sickness, pale lips, was not strongly related to a sick appearance in the model was surprising, and further studies, preferably at different stages of sickness, will have to investigate this in more detail. While it is well known that facial adiposity and skin colour affect apparent health [10], the present study provides additional clues regarding which facial cues we use to detect acutely sick people.

Considering the partial overlap between signs of sickness and tiredness [18], as well as sadness [5], it can be assumed that such signs sometimes trigger avoidance of people who pose no threat of contagion. This is backed up by the finding that subjects are less inclined to socialize with individuals who have had insufficient sleep [23]. In addition, disability stigma has been theorized to be a consequence of an overly inclusive disease-avoidance mechanism [24]. Taken together, this suggests that perceived deviations from a healthy or functional state, based on cues that overlap between sickness and other conditions, perpetuate prejudices. Such behavioural tendencies would have been favoured by selection pressures to avoid false-negative responses when scanning the environment for imminent infectious threats.

The findings in this study are limited to visual detection of the consequences of an innate immune response, which is a non-specific immune reaction to a bacterial stimulus (LPS are molecules of the cell wall from gram-negative bacteria, and not contagious themselves). In other words, this reflects a general sickness state. It is likely that more disease-specific facial features would develop over time in different diseases. Thus, analyses of facial appearance from other disease models and in later phases of responses to infection are needed before it is possible to determine the cues by which we most reliably can predict the presence of disease in a fellow human being. The fact that the predictive power of the ratings was rather low (ROC area being 0.62) is not surprising, considering that raters were only very briefly exposed to the photos. In real-life circumstances, we would expect humans to have a higher sensitivity due to the possibility of integrating other cues [12] (e.g. gait [5], body odour [4] and speech). The degree to which practice can improve this ability is unclear. It is, however, tempting to speculate that health professionals have the ability to use perceptual cues as guidance in clinical judgement, something that was recently proposed [25].

It is well known that humans almost immediately judge facial aspects of attractiveness, trustworthiness, dominance and expressions of basic emotions when looking at a human face [26]. Future studies need to distinguish how facial expressions of sickness overlap with those of basic emotions such as anxiety or fear, and how promptly humans scan for signs of illness in peers.

5. Conclusion

The data presented here support the notion of humans being able to detect acutely sick individuals at a glance from photos and that a number of facial cues guide this judgement. Several facial cues were affected by sickness, with paleness of the lips standing out as a prominent signal. There is a need to further test how accuracy can be improved, for example through learning, and whether identification is similar across diseases and ethnic groups.

Ethics

Photos of sick and healthy individuals were obtained in the Center for Clinical Research at Danderyd Hospital, Stockholm, Sweden, and all participants and research staff were blind to the conditions, except one physician for safety reasons. All photographed subjects gave written informed consent after the study protocol had been fully explained and received a compensation of 3500 SEK (approx. 370 euros). As the rating procedure neither collected any personal data nor subjected the raters to any psychological, physiological or emotional risks, written consent from the raters was not required. The raters received a cinema ticket in compensation for their participation. The study was approved by the regional ethical review board in Stockholm, Sweden (Registration number 2015/1415-32) and registered in ClinicalTrials.gov (NCT02529592).

Data accessibility

Data are available at https://osf.io/btc7p/ [27].

Author contributions

J.A. and M.L. developed the study concept and drafted the manuscript. J.L. and C.A. collected the data. J.A., K.S., M.J.O., J.L., T.S. and M.L. analysed and interpreted the data. All authors developed the design, revised and accepted the final version of the manuscript for submission.

Competing interests

We have no competing interests.

Funding

This research was supported by the Swedish Foundation for Humanities and Social Sciences (P12-1017 to M.J.O.), the Swedish Research Council (421-2012-1125), Karolinska Institutet and Stockholm Stress Center. J.L. is funded by the Alexander von Humboldt Foundation (Germany, Humboldt fellowship for postdoctoral researchers).

Acknowledgements

We thank Anne Soop PhD MD and Sofie Paues Göransson MD, Department of Clinical Sciences, Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden, for help with the design and data collection; Emma Denlert BSc and Joel Åkerblom, Karolinska Institutet, Stockholm, Sweden, for the help with the data collection; Michael Ingre PhD for the help with statistical expertise. We also thank Audrey Henderson MSc, St Andrews University, St Andrews, UK, for the help with averaging the images. None of these individuals were compensated for their contributions.

Footnotes

Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3951916.