This article summarizes the theoretical background and history that make emotion recognition possible. You can find more in my Master's thesis.

The Science of Emotions

What is an Emotion

When hearing the word emotion, most people tend to think of happiness, love, hate, or fear. Those are the strong emotions that we experience throughout life and consciously classify as good or bad. This is because our brain is designed to look for threats and rewards. When one of these is detected, the feeling part of the brain alerts us by releasing chemical messages. In the end, emotions can be interpreted as the effects of these chemical messages.



For instance, in the case of a threat, our brain releases the stress hormones adrenaline and cortisol, which prepare us for a fight-or-flight response. On the other hand, when perceiving a reward, our brain releases dopamine, oxytocin, or serotonin, which are the chemicals that make us feel good and motivate us to continue such behavior.



In these instances of emotion, the feeling part of the brain reacts well before the thinking part does. Sometimes, the reactions of the feeling brain are so strong that they dominate our behavior and prevent us from using the thinking part. We can no longer think rationally, as if our emotions had hijacked our brain.



Even though most of our emotional responses happen unconsciously, there are ways in which our thinking can control those emotions. Just thinking of something threatening, like presenting in front of a large crowd, can trigger a negative emotional response. In such cases, one can control the emotion through conscious thinking, for example by reducing the importance given to the audience, or by building strong confidence that the presentation will go well. There is an entire research field addressing this methodology, shaped by Herbert Benson’s Relaxation Response.

Components of Emotions

“An emotion is a complex psychological state that involves three distinct components: a subjective experience, a physiological response, and a behavioral or expressive response”.

Subjective Experience: While experts accept the universality of the basic emotions, the experience of these emotions is highly subjective. Even though there are broad labels for certain emotions such as anger or happiness, how they manifest can vary considerably between individuals. While anger might mean mild annoyance for one person, it can be a blinding rage for somebody else. Moreover, people usually experience mixed emotions rather than a single one. An easy example is starting a new job, where one can feel both excited and nervous, to different degrees depending on the individual.



Physiological Response: Emotions can cause strong physiological reactions. Anxiety can cause sweaty palms, a racing heartbeat, or even a lurching stomach. Early studies attributed these reactions to the sympathetic nervous system, a branch of the autonomic nervous system, which controls involuntary functions such as blood flow and digestion. More recent research, however, targets the brain’s role in emotions, especially that of the amygdala. This almond-shaped structure has been linked to motivational states such as hunger or thirst, as well as to memory and emotion. Researchers have shown that the amygdala becomes activated under threat, and that damage to this structure can impair the fear response.



Behavioral or Expressive Response: The main component taken into account in my thesis is the actual expression of the emotion. Humans have the ability to interpret the emotional expressions of the people around them, something that psychologists refer to as emotional intelligence. Many of these expressions are considered universal (e.g. a smile indicating happiness), while cultural norms introduce variation in how emotions are expressed (e.g. people from Japan have been found to mask displays of certain emotions).

Classifying Emotions

Taking these three components into account, there are two different approaches to describing human emotion.



The Categorical Description of Affect aims to classify emotions into a fixed set of classes. Everyone has heard the words happy or sad, as they have been used since at least the 19th century. From 1972 onward, this approach was heavily influenced by the work of Paul Ekman, who believed that humans universally express a set of six basic emotions: happiness, sadness, fear, anger, disgust and surprise. In 1999, he expanded this list to include embarrassment, excitement, contempt, shame, pride, satisfaction, and amusement.



The Dimensional Description of Affect places a particular emotion in a space with a limited set of dimensions. Models vary in which dimensions they use, but they commonly include valence (how pleasant or unpleasant the emotion is), arousal or activation (how likely the person is to take action in this emotional state) and control (the sense of control over the emotion). Combining different values along these dimensions can describe more complex emotions.
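To make the dimensional view concrete, here is a minimal Python sketch that treats an emotional state as a point in a three-dimensional space. The class name `AffectPoint`, the `[-1, 1]` range for each dimension, and the wording of the descriptions are assumptions made for illustration; they are not taken from any particular dimensional model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectPoint:
    """An emotional state as a point in valence/arousal/control space.

    Each dimension is assumed to lie in [-1, 1]; this range is an
    illustrative convention, not part of a specific published model.
    """
    valence: float   # unpleasant (-1) to pleasant (+1)
    arousal: float   # calm (-1) to activated (+1)
    control: float   # no sense of control (-1) to full control (+1)

    def __post_init__(self):
        # Reject values outside the assumed range.
        for name in ("valence", "arousal", "control"):
            value = getattr(self, name)
            if not -1.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [-1, 1], got {value}")

    def describe(self):
        """Return a coarse verbal description of the region of the space."""
        pleasantness = "pleasant" if self.valence >= 0 else "unpleasant"
        activation = "high arousal" if self.arousal >= 0 else "low arousal"
        control = "high control" if self.control >= 0 else "low control"
        return f"{pleasantness}, {activation}, {control}"


# Hypothetical coordinates, chosen only to show how the class is used.
print(AffectPoint(valence=-0.7, arousal=0.9, control=-0.4).describe())
```

The continuous coordinates are what make this representation rich, and also what makes it hard to automate: a recognition system would have to regress three real values from an expression instead of choosing one label.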



Of these two approaches, the Categorical Description of Affect is the one most commonly explored in affective computing, given its simplicity and its claim of universality. The richness of the space in the Dimensional Description is more difficult to automate, since it is hard to map expressive responses to specific values along these dimensions.

The Science of Facial Expressions

The study of facial expressions concerns the variations in an individual’s appearance caused by facial movements under the skin. A facial movement, in turn, is the movement of one or more facial muscles. The mapping between facial movements and facial muscles is many-to-many: one facial movement may involve more than one facial muscle, and one facial muscle can be involved in more than one facial movement. In other words, certain facial movements require two or more facial muscles to contract, while each of those muscles may also contract as part of other facial movements.
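The many-to-many relation can be sketched in Python as a dictionary from movements to sets of muscles, plus its inverse. The movement names are borrowed from the Facial Action Coding System discussed later in this article, and the muscle assignments are a simplified subset at the level of whole muscles, so treat the table as illustrative rather than as an anatomical reference. The names `MOVEMENT_TO_MUSCLES` and `invert` are made up for this example.

```python
from collections import defaultdict

# Facial movement -> muscles involved (simplified, illustrative subset).
MOVEMENT_TO_MUSCLES = {
    "inner brow raiser": {"frontalis"},
    "outer brow raiser": {"frontalis"},
    "brow lowerer": {"corrugator supercilii", "depressor supercilii"},
    "cheek raiser": {"orbicularis oculi"},
    "lid tightener": {"orbicularis oculi"},
    "lip corner puller": {"zygomaticus major"},
}


def invert(mapping):
    """Build the reverse lookup: muscle -> movements it takes part in."""
    inverse = defaultdict(set)
    for movement, muscles in mapping.items():
        for muscle in muscles:
            inverse[muscle].add(movement)
    return dict(inverse)


MUSCLE_TO_MOVEMENTS = invert(MOVEMENT_TO_MUSCLES)

# One movement, several muscles:
print(sorted(MOVEMENT_TO_MUSCLES["brow lowerer"]))
# One muscle, several movements:
print(sorted(MUSCLE_TO_MOVEMENTS["orbicularis oculi"]))
```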



There is a long history of philosophers and researchers trying to understand the origin and purpose of facial expressions, within branches such as creationism, neuroscience and psychology.



Facial expressions were first studied in the context of physiognomy and creationism, which tried to link a person’s character to their looks, especially the face. Leonardo da Vinci was one of the first to refute such claims, stating that they were without scientific support.

Later, in the 19th century, Sir Charles Bell, influenced by creationism, investigated the role of facial expressions in sensory and motor control. He attributed their purpose solely to human communication, endowed by the Creator. Later on, the French neurologist Duchenne studied the body’s neuromuscular system and how facial expressions are produced, by electrically stimulating facial muscles.

Experiments conducted by Duchenne de Boulogne in the 19th century. Adapted from Cambridge University Library.

As for their origin, facial expressions were first attributed to God, and later to evolution. In the 19th century, Charles Darwin stated that facial expressions were evolved behaviors for expressing emotion. Darwin’s claims were later supported by the research of Adam Anderson.



To this day, there is an ongoing debate about the true purpose of facial expressions, and about how they increased the chances of survival of the species that used them. On the one hand, there is their role in social communication, specifically in the context of signaling systems. This theory states that facial expressions are a form of nonverbal communication that can convey everything from pleasure or displeasure to surprise or boredom. On the other hand, the sensory regulation theory considers them functional adaptations of more direct benefit to the expresser. When experiencing surprise, humans open their eyes wide, not to communicate the emotion, but to enlarge their field of vision. In the same way, constricting the nose in disgust reduces the inhalation of harmful substances.

Parametrization of Facial Expressions: Facial Action Coding System (FACS) and Action Units (AUs)

When recognizing facial expressions, the first task involves defining a coding scheme for such facial expressions. There are two main classes of coding schemes. Descriptive coding schemes focus on what the face can do based on surface properties, while judgmental coding schemes describe facial expressions in terms of the latent emotions that generate them.



The best-known example of descriptive coding is the Facial Action Coding System (FACS) developed by Ekman and Friesen, which was later revised as FACS 2002. The purpose of this scheme is to represent all facial expressions as combinations of facial muscle movements. Facial expressions are coded in action units (AUs), each of which represents the contraction of one or more facial muscles. FACS also provides the rules for the visual detection of AUs and of their temporal segments (onset, apex, offset), which track the intensity of the AU from when the facial expression emerges until it fades. With this set of rules, a human coder can analyze a displayed facial expression and break it down into specific AUs and their temporal segments. A great survey of the history, trends and approaches in Facial Expression Recognition can be found here.
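As a rough sketch of what a coder's output could look like as data, the Python snippet below stores one coded AU together with its temporal segments and builds a compact code string for an expression. The class `CodedActionUnit`, its field names, the use of video frame indices as the time unit, and the helper `facs_code` are all assumptions for illustration; only the AU numbers, the A to E intensity letters, and the onset/apex/offset segments come from FACS itself.

```python
from dataclasses import dataclass

# FACS scores intensity on a five-letter ordinal scale, A (trace) to E (maximum).
INTENSITY_LEVELS = ("A", "B", "C", "D", "E")


@dataclass(frozen=True)
class CodedActionUnit:
    """One action unit as a human coder might record it (hypothetical layout)."""
    au: int            # action unit number, e.g. 12 for the lip corner puller
    intensity: str     # peak intensity, one of INTENSITY_LEVELS
    onset_start: int   # frame where the AU starts to appear
    apex_start: int    # frame where it reaches its peak
    offset_start: int  # frame where it starts to relax
    offset_end: int    # frame where the face is neutral again

    def __post_init__(self):
        if self.intensity not in INTENSITY_LEVELS:
            raise ValueError(f"unknown intensity {self.intensity!r}")
        frames = (self.onset_start, self.apex_start,
                  self.offset_start, self.offset_end)
        if list(frames) != sorted(frames):
            raise ValueError("temporal segments must be in chronological order")

    def segment_at(self, frame):
        """Return the temporal segment the AU is in at a given frame."""
        if frame < self.onset_start or frame >= self.offset_end:
            return "neutral"
        if frame < self.apex_start:
            return "onset"
        if frame < self.offset_start:
            return "apex"
        return "offset"


def facs_code(units):
    """Join coded AUs into a string such as '6C+12D', ordered by AU number."""
    ordered = sorted(units, key=lambda unit: unit.au)
    return "+".join(f"{unit.au}{unit.intensity}" for unit in ordered)


# Hypothetical coding of a smile: cheek raiser (AU6) plus lip corner puller (AU12).
smile = [
    CodedActionUnit(12, "D", onset_start=10, apex_start=20, offset_start=40, offset_end=55),
    CodedActionUnit(6, "C", onset_start=12, apex_start=22, offset_start=38, offset_end=50),
]
print(facs_code(smile))
print(smile[0].segment_at(15))
```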

Examples of some action units extracted from the CK+ database.

Conveying Emotions from Facial Expressions

The question is: what differentiates facial expressions from emotions? On the one hand, facial expressions are the variations in an individual’s face produced by different muscles. On the other hand, as mentioned earlier, an emotion is a complex psychological state that involves three distinct components: a subjective experience, a physiological response, and a behavioral or expressive response. Facial expressions are therefore considered an expressive response of emotions.

This relation between facial expressions and emotions relies heavily on the Universality Hypothesis. This hypothesis assumes that certain facial expressions are signals of six basic emotional states (happiness, sadness, anger, fear, surprise and disgust) that are recognized by people everywhere, regardless of culture or language. The truth of this hypothesis has remained one of the longest-standing debates in the biological and social sciences. One example is the objection raised by Jack et al., which is supported by the results of a survey targeting different cultural groups. Despite these objections, implementations based on this hypothesis have shown a decent level of generalization and accuracy, which suggests that a generalized solution to recognizing emotions is possible.

One of the main contributions to this relation between facial expressions and emotions is Emotion FACS (EMFACS), developed by Ekman and Friesen, which scores only the facial actions relevant to the six basic universal emotions. It can be considered a hybrid of descriptive and judgmental coding schemes.
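The idea of going from a descriptive code to an emotion label can be sketched as a simple lookup. The snippet below is not the EMFACS scoring procedure; it is a toy Python illustration that uses a subset of AU combinations commonly cited as prototypes in the FACS literature, ignores intensities, and omits disgust. The names `PROTOTYPES` and `guess_emotion`, and the "largest matching prototype wins" rule, are assumptions made for this example.

```python
# Commonly cited prototype AU combinations (illustrative subset, intensities
# ignored). These are NOT the official EMFACS rules.
PROTOTYPES = {
    "happiness": frozenset({6, 12}),
    "sadness": frozenset({1, 4, 15}),
    "surprise": frozenset({1, 2, 5, 26}),
    "fear": frozenset({1, 2, 4, 5, 7, 20, 26}),
    "anger": frozenset({4, 5, 7, 23}),
}


def guess_emotion(observed_aus):
    """Return the emotion whose prototype is fully contained in observed_aus.

    When several prototypes match (the surprise prototype is contained in
    the fear prototype), the one with the most AUs wins. Returns None when
    no prototype matches.
    """
    observed = frozenset(observed_aus)
    matches = [
        (len(prototype), emotion)
        for emotion, prototype in PROTOTYPES.items()
        if prototype <= observed
    ]
    if not matches:
        return None
    return max(matches)[1]


print(guess_emotion({6, 12}))        # cheek raiser + lip corner puller
print(guess_emotion({1, 2, 5, 26}))  # raised brows, raised upper lid, jaw drop
```

A lookup like this is descriptive on the input side (which AUs are present) and judgmental on the output side (which emotion they signal), which is the hybrid character described above.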