When discussing common sense in the previous article, I began touching upon context in natural language processing for an AGI. Context is a frame of reference from which to interpret a given piece of text or statement.

Wikipedia defines context as:

In semiotics, linguistics, sociology and anthropology, context refers to those objects or entities which surround a focal event, in these disciplines typically a communicative event, of some kind. Context is “a frame that surrounds the event and provides resources for its appropriate interpretation”.

https://en.wikipedia.org/wiki/Context_(language_use)

Context in an AGI is an important feature, particularly for conversational roles. Context, in this sense, can be loosely described as the railroad tracks upon which the conversation rides. For example, if we are discussing housing in 19th-century England, we would not expect the conversation to randomly drift to penguins in Antarctica or nuclear fusion. Context is about reducing the solution space of possible conversational items to those which are reasonably expected or relevant.

Context is a feature at which deep learning techniques fail miserably. While it is possible to train deep learning networks on a corpus of text to perform reasonable information extraction, this technique exploits patterns in the structure of language rather than performing true contextual processing.

True contextual processing is compute intensive: it is about constructing a relationship map between statements, phrases, questions, surrounding information, etc., and employing statistical analysis techniques to narrow down the appropriate context.
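The narrowing step could be sketched as follows. This is a toy illustration, not a real implementation: the context names, term sets and threshold are all invented for the example, and a genuine engine would use learned statistical models rather than simple word overlap.

```python
# Each candidate context is represented as a bag of related terms; an
# utterance is scored against each context by term overlap, and contexts
# scoring below a threshold are pruned from the solution space.
CONTEXTS = {
    "victorian_housing": {"house", "rent", "london", "slum", "tenement", "century"},
    "antarctic_wildlife": {"penguin", "ice", "colony", "antarctica", "krill"},
    "nuclear_physics": {"fusion", "reactor", "plasma", "tritium", "energy"},
}

def narrow_contexts(utterance, threshold=0.1):
    """Return candidate contexts ranked by term overlap with the utterance."""
    words = set(utterance.lower().split())
    scores = {
        name: len(words & terms) / len(terms)
        for name, terms in CONTEXTS.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if scores[name] >= threshold]

print(narrow_contexts("rent for a london tenement house"))
# -> ['victorian_housing']
```

Given the housing example above, only the Victorian housing context survives the pruning; penguins and fusion are eliminated from consideration.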

A statement such as “I love you.” remains ambiguous without surrounding information. Taken at face value, it is a proclamation of love; said sarcastically, it means the opposite; and said to a friend, it could be a reference to gratitude or the quality of the friendship. To identify the context, surrounding information must be tracked and related to the statement. For example, if two people are passionately kissing, it likely means a proclamation of love. But if someone is rolling their eyes at the time, or the tone is exaggerated, it likely means the opposite.
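One minimal way to sketch this cue-tracking is a weighted vote: each observed cue contributes weight toward one interpretation, and the highest total wins. The cue names and weights here are illustrative assumptions, not trained values.

```python
# Each observed cue votes for an interpretation of "I love you." with a
# weight; the interpretation with the highest total score wins.
CUE_WEIGHTS = {
    "kissing":          {"romantic": 2.0},
    "eye_roll":         {"sarcastic": 2.0},
    "exaggerated_tone": {"sarcastic": 1.5},
    "said_to_friend":   {"gratitude": 1.5},
}

def interpret(cues):
    scores = {"romantic": 0.0, "sarcastic": 0.0, "gratitude": 0.0}
    for cue in cues:
        for meaning, weight in CUE_WEIGHTS.get(cue, {}).items():
            scores[meaning] += weight
    best = max(scores, key=scores.get)
    # With no cues at all, fall back to the literal, face-value reading.
    return best if scores[best] > 0 else "romantic"

print(interpret(["eye_roll", "exaggerated_tone"]))  # -> sarcastic
print(interpret(["kissing"]))                       # -> romantic
```

A real engine would of course derive cues from vision, prosody and dialogue state rather than taking them as labelled strings, but the combination step has the same shape.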

An AGI requires an application/engine which tracks speech/text and other forms of relevant information in order to resolve context. Surrounding information is not constrained to information provided in the conversation, but also includes information from knowledge bases and common sense.

Let’s look at some accidental double entendres as an example of where ambiguity in statements is resolved by knowledge:

Panda mating fails: veterinarian takes over

Miners refuse to work after death

New obesity study looks for larger test group

Children make nutritious snacks

Criminals get nine months in violin case

https://examples.yourdictionary.com/double-entendre-examples.html

In each of the above cases, we note that the ambiguity is resolved by applying what we would term common sense knowledge. Applying common sense does not necessarily mean that we arrive at the truth, as weird things happen. For example, common sense states that under normal conditions humans do not eat children, making the statement ‘Children make nutritious snacks’ humorous. But if it were the lead story in a publication called ‘Cannibal Times’, it may very well be a legitimate opinion.

As such, when determining context, we must be aware of all the surrounding information, with higher rankings given to facts over general expectations.

Context, as we discovered in the previous article, is also related to memory. In many situations, the context of a statement can refer to earlier conversations, in some cases without explicit reference. For example, a code phrase like ‘the pigeon is in the bath’ refers to a pre-agreed interpretation rather than a literal description of a current event. As such, analysis of the statement leads to junk results without knowledge of the pre-agreed meaning. An engine tracking context must be flexible enough to permit the inclusion of ad-hoc context mappings.
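At its simplest, such an ad-hoc mapping is a lookup consulted before any literal analysis. The phrase table and its expansion below are hypothetical; the point is only that registered mappings short-circuit normal interpretation.

```python
# Pre-agreed code phrases override literal interpretation. A real engine
# would let participants register mappings at runtime; this table is a
# hypothetical example.
CODE_PHRASES = {
    "the pigeon is in the bath": "the package has been delivered",
}

def resolve(statement):
    key = statement.lower().strip(".!? ")
    # Check the ad-hoc mappings first; only fall back to the literal
    # statement (i.e. normal contextual analysis) when no mapping exists.
    return CODE_PHRASES.get(key, statement)

print(resolve("The pigeon is in the bath."))
# -> the package has been delivered
```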

Another scenario is where gestures, glances, out-of-context statements, etc., indicate a relationship to a previously discussed context. For example, a conversation may have been about the health of a family member, and several hours later the statement ‘he’s ok’ along with a smile may occur. Connecting the two contexts is a question of statistical likelihood. The context engine must be able to connect contexts over an extended time period and maintain the information gathered.
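A sketch of this long-range linking: stored contexts carry a timestamp and a set of topic terms, and a new fragment is linked to the stored context with the best overlap, discounted by a recency decay. The half-life, term sets and class design are all assumptions made for illustration.

```python
import math

# How quickly an old context loses relevance (illustrative assumption).
HALF_LIFE = 6 * 3600  # seconds

class ContextStore:
    def __init__(self):
        self.entries = []  # list of (timestamp, name, topic terms)

    def remember(self, name, terms, timestamp):
        self.entries.append((timestamp, name, set(terms)))

    def link(self, fragment_terms, now):
        """Link a new fragment to the most likely stored context, or None."""
        fragment = set(fragment_terms)
        best, best_score = None, 0.0
        for ts, name, terms in self.entries:
            # Exponential recency decay: after one half-life the context
            # counts for half as much.
            decay = math.exp(-(now - ts) * math.log(2) / HALF_LIFE)
            score = len(fragment & terms) * decay
            if score > best_score:
                best, best_score = name, score
        return best

store = ContextStore()
store.remember("family_health", {"he", "hospital", "recovering"}, timestamp=0)
# Three hours later: "he's ok" accompanied by a smile.
print(store.link({"he", "ok", "smile"}, now=3 * 3600))
# -> family_health
```

The decay term captures the intuition that connecting ‘he’s ok’ to a conversation from three hours ago is plausible, while connecting it to one from last month is much less so.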

Double entendres are another interesting class of ambiguity in speech/text. Here, though, the real context must be inferred from the imagery presented, for example, ‘he grasped the long hard cold steel rod firmly’. Any context engine therefore must be able to connect words with images and then with similar items. Double entendres are just one class of speech/text where subtext takes precedence over the presented text. This is also common in works with political undertones or where taboos are addressed.

The context engine is a highly complex application and difficult to get operating in real time. In some cases, re-interpretations can happen with additional processing and, in interactive scenarios, must be introduced into the conversational speech of the AGI as corrections. If an AGI is presented as a single entity, like a person, this can mean waiting for a while. But if the AGI is presented as multiple entities, such as a group of people, then interruptions can occur as data becomes available, presented as a group discussion.

A contextual engine is dependent on a solid knowledge base and common sense information about the world and behaviour. Without these, misinterpretation will occur quite frequently and the goal of natural language processing/understanding will not be obtained.