In this article I’ll give a simple introduction to the idea of Semantic Modelling for Natural Language Processing (NLP).

Semantic Modelling (or Semantic Grammar) is often contrasted with Linguistic Modelling (or Linguistic Grammar), so it is probably best to begin by defining both and understanding Semantic Modelling through that contrast.

Linguistic vs. Semantic

Semantic and Linguistic Grammars both define a formal way in which a natural language sentence can be understood. Linguistic grammar deals with linguistic categories like noun, verb, etc. Semantic grammar, on the other hand, is a type of grammar whose non-terminals are not generic structural or linguistic categories like nouns or verbs but rather semantic categories like PERSON or COMPANY.

Both the Linguistic and the Semantic approach came onto the scene at about the same time, in the 1970s. Linguistic Modelling has enjoyed constant interest throughout the years (as part of the Computational Linguistics movement) and is foundational to overall NLP development.

Semantic Modelling, in its turn, enjoyed an initial burst of interest but quickly fizzled due to technical complexities. In recent years, however, Semantic Modelling has undergone a renaissance and is now the basis of almost all commercial NLP systems such as Google, Cortana, Siri, Alexa, etc. It is at the core of DataLingvo as well (the company I work for, where we further developed the Semantic Modelling idea).

The easiest way to grasp the difference between Semantic and Linguistic Grammar is to look at the following illustration:

Semantic vs. Linguistic

In the picture above the lower and upper sentences are the same, but they are processed differently. The lower part is parsed using a traditional Linguistic Grammar where each word is tagged with a PoS (Part-of-Speech) tag like NN for nouns, JJ for adjectives, and so on. The upper part, however, is parsed using a Semantic Grammar and, instead of individual words being PoS tagged, one or more words form high-level semantic categories like DATE or GEO.

This, of course, is a highly simplified description of the Linguistic approach, as we are leaving aside co-reference analysis, named-entity resolution, etc.


The ability to group individual words into high-level semantic entities was introduced to aid in solving a key problem plaguing early NLP systems: linguistic ambiguity.

Linguistic Ambiguity

Look at the picture below:

Linguistic Ambiguity

Even though the linguistic signatures of both sentences are practically the same, the semantic meaning is completely different. The resolution of such ambiguity using just Linguistic Grammar will require very sophisticated context analysis — if and when such context is even available — and in many cases it is simply impossible to do deterministically.

Semantic grammar, on the other hand, allows for clean resolution of such ambiguities in a simple and fully deterministic way. In a properly constructed Semantic Grammar the words Friday and Alexy would belong to different categories and therefore would not lead to a confused meaning.

Note that astute NLP readers will observe that, apart from having the same PoS tags, these words would also have different named-entity resolutions. In this particular example that is so. However, in more complex real-life examples named-entity resolution has proved to be nowhere near as effective.

Semantic Grammar Example

Let’s look at a simple definition of a Semantic Grammar.

Regardless of the specific configuration syntax, the grammar is typically defined as a collection of semantic entities where each entity, at minimum, has a name and a list of synonyms by which this entity can be recognized.

For example, here’s a trivial definition for WEBSITE and USER entities with their respective synonyms:

<WEBSITE>:
    website,
    http website,
    https website,
    http domain,
    web address,
    online address,
    http address

<USER>:
    user,
    web user,
    http user,
    https user,
    online user

Given this grammar, the following sentences:

Website user

HTTP address online user

Website online user

will all be resolved into the same two semantic entities:

<WEBSITE> <USER>
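A minimal sketch of how such resolution might work, assuming a greedy longest-match over lowercased tokens (the grammar table and matcher below are illustrative only, not the API of DataLingvo or any real system):

```python
# Illustrative semantic grammar: entity name -> list of synonyms.
GRAMMAR = {
    "WEBSITE": ["website", "http website", "https website", "http domain",
                "web address", "online address", "http address"],
    "USER": ["user", "web user", "http user", "https user", "online user"],
}

def resolve(sentence):
    """Greedily match the longest synonym at each position (deterministic)."""
    # Pre-compute (token_tuple, entity) pairs, longest synonyms first.
    synonyms = sorted(
        ((tuple(s.split()), name)
         for name, syns in GRAMMAR.items() for s in syns),
        key=lambda p: len(p[0]), reverse=True)
    tokens = sentence.lower().split()
    entities, i = [], 0
    while i < len(tokens):
        for syn, name in synonyms:
            if tuple(tokens[i:i + len(syn)]) == syn:
                entities.append(name)
                i += len(syn)
                break
        else:
            i += 1  # no synonym starts here; skip the token
    return entities

print(resolve("Website user"))              # ['WEBSITE', 'USER']
print(resolve("HTTP address online user"))  # ['WEBSITE', 'USER']
print(resolve("Website online user"))       # ['WEBSITE', 'USER']
```

Note that there is no scoring or guessing anywhere: each span of tokens either matches a synonym or it does not.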

A sequence of semantic entities can be further bound to a user-defined intent for the final action to take. A collection of such user-defined intents is what typically constitutes a full NLP pipeline.
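Intent binding can be sketched as a simple lookup from an entity sequence to a handler (the table and handler below are hypothetical, purely for illustration):

```python
# Illustrative intent table: an entity sequence bound to an action handler.
INTENTS = {
    ("WEBSITE", "USER"): lambda: "Listing users of the website...",
}

def dispatch(entities):
    """Run the intent matching this entity sequence, or None (hand-off)."""
    handler = INTENTS.get(tuple(entities))
    return handler() if handler else None

print(dispatch(["WEBSITE", "USER"]))  # Listing users of the website...
```

In a real pipeline an unmatched sequence would typically be escalated rather than silently dropped.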

Real-life systems, of course, support much more sophisticated grammar definitions. There are many different ways to define synonyms, as there are many different types of synonyms themselves; semantic entities can have data types and can be organized into hierarchical groups to aid short-term-memory processing, all of which is unfortunately beyond the scope of this blog. You can find one example of such grammar support here.

Determinism vs. Probabilism

We emphasized the deterministic nature of the Semantic Grammar approach above. Although specific implementations of both Linguistic and Semantic Grammar applications can be deterministic or probabilistic, Semantic Grammar almost always leads to deterministic processing.

The reason for that lies in the nature of Semantic Grammar itself, which is based on simple synonym matching. A properly defined Semantic Grammar enables a fully deterministic search for a semantic entity. There is literally no “guessing”: a semantic entity is either unambiguously found or not.

The resulting determinism of Semantic Grammar is a striking quality. While the probabilistic approach can work in many well-known scenarios like sentiment analysis, support chatbots, or document comprehension, it is simply unsuitable for NLP/NLU-driven business data reporting and analytics. For example, it does not really matter whether your Twitter feed is 85% or 86% positive, as long as it trends in the right direction. Reporting on sales numbers, on the other hand, must be correct to the penny and has to match precisely the data from the accounting system. Even a high-probability result like “your total sales for the last quarter were $100M with probability of 97%” is worthless in such circumstances.

With all the benefits of Semantic Grammar, there is one clear limitation that hindered its development (at least initially): the fact that it can only be applied to a narrow data domain.

Universal vs. Domain Specific

While Linguistic Grammar is universal across data domains (as it deals with universal linguistic constructs like verbs and nouns), Semantic Grammar, with its synonym-based matching, is limited to a specific, often very narrow, data domain. The reason is that in order to create a Semantic Model one needs to come up with an exhaustive set of all entities and, most dauntingly, the set of all of their synonyms.

For a specific data domain this is a manageable task, and one that is greatly aided by sophisticated real-life systems. But for general NLU, as in Artificial General Intelligence (AGI), Semantic Modelling simply will not work.

In the last decade there has been a lot of research into advancing Semantic Modelling with closed-loop human curation and supervised self-learning capabilities, but the fact remains that Semantic Modelling is best applied when dealing with a specific, well-defined and well-understood data domain.

It is interesting to note that the popular Deep Learning (DL) approach to NLP/NLU almost never works sufficiently well for specific data domains. This is due to the lack of the sufficiently large pre-existing training sets required for DL model training. That is why traditional closed-loop human curation and self-learning ML algorithms prevail in Semantic Modelling systems.

Curation and Supervised Self-Learning

Human curation (or human hand-off) and supervised self-learning are two interlinked techniques that help alleviate the problem of coming up with an exhaustive set of synonyms for semantic entities when developing a new Semantic Model.

These two work as follows…

You begin by creating a Semantic Model with a basic set of synonyms for your semantic entities, which can be done fairly quickly. Once the NLP/NLU application using this model starts to operate, user sentences that cannot be automatically “understood” by the model go to curation. During human curation the user sentence is amended to fit the model, and the self-learning algorithm “learns” that amendment and performs it automatically next time, without the need for a human hand-off.

There are two critical properties in this process:

Human curation changes the user input to fit the existing Semantic Model, i.e. the user sentence is changed in such a way that it can be answered automatically. Typically this involves fixing spelling errors, colloquialisms, and slang, removing stop words, or adding missing context.

That change (i.e. the curation) of the user sentence is fed into the self-learning algorithm to be “remembered” for the future. Since the change was initially performed by a human, this makes the self-learning a supervised process and eliminates the introduction of cumulative learning mistakes.

What is important in all of this is that supervision allows the deterministic nature of Semantic Modelling to be maintained as it “learns” further. Using curation and supervised self-learning, the Semantic Model learns more with every curation and ultimately can know dramatically more than it was taught at the beginning. Hence, the model can start small and learn through human interaction, a process not unlike that of many modern AI applications.
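The curation loop described above can be sketched in a few lines (a toy illustration under assumed names; real systems store and generalize amendments far more cleverly than an exact-sentence lookup):

```python
# Toy model: sentences the model can answer, mapped to their entities.
MODEL = {"website user": ["WEBSITE", "USER"]}

# Amendments "learned" from human curation: raw sentence -> curated sentence.
amendments = {}

def human_curate(sentence):
    # Stand-in for a human curator fixing slang/typos to fit the model.
    return {"site dude": "website user"}.get(sentence, sentence)

def process(sentence):
    sentence = amendments.get(sentence, sentence)  # replay learned fixes first
    if sentence in MODEL:
        return MODEL[sentence]
    curated = human_curate(sentence)               # human hand-off
    if curated in MODEL:
        amendments[sentence] = curated             # supervised self-learning
        return MODEL[curated]
    return None

print(process("site dude"))  # first call: human curation -> ['WEBSITE', 'USER']
print(process("site dude"))  # second call: amendment replayed automatically
```

Because every stored amendment was produced by a human, the lookup stays deterministic: the model never guesses an amendment it was not taught.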

Conclusion

Semantic Modelling has gone through several peaks and valleys over the last 50 years. With the recent advancements in real-time human curation interlinked with supervised self-learning, this technique has finally grown into a core technology for the majority of today’s NLP/NLU systems. So, the next time you utter a sentence to Siri or Alexa, somewhere deep down in the backend there is a Semantic Model working on the answer.