Introduction Into Semantic Modeling for Natural Language Processing

In this article, I’ll give a simple introduction to the idea of Semantic Modeling for Natural Language Processing (NLP).

Semantic Modeling (or Semantic Grammar) is often compared to Linguistic Modeling (or Linguistic Grammar), so it is probably best to begin by defining both and understanding Semantic Modeling by contrast.

Linguistic vs. Semantic

Semantic and Linguistic Grammars both define a formal way in which a natural language sentence can be understood. Linguistic Grammar deals with linguistic categories like noun, verb, etc. Semantic Grammar, on the other hand, is a type of grammar whose non-terminals are not generic structural or linguistic categories like nouns or verbs, but rather semantic categories like PERSON or COMPANY.

Both Linguistic and Semantic approaches came onto the scene at about the same time, in the 1970s. Linguistic Modeling has enjoyed constant interest throughout the years (as part of the Computational Linguistics movement) and is foundational to overall NLP development.

Semantic Modeling enjoyed an initial burst of interest but quickly fizzled due to technical complexities. In recent years, however, Semantic Modeling has undergone a renaissance, and it is now the basis of almost all commercial NLP systems, such as Google, Cortana, Siri, and Alexa.

The easiest way to grasp the difference between Semantic and Linguistic Grammar is to look at the following illustration:

[Image: the same sentence parsed with Semantic Grammar (upper part) and with Linguistic Grammar (lower part)]

In the picture above, the lower and upper sentences are the same, but they are processed differently. The lower part is parsed using traditional Linguistic Grammar, where each word is tagged with a PoS (Part-of-Speech) tag like NN for nouns, JJ for adjectives, and so on. The upper part, however, is parsed using Semantic Grammar: instead of individual words being PoS tagged, one or more words form high-level semantic categories like DATE or GEO.

This, of course, is a highly simplified definition of the Linguistic approach as we are leaving aside co-reference analysis, named-entity resolution, etc.

That ability to group individual words into high-level semantic entities was introduced to help solve a key problem plaguing early NLP systems: linguistic ambiguity.

Linguistic Ambiguity

Look at the picture below:

[Image: two sentences with practically the same linguistic signature but different semantic meanings]

Even though the linguistic signatures of both sentences are practically the same, the semantic meaning is completely different. Resolving such ambiguity using Linguistic Grammar alone requires very sophisticated context analysis, if and when such context is even available, and in many cases it simply cannot be done deterministically.

Semantic Grammar, on the other hand, allows for clean resolution of such ambiguities in a simple and fully deterministic way. Using a properly constructed Semantic Grammar, the words Friday and Alexy would belong to different categories and therefore would not be confused with each other.

Astute NLP readers will notice that these words would have different “Named Entity” resolutions despite having the same PoS tags. In this particular example, that is so. However, in more complex real-life examples, named entity resolution has proved to be nowhere near as effective.
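To make the Friday/Alexy contrast concrete, here is a minimal sketch. The two sentences and both lookup tables are hand-written assumptions for illustration (the original pictures are not reproduced here), and a real system would compute the tags instead of looking them up.

```python
# Hypothetical, hand-written lookups; a real tagger would compute these.
POS_TAGS = {"sales": "NNS", "for": "IN", "friday": "NNP", "alexy": "NNP"}
SEMANTIC_CATEGORIES = {"friday": "DATE", "alexy": "PERSON"}


def pos_signature(sentence):
    """Sequence of PoS tags, one per word."""
    return [POS_TAGS[word] for word in sentence.lower().split()]


def semantic_signature(sentence):
    """Sequence of semantic categories; words without a category are skipped."""
    words = sentence.lower().split()
    return [SEMANTIC_CATEGORIES[w] for w in words if w in SEMANTIC_CATEGORIES]


print(pos_signature("Sales for Friday"))       # ['NNS', 'IN', 'NNP']
print(pos_signature("Sales for Alexy"))        # ['NNS', 'IN', 'NNP']
print(semantic_signature("Sales for Friday"))  # ['DATE']
print(semantic_signature("Sales for Alexy"))   # ['PERSON']
```

Both sentences share one PoS signature, while their semantic signatures differ, which is the distinction the section describes.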

Semantic Grammar Example

Let’s look at a simple definition of a Semantic Grammar.

Regardless of the specific configuration syntax, the grammar is typically defined as a collection of semantic entities, where each entity has, at a minimum, a name and a list of synonyms by which it can be recognized.

For example, here’s a trivial definition for WEBSITE and USER entities with their respective synonyms:

  WEBSITE:
    http website,
    https website,
    http domain,
    web address,
    online address,
    http address
  USER:
    web user,
    http user,
    https user,
    online user

Given this grammar, the following sentences:

  • Website user
  • HTTP address online user
  • Website online user

will all be resolved into the same two semantic entities, WEBSITE and USER.

A sequence of semantic entities can be further bound to a user-defined intent that determines the final action to take. A collection of such user-defined intents is what typically constitutes a full NLP pipeline.
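The resolution step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product implements it: the bare words "website" and "user" are added to the synonym lists so that the three sample sentences resolve, the greedy longest-match scan is one possible matching strategy, and the `INTENTS` table with its action name is hypothetical.

```python
import re

# Synonym lists from the grammar above. The bare words "website" and "user"
# are an assumption added here so that the three sample sentences resolve.
GRAMMAR = {
    "WEBSITE": ["website", "http website", "https website", "http domain",
                "web address", "online address", "http address"],
    "USER": ["user", "web user", "http user", "https user", "online user"],
}

# Hypothetical intent table: a sequence of entities bound to an action name.
INTENTS = {("WEBSITE", "USER"): "report_website_users"}


def build_index(grammar):
    """Map each synonym (as a tuple of lowercase words) to its entity name."""
    index = {}
    for entity, synonyms in grammar.items():
        for synonym in synonyms:
            index[tuple(synonym.lower().split())] = entity
    return index


def resolve(sentence, index):
    """Greedy longest-match scan; returns entity names, or None if a word is unmatched."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    longest = max(len(key) for key in index)
    entities, pos = [], 0
    while pos < len(words):
        for size in range(min(longest, len(words) - pos), 0, -1):
            entity = index.get(tuple(words[pos:pos + size]))
            if entity is not None:
                entities.append(entity)
                pos += size
                break
        else:
            return None  # no synonym starts here: the sentence is not understood
    return entities


index = build_index(GRAMMAR)
for sentence in ("Website user", "HTTP address online user", "Website online user"):
    entities = resolve(sentence, index)
    print(sentence, "->", entities, "->", INTENTS.get(tuple(entities)))
```

All three sentences resolve to `['WEBSITE', 'USER']` and therefore to the same intent, while a sentence containing a word outside the grammar returns `None` instead of a best guess.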

Real-life systems, of course, support much more sophisticated grammar definitions. There are many different ways to define synonyms, as there are many different types of synonyms themselves; semantic entities can have data types and can be organized in hierarchical groups to aid short-term-memory processing. All of that is unfortunately beyond the scope of this blog. You can find one example of such grammar support here.

Determinism vs. Probabilism

We emphasized the deterministic nature of the Semantic Grammar approach above. Although specific implementations of Linguistic and Semantic Grammar applications can be either deterministic or probabilistic, Semantic Grammar almost always leads to deterministic processing.

The reason for that lies in the nature of Semantic Grammar itself, which is based on simple synonym matching. A properly defined Semantic Grammar enables a fully deterministic search for the semantic entity. There is literally no “guessing”: a semantic entity is either unambiguously found or not.
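One way to read “properly defined” is that no synonym may belong to two entities. The sketch below checks that when the lookup table is built; the function names and the choice to raise an error on a clash are assumptions made for illustration.

```python
def build_strict_index(grammar):
    """Build a synonym -> entity lookup, rejecting synonyms shared by two entities."""
    index = {}
    for entity, synonyms in grammar.items():
        for synonym in synonyms:
            key = " ".join(synonym.lower().split())
            owner = index.get(key)
            if owner is not None and owner != entity:
                raise ValueError(
                    f"synonym {key!r} is claimed by both {owner} and {entity}")
            index[key] = entity
    return index


def lookup(phrase, index):
    """Exact lookup: returns the entity name or None. No scores, no guessing."""
    return index.get(" ".join(phrase.lower().split()))


index = build_strict_index({
    "WEBSITE": ["web address", "http address"],
    "USER": ["web user", "online user"],
})
print(lookup("Web Address", index))  # WEBSITE
print(lookup("web visitor", index))  # None
```

Because every synonym has exactly one owner, a lookup has only two possible outcomes, an entity or nothing, and the same input always gives the same answer.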

The resulting determinism of Semantic Grammar is a striking quality. While a probabilistic approach can work in many well-known scenarios like sentiment analysis, support chatbots, or document comprehension, it is simply unsuitable for NLP/NLU-driven business data reporting and analytics. For example, it doesn’t really matter if your Twitter feed is 85% or 86% positive, as long as it trends in the right direction. Reporting on sales numbers, on the other hand, must be correct to the penny and has to match the data from the accounting system precisely. Even a high-probability result like “your total sales for the last quarter were $100M with the probability of 97%” is worthless in that setting.

For all the benefits of Semantic Grammar, there is one clear limitation that hindered its development (at least initially): it can only be applied to a narrow data domain.

Universal vs. Domain Specific

While Linguistic Grammar is universal for all data domains (as it deals with universal linguistic constructs like verbs and nouns), Semantic Grammar, with its synonym-based matching, is limited to a specific, often very narrow, data domain. The reason is that, in order to create a Semantic Model, one needs to come up with an exhaustive set of all entities and, most daunting of all, the set of all of their synonyms.

For a specific data domain, it is a manageable task, and one that is greatly aided by sophisticated real-life systems. For general NLU, as in Artificial General Intelligence (AGI), Semantic Modeling simply won’t work.

In the last decade, there has been a lot of research into advancing Semantic Modeling with closed-loop human curation and supervised self-learning capabilities, but the fact remains that Semantic Modeling is best applied when dealing with a specific, well-defined, and well-understood data domain.

It is interesting to note that the popular Deep Learning (DL) approach to NLP/NLU almost never works sufficiently well for specific data domains. This is due to the lack of sufficiently large pre-existing training sets required for DL model training. That’s why traditional closed-loop human curation and self-learning ML algorithms prevail in Semantic Modeling systems.

Curation and Supervised Self-Learning

Human curation (or human hand-off) and supervised self-learning algorithms are two interlinked techniques that help to alleviate the problem of coming up with an exhaustive set of synonyms for semantic entities when developing a new Semantic Model.

These two work as follows. You begin by creating a Semantic Model with a basic set of synonyms for your semantic entities, which can be done fairly quickly. Once the NLP/NLU application using this model starts to operate, the user sentences that cannot be automatically “understood” by the model go to curation. During human curation, the user sentence is amended to fit the model, and the self-learning algorithm “learns” that amendment and performs it automatically next time, without the need for a human hand-off.

There are two critical properties in this process:

  • Human curation changes the user input to fit the existing Semantic Model, i.e. the user sentence is changed in such a way that it can be answered automatically. Typically, this involves fixing spelling errors, colloquialisms, and slang, removing stop words, or adding missing context.
  • That change (i.e. the curation) in the user sentence is fed into a self-learning algorithm to be “remembered” for the future. Since the change was initially performed by a human, this self-learning is a supervised process, which eliminates the introduction of cumulative learning mistakes.

What’s important in all of this is that supervision maintains the deterministic nature of Semantic Modeling as it “learns” further. Using curation and supervised self-learning, the Semantic Model learns more with every curation and can ultimately know dramatically more than it was taught at the beginning. Hence, the model can start small and grow through human interaction, a process that is not unlike many modern AI applications.
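The curation loop described above can be sketched as a small class. Everything here is an assumption made for illustration: the class and method names are hypothetical, “understanding” is reduced to an exact match against a fixed set of sentences, and the “learning” is a plain lookup table of human amendments.

```python
class CuratedModel:
    """Toy model: understands a fixed set of sentences plus learned curations."""

    def __init__(self, understood):
        self.understood = {self._norm(s) for s in understood}
        self.learned = {}          # normalized raw sentence -> curated sentence
        self.curation_queue = []   # sentences waiting for a human

    @staticmethod
    def _norm(sentence):
        return " ".join(sentence.lower().split())

    def ask(self, sentence):
        """Return the understood sentence, or None after queueing it for curation."""
        key = self._norm(sentence)
        key = self.learned.get(key, key)   # replay an earlier human amendment
        if key in self.understood:
            return key
        self.curation_queue.append(sentence)
        return None

    def curate(self, sentence, amended):
        """Record a human amendment; it is kept only if the model understands it."""
        fixed = self._norm(amended)
        if fixed not in self.understood:
            raise ValueError("amended sentence still does not fit the model")
        self.learned[self._norm(sentence)] = fixed


model = CuratedModel(["website online user"])
print(model.ask("pls show web site usr"))   # None, sentence goes to curation
model.curate("pls show web site usr", "Website online user")
print(model.ask("Pls show web site usr"))   # website online user
```

Because each learned entry is an exact, human-approved rewrite, the model stays deterministic as it grows: the same input always maps to the same curated sentence.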


Semantic Modeling has gone through several peaks and valleys in the last 50 years. With the recent advancements in real-time human curation interlinked with supervised self-learning, this technique has finally grown into a core technology for the majority of today’s NLP/NLU systems. So, the next time you utter a sentence to Siri or Alexa, remember that somewhere deep down in the backend systems there is a Semantic Model working on the answer.

Published at DZone with permission of Aaron Radzinski, DZone MVB. See the original article here.
