Building Context-Aware Bots Using Servo

Learn how to build a context-aware bot using Servo.

We developers are highly proficient at building applications: mobile, desktop, or web. These apps usually follow the same flow: the user fills in a form (or clicks a button), the application calculates new data, and an updated screen is shown. Over the years, architectures and design patterns have evolved to support this type of application development: dependency injection, microservices, aspect-oriented programming, MVC (Model View Controller), and more.

However, these do not work well for chatbots or voice assistants, which don’t have one form to fill or specific buttons to click. The user can say anything, sometimes unrelated to the current question. A sales bot might ask the user for the size of the shoes she was looking for, and the user would respond by asking the bot if they have boots instead. A car voice assistant might be in the middle of a conversation about finding a restaurant nearby, while the user tells it to call his friend that lives in the area. The friend is not picking up, and the bot is expected to return to the same point it left off.

Of course, one could confine the conversation, but that would obviously hurt the user experience. Or, one could try to provide ad-hoc logical solutions — but that might result in a complex, large, and hard-to-maintain code base.

At its heart, this is a state management problem, and Servo is one of the few frameworks that addresses it. It does so with Behavior Trees, a programming paradigm that comes from the one software industry that has been building bots for years: the gaming industry.

In this tutorial, we will explain how to start a simple bot with Servo. I assume here that you are a developer familiar with GitHub and Node.js and, importantly, that you know how to use NLU and NLP engines (intents, entities, etc.). If you don’t, there are excellent resources all over the Internet; just search for "Wit tutorial" or "LUIS tutorial."

Getting Started

Servo is an open-source framework, so you can fork it on GitHub and follow the readme to install it and run npm start. This should start the behavior tree editor and the Servo server on your local machine, each in its own process. Then, open localhost:8000 in your browser and you’ll get a sign-in screen. After signing in, select Projects, and open your very own New Project:

Open your own New Project
New Project initial template

This tree represents a bot that deals with a nice set of issues and can serve as a simple tutorial for the framework. Click on Debugger and then on the Run ▶️ button, and you will be asked for your age. Let’s enter a few numbers and see what happens:

  1. If you enter an adult age (say, 55), the bot responds by quoting your age and sending you to vote.
  2. If you enter a number smaller than 18, the bot responds that you are too young to vote.
  3. At 32, the bot gives you a geeky remark about your age.

But what if we enter something completely different?

  1. If you go with something like “Who are you?” you’ll get an answer. Then, you’ll be directed back to the age question.
  2. For responses that are not understood, the bot gives you a help message before asking the question again.

How Does It Work?

Let’s start by looking at the central node “Age.” Select it and click on the Properties tab. You’ll see that the Type of the node is AskAndMap, and it has a unique GUID and a title. Click on the properties icon, and inside the JSON shown, you’ll see a few interesting items:

First, the prompt member:

"prompt": ["What's your age?", "How old are you?"]

It holds the questions the bot asks the user. If the cyclePrompts member is true, the bot cycles through them; otherwise, it works through them once and then keeps using the last one.
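As a rough illustration of the two behaviors, the sketch below picks the prompt for the n-th time the question is asked. The function pickPrompt and its parameters are hypothetical names invented for this example; they are not part of Servo's API.

```javascript
// Hypothetical helper, not Servo's API: choose the prompt for the
// n-th time (0-based) the question is asked.
function pickPrompt(prompts, timesAsked, cyclePrompts) {
  if (cyclePrompts) {
    // Wrap around to the first prompt after the last one.
    return prompts[timesAsked % prompts.length];
  }
  // Otherwise stay on the last prompt once the end is reached.
  return prompts[Math.min(timesAsked, prompts.length - 1)];
}

const prompts = ["What's your age?", "How old are you?"];
console.log(pickPrompt(prompts, 2, true));  // What's your age?
console.log(pickPrompt(prompts, 2, false)); // How old are you?
```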

Second, the contexts array is used for selecting the right child. Once a user responds to the prompt, the response is sent to an NLU engine (Servo comes with a default model, configured in the root properties). The NLU engine extracts intents and entities from the sentence. These are then matched against the contexts, and the best match is selected.

So, if the user responds with a number (e.g., 22 or fifteen), the first context is selected, because it expects a number entity:

"contexts": [
  { "entities": [
      { "contextFieldName": "age",
        "entityName": "number",
        "entityIndex": 0 }
  ] }
]

Then, the flow continues downwards for the rest of the conversation. We’ll delve into that in a minute.

If the user responded with something that the NLU didn’t understand, the third context is selected:

"helper": true,
"default": true

There are some subtle differences between default and helper, but we'll cover those another time.

If, on the other hand, the user says something that the NLU recognizes but that is contextually different (e.g., “who are you?”), the bot selects a context to continue with based on the intentId. As you can see, some of the contexts can be selected by any one of several intentIds.
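The selection logic described in this section can be pictured roughly as follows. This is a simplified sketch under assumed data shapes, not Servo's actual matching algorithm: the nlu result object, the intentId value "WhoAreYouIntent", and the selectContext function are all made up for illustration, and the sketch takes the first match where the real framework picks the best one.

```javascript
// Simplified sketch, not Servo's implementation. The first and last contexts
// mirror the JSON shown above; the intentId value and the shape of the
// nlu object are assumptions.
const contexts = [
  { entities: [{ contextFieldName: "age", entityName: "number", entityIndex: 0 }] },
  { intentId: ["WhoAreYouIntent"] },
  { helper: true, default: true },
];

function selectContext(contexts, nlu) {
  for (const ctx of contexts) {
    const wantsEntities = Array.isArray(ctx.entities) && ctx.entities.length > 0;
    // Every expected entity must be present at the expected index.
    const entitiesOk = wantsEntities && ctx.entities.every(
      (e) => (nlu.entities[e.entityName] || []).length > e.entityIndex
    );
    const intentOk = Array.isArray(ctx.intentId) && ctx.intentId.includes(nlu.intent);
    if (entitiesOk || intentOk) {
      // Map the expected entities into fields of the selected context.
      const fields = {};
      for (const e of ctx.entities || []) {
        fields[e.contextFieldName] = (nlu.entities[e.entityName] || [])[e.entityIndex];
      }
      return { index: contexts.indexOf(ctx), fields };
    }
  }
  // Nothing matched: fall back to the context flagged as default.
  return { index: contexts.findIndex((c) => c.default === true), fields: {} };
}

console.log(selectContext(contexts, { intent: null, entities: { number: [22] } }));
console.log(selectContext(contexts, { intent: "WhoAreYouIntent", entities: {} }));
console.log(selectContext(contexts, { intent: null, entities: {} }));
```

The first call selects the number context and maps 22 into the age field, the second selects the intent-based context, and the third falls through to the default (help) context.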

Behavior Trees

Servo didn’t invent many proprietary programming methodologies but rather chose to rely on industry standards as much as possible. One of the most successful paradigms, especially in gaming AI, has been Behavior Trees (BTs). Popular game engines such as Unity and Unreal have BT editors available, and they are very useful for constructing rule-based behaviors. Servo is built on top of the well-crafted Behavior3 editor, written in JavaScript by Renato de Pontes Pereira.

A word about AI is in order here. While deep learning receives a lot of hype (and rightfully so), it seems that for many real-world applications, rules are still needed, at least as an orchestration framework. While AI does wonders at classifying big data streams, the outcome needs to trigger some action, and that is best handled by rule-based logic that connects these classifiers to input/output channels. In that sense, Servo combines the best of both paradigms.

You could read more about Behavior Trees, but here, I’m going to quickly teach you the important stuff. Let’s take a look at the left-hand side of the tree, which is reached through the leftmost child once an age is entered:

Priority, Sequence, conditions and action nodes

Behavior Trees have a main loop, executing the current node a few times a second. A node execution could return one of three results: Success, Failure, or Currently Running. If a node has any children, it executes its children and then, based on the execution result, returns its own execution result.

A node that has children is called a Composite node, and these have only two main types. Let’s follow the execution path here and understand them.

The ? node is called a Priority node and acts like an OR selector. It tries to execute its children from left to right. If one of the children returns Success, it stops trying and returns Success as well. It’s called “priority” because it gives priority to the left-most children.

If no child succeeded, then Priority doesn’t succeed either and returns a Failure.

Here, the execution then continues downwards to the → node, called a Sequence. The Sequence is like an AND: it executes its children from left to right, expecting all of them to succeed. If one of the children fails, the whole Sequence fails and returns Failure.

So, the execution continues down to the age >= 18 node. This is a Condition, which compares the age to 18 (we’ll talk in a minute about how this comparison is made).

If the Condition succeeds, the execution continues to the "time to vote" node. This is an Action, and as the name implies, it’s where all the action happens. Select it, and you’ll see that it’s a GeneralMessage action that outputs a “you can vote” sentence to the user. We then continue to the green "good-bye" hexagon. This is a sub-tree! Double-click it and you’ll go into it.

What if the age is less than 18? You might want to take a look above before continuing reading and work it out by yourself.

If the age < 18, the Condition fails, causing the → Sequence to fail, and the Priority then goes to the next child: too young.

Easy enough, isn’t it?
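To make the walk-through above concrete, here is a minimal behavior-tree sketch of the left-hand side of the tree. The names priority, sequence, condition, and say are illustrative only and are not Servo's or Behavior3's API. The sketch also runs a single tick and does not resume a child that returned Running, which a real engine would do on the next pass of its main loop.

```javascript
// Minimal behavior-tree sketch; names are illustrative, not Servo's API.
const SUCCESS = "SUCCESS", FAILURE = "FAILURE", RUNNING = "RUNNING";

// Priority (?): OR-like. Returns the first child result that is not FAILURE.
const priority = (...children) => (memory) => {
  for (const child of children) {
    const status = child(memory);
    if (status !== FAILURE) return status;
  }
  return FAILURE;
};

// Sequence (→): AND-like. Returns the first child result that is not SUCCESS.
const sequence = (...children) => (memory) => {
  for (const child of children) {
    const status = child(memory);
    if (status !== SUCCESS) return status;
  }
  return SUCCESS;
};

// Leaves: a Condition tests memory, an Action produces output.
const condition = (test) => (memory) => (test(memory) ? SUCCESS : FAILURE);
const say = (text) => (memory) => {
  memory.output.push(text);
  return SUCCESS;
};

const tree = priority(
  sequence(
    condition((m) => m.context.age >= 18),
    say("time to vote"),
    say("good-bye")
  ),
  say("too young")
);

// Run one tick of the tree for a given age and collect the bot's output.
function run(age) {
  const memory = { context: { age }, output: [] };
  tree(memory);
  return memory.output;
}

console.log(run(55)); // [ 'time to vote', 'good-bye' ]
console.log(run(12)); // [ 'too young' ]
```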

Hierarchical Memory

Now let’s look into the details of Conditions and Actions and talk about memories. Select the age>=18 condition and open its properties:

"left": "context.age",
"operator": ">",
"right": "18"

What happens here? Actually, you can read the short help section in the Description field of the node. It reads "Compare fields across global, context, volatile, and message memories. Left and right operands should have a dot notation with the object name. Eg: message.chat_message, context.amount, etc. Operator could be any logical operator like ===, <, <==, !==, ==>, etc."

Indeed, this is a simple relational operator, whose true or false result becomes Success or Failure. So far so good. But where did the context.age expression come from?

Well, remember the contexts array in the “age?” node? That’s where it is coming from. Turns out that once a context is selected, all the entities and intents are “mapped” into the fields the context defined. We had for that context:

"contextFieldName": "age",
"entityName": "number",
"entityIndex": 0

This defines what happened: the NLU recognized a "number" entity. The system then created an age field in the context. Once there, that field is available to all of the context descendants. This is really important, not in and of itself, but because of a question that it brings in: what if there’s another context? In other words, what if another question is to follow the age question?

Luckily, the way it works is familiar to anyone who knows object-oriented programming, and especially JavaScript's prototype inheritance. If another question follows a parent question, a new child context is created. When a node then refers to some context.field, the field is searched for upwards through the parent contexts until a matching name is found or the root of the tree is reached.

This is pretty powerful because once the bot understands the age of the user, and even if it continues to talk about new topics, you can still refer to context.age, and the framework will fetch the most recently mentioned age. By the way, why the fancy name “Hierarchical Memory”? Well, this is probably how our brain identifies entities.
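The upward lookup can be modeled with plain JavaScript prototypes. This is only an analogy for how the lookup behaves; Servo's internal representation may differ, and the follow-up "color" question is a hypothetical example.

```javascript
// Illustrative only: models context lookup with JavaScript prototypes.
const ageContext = { age: 22 };                 // created by the age question
const colorContext = Object.create(ageContext); // child context of a follow-up question
colorContext.color = "red";

console.log(colorContext.color); // red (own field)
console.log(colorContext.age);   // 22 (found on the parent context)

// A field set on the child shadows the parent's field without changing it,
// so a lookup from the child returns the most recently mentioned value.
colorContext.age = 30;
console.log(colorContext.age); // 30
console.log(ageContext.age);   // 22
```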

Other types of memory include:

  • Global: whole conversation global memory
  • Message: the latest message arrived from the user
  • Volatile: memory that is never serialized into the database. This is good for in-memory complex objects.
  • Local: per-node memory
  • And also, an undocumented Fsm memory, where one can access properties of the conversation process as defined at the root of the main behavior tree.

With these also comes an important Action type called SetFieldAction. If you need to set a field in one of those memory areas, that’s the action to use.
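The dot notation used by the condition and by field-setting actions can be sketched as reads and writes against a single object that holds all memory areas. The helpers getField, setField, and compare below are hypothetical and not Servo's API, and the sketch assumes the right operand is a numeric literal, as in the "18" example above.

```javascript
// Hypothetical helpers, not Servo's API.
const memory = {
  global: {}, context: { age: 22 }, message: {}, volatile: {}, local: {},
};

// Read "area.field.sub" from the memory object.
function getField(memory, path) {
  return path
    .split(".")
    .reduce((obj, key) => (obj == null ? undefined : obj[key]), memory);
}

// Write a value at "area.field.sub", creating intermediate objects as needed.
function setField(memory, path, value) {
  const keys = path.split(".");
  const last = keys.pop();
  let target = memory;
  for (const key of keys) {
    if (target[key] == null) target[key] = {};
    target = target[key];
  }
  target[last] = value;
}

// Compare a dotted left operand with a numeric literal on the right.
const operators = {
  "===": (a, b) => a === b, "!==": (a, b) => a !== b,
  "<": (a, b) => a < b, "<=": (a, b) => a <= b,
  ">": (a, b) => a > b, ">=": (a, b) => a >= b,
};
function compare(memory, left, operator, right) {
  return operators[operator](getField(memory, left), Number(right));
}

setField(memory, "global.visits", 1);
console.log(getField(memory, "global.visits"));          // 1
console.log(compare(memory, "context.age", ">", "18"));  // true
```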

Delivering the Message

The last piece still missing in the flow is the message that goes back to the user. How does one go about constructing it? For that, take a look at the properties of the “Time to vote!” GeneralMessage node:

"prompt": ["Congrats! at <%=context.age%> you can vote",
           "At <%=context.age%> you are old and wise, you can vote!"],

I’m sure you’ve noticed the <%= %> notation. This is a well-known web development technique called templating. The <%= tells the framework to evaluate the expression against an object containing all memory areas, so we could also use <%=global.fieldName%> and others. Interestingly, we are not limited to expressions but can use code too. For example, adding <% if (context.age>=100) { %> you are one of our eldest voters <% } %> or <% if (context.age>=100) print('you are one of our eldest voters') %> would print that sentence for users aged 100 or more.

Templating is available for many actions and node types. Look at the description to see if the node asks specifically for a "dot notation" or “memory field.” If it doesn’t, then you can use templating instead.
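To show the idea behind the <%= %> substitution, here is a deliberately minimal sketch. The render function is a hypothetical stand-in, not the engine Servo uses: it only handles <%= area.field %> with a dotted path and ignores arbitrary expressions and <% code %> blocks.

```javascript
// Minimal sketch: supports only <%= area.field %> with a dotted path,
// not arbitrary expressions or <% code %> blocks.
function render(template, memory) {
  return template.replace(/<%=\s*([\w.]+)\s*%>/g, (match, path) => {
    const value = path
      .split(".")
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), memory);
    // Unknown fields render as an empty string in this sketch.
    return value === undefined ? "" : String(value);
  });
}

const memory = { context: { age: 55 }, global: {} };
console.log(render("Congrats! at <%=context.age%> you can vote", memory));
// Congrats! at 55 you can vote
```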

Debugging

As you probably saw, Servo comes equipped with a built-in debugger. If you are a developer, it’s pretty straightforward. You can set breakpoints (on leaf nodes only at the time of writing), run, step, and view the different memory areas discussed above.

Debugger stops at a breakpoint

Two important remarks:

  • The breakpoints are reached “post-tick,” that is, after the execution of the node.
  • If you change things in the tree, remember to publish! Although Servo gives you a warning, it’s easy to forget.

Thanks for reading! Drop a comment in the comments section and let me know your thoughts.

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
