Behavior Trees

Servo didn’t invent many proprietary programming methodologies, but rather chose to rely on industry standards as much as possible. One of the most successful paradigms, especially in game AI, has been Behavior Trees (BTs). Major game engines, such as Unity and Unreal, have BT editors available, and BTs are very useful for constructing rule-based behaviors. Servo is built on top of the well-crafted Behavior3 editor, written in JavaScript by Renato de Pontes Pereira.

A word about AI is in order here. While deep learning receives a lot of hype (and rightfully so), many real-world applications still seem to need rules, at least as an orchestration framework. AI does wonders at classifying big data streams, but the outcome needs to trigger some action, and that is best handled by rule-based logic that connects these classifiers to input/output channels. In that sense, Servo combines the best of both paradigms.

You can read more about Behavior Trees elsewhere, but here I’m going to quickly teach you just the important stuff. Let’s take a look at the left-hand side of the tree, which is reached through the leftmost child once an age is entered:

Priority, Sequence, conditions and action nodes

A Behavior Tree has a main loop that executes the current node a few times a second. A node execution returns one of three results: Success, Failure or Running. If a node has children, it executes them and then returns its own result based on theirs.
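To make the main loop concrete, here is a minimal sketch in plain JavaScript. It is not Servo's or Behavior3's actual code: the `Status` names, `makeSlowLeaf` and `runUntilDone` are hypothetical, and a real engine would wait between ticks rather than loop immediately.

```javascript
// Hypothetical status names for the three possible results.
const Status = Object.freeze({
  SUCCESS: "SUCCESS",
  FAILURE: "FAILURE",
  RUNNING: "RUNNING",
});

// A leaf that needs `ticksNeeded` executions before it succeeds.
function makeSlowLeaf(ticksNeeded) {
  let ticks = 0;
  return {
    tick() {
      ticks += 1;
      return ticks < ticksNeeded ? Status.RUNNING : Status.SUCCESS;
    },
  };
}

// Main loop: keep executing the node until it stops reporting RUNNING.
function runUntilDone(node, maxTicks = 100) {
  const results = [];
  for (let i = 0; i < maxTicks; i += 1) {
    const status = node.tick();
    results.push(status);
    if (status !== Status.RUNNING) break;
  }
  return results;
}

const tickHistory = runUntilDone(makeSlowLeaf(3));
console.log(tickHistory); // [ 'RUNNING', 'RUNNING', 'SUCCESS' ]
```

The point of the Running result is that a node may take several ticks to finish, and the loop simply comes back to it.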

A node that has children is called a Composite node, and these have only two main types. Let’s follow the execution path here and understand them.

The ? node is called a Priority node, and acts like an OR selector. It tries to execute its children from left to right. If one of the children returns Success, it stops trying and returns a success, too. That’s why it’s called “priority”, because this gives a priority to the left-most children.

If no child succeeded, then the Priority doesn’t succeed either, and returns a Failure.

Here, the execution then continues downwards to the → node, called a Sequence. The Sequence is like an AND: it executes its children from left to right, expecting all of them to succeed. If one of the children fails, the whole Sequence fails, and returns Failure.

So, the execution continues on down to the age >= 18 node. This is a Condition that compares the age to 18 (we’ll talk in a minute about how this comparison is made).

If the Condition succeeds, the execution continues to the ‘time to vote’ node. This is an Action, and as the name implies, it’s where all the action happens. Select it, and you’ll see that it’s a GeneralMessage action, that outputs a “you can vote” sentence to the user. We then continue to the green ‘good-bye’ hexagon. This is a sub-tree! Double-click it and you’ll go into it.

What if the age is less than 18? Before reading on, you might want to look at the tree above and work it out yourself.

If the age < 18, the Condition fails, causing the → Sequence to fail, and the Priority then goes to the next child: too young.

Easy enough, isn’t it?
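The walk-through above can be sketched in a few lines of JavaScript. This is only an illustration of the Priority and Sequence semantics, not Servo's implementation: the helper names (`priority`, `sequence`, `condition`, `say`) are made up, the Running result and the ‘good-bye’ sub-tree are left out, and the text sent by the ‘too young’ node is assumed.

```javascript
const SUCCESS = "SUCCESS";
const FAILURE = "FAILURE";

// Priority (?): try children left to right, stop at the first Success.
const priority = (...children) => (memory) =>
  children.some((child) => child(memory) === SUCCESS) ? SUCCESS : FAILURE;

// Sequence (→): run children left to right, stop at the first Failure.
const sequence = (...children) => (memory) =>
  children.every((child) => child(memory) === SUCCESS) ? SUCCESS : FAILURE;

// Condition leaf: Success when the predicate holds, Failure otherwise.
const condition = (predicate) => (memory) =>
  predicate(memory) ? SUCCESS : FAILURE;

// Action leaf: stands in for GeneralMessage by recording the text it would send.
const say = (text) => (memory) => {
  memory.output.push(text);
  return SUCCESS;
};

// The left-hand side of the example tree.
const tree = priority(
  sequence(condition((m) => m.context.age >= 18), say("you can vote")),
  say("too young") // assumed message text
);

function run(age) {
  const memory = { context: { age }, output: [] };
  tree(memory);
  return memory.output;
}

console.log(run(20)); // [ 'you can vote' ]
console.log(run(17)); // [ 'too young' ]
```

Because `some` and `every` stop at the first decisive child, the ‘too young’ action is never reached when the Sequence succeeds.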

Hierarchical Memory

Now let’s look into the details of Conditions and Actions, and talk about memories. Select the age>=18 condition and open its properties:

"left": "context.age",
"operator": ">",
"right": "18"

What happens here? You can read a short help section in the Description field of the node. It reads: “Compare fields across global, context, volatile and message memories. left and right operands should have a dot notation with the object name. Eg: message.chat_message, context.amount etc. Operator could be any logical operator like ===, <, <==, !==, ==> etc.”
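The comparison that help text describes might look roughly like the sketch below. Everything here is an assumption made for illustration: `resolveOperand`, `compareFields` and the `OPERATORS` table are hypothetical names, only a handful of operators are covered, and the way numeric strings such as "18" are converted is a guess rather than Servo's documented behavior.

```javascript
// Read "area.field" (e.g. "context.age") from an object holding the memory areas.
// Anything that is not a dotted path into a known area is treated as a literal.
function resolveOperand(operand, memories) {
  const [area, ...path] = String(operand).split(".");
  const known = Object.prototype.hasOwnProperty.call(memories, area);
  if (path.length === 0 || !known) return operand;
  return path.reduce(
    (value, key) => (value == null ? undefined : value[key]),
    memories[area]
  );
}

// Only a few operators are sketched here.
const OPERATORS = {
  "===": (a, b) => a === b,
  "!==": (a, b) => a !== b,
  "<": (a, b) => a < b,
  "<=": (a, b) => a <= b,
  ">": (a, b) => a > b,
  ">=": (a, b) => a >= b,
};

// Properties arrive as strings, so numeric-looking strings are converted first.
function toValue(v) {
  const numeric = typeof v === "string" && v.trim() !== "" && !Number.isNaN(Number(v));
  return numeric ? Number(v) : v;
}

function compareFields({ left, operator, right }, memories) {
  const compare = OPERATORS[operator];
  if (!compare) throw new Error(`unsupported operator: ${operator}`);
  const a = toValue(resolveOperand(left, memories));
  const b = toValue(resolveOperand(right, memories));
  return compare(a, b) ? "SUCCESS" : "FAILURE";
}

const properties = { left: "context.age", operator: ">", right: "18" };
const adult = { context: { age: 21 }, global: {}, volatile: {}, message: {} };
const minor = { context: { age: 16 }, global: {}, volatile: {}, message: {} };

console.log(compareFields(properties, adult)); // SUCCESS
console.log(compareFields(properties, minor)); // FAILURE
```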

Indeed, this is a simple relational operator, returning true or false for Success or Failure. So far so good. But where did the context.age expression come from?

Well, remember the contexts array in the “age?” node? That’s where it is coming from. Turns out that once a context is selected, all the entities and intents are “mapped” into the fields the context defined. We had, for that context:

"contextFieldName": "age",
"entityName": "number",
"entityIndex": 0

This defines what happened: the NLU recognized a ‘number’ entity, and the system then created an age field in the context. Once there, that field is available to all of the context’s descendants. This is really important, not in and of itself, but because of a question it raises: what if there’s another context? In other words, what if another question follows the age question?

Luckily, the way it works will be familiar to anyone who knows object-oriented programming, and especially JavaScript’s prototype inheritance. If another question follows a parent question, a new child context is created. When a node then refers to some context.field, the field is searched for upwards through the parent contexts, until a field with that name is found or the root of the tree is reached.

This is pretty powerful: once the bot has understood the age of your user, even if the conversation has moved on to new topics, you can still refer to context.age and the framework will fetch the most recently discussed age. By the way, why the fancy name “Hierarchical Memory”? Well, this is probably how our brain identifies entities.
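Since the lookup is compared to JavaScript inheritance, the prototype chain itself makes a handy model. The sketch below is an analogy, not Servo's actual data structure: `mapEntities` and `createChildContext` are hypothetical helpers, the `{ entityName: [values] }` shape of the NLU result is assumed, and the follow-up "city" question is invented for the example.

```javascript
// Copy recognized entities into the context fields named by the mapping.
function mapEntities(context, entityMappings, entities) {
  for (const { contextFieldName, entityName, entityIndex } of entityMappings) {
    const values = entities[entityName] || [];
    if (entityIndex < values.length) {
      context[contextFieldName] = values[entityIndex];
    }
  }
  return context;
}

// A child context inherits from its parent through the prototype chain, so a
// missing field is looked up in the parent, then the grandparent, and so on.
const createChildContext = (parent) => Object.create(parent);

// First question: the age, mapped from the first 'number' entity.
const ageContext = mapEntities(
  createChildContext(null), // root context, nothing above it
  [{ contextFieldName: "age", entityName: "number", entityIndex: 0 }],
  { number: [42] }
);

// Hypothetical follow-up question that only defines a 'city' field.
const cityContext = mapEntities(
  createChildContext(ageContext),
  [{ contextFieldName: "city", entityName: "location", entityIndex: 0 }],
  { location: ["Paris"] }
);

console.log(cityContext.city); // Paris
console.log(cityContext.age);  // 42, found one level up
console.log(ageContext.city);  // undefined, parents do not see child fields

// A newer age mentioned deeper in the conversation shadows the older one.
const laterContext = createChildContext(cityContext);
laterContext.age = 30;
console.log(laterContext.age); // 30
console.log(ageContext.age);   // 42
```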

Other types of memory include:

Global: whole conversation global memory

Message: the latest message arrived from the user

Volatile: memory that is never serialized into the database. This is good for in-memory complex objects.

Local: per-node memory

And also, an undocumented Fsm memory, where one can access properties of the conversation process, as defined at the root of the main behavior tree.

Along with these comes an important Action type called SetFieldAction. If you need to set a field in one of those memory areas, that’s the node to use.
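As a rough picture of what setting a field in a memory area involves, here is a small sketch. It does not show SetFieldAction's real properties, which are not covered here: `memoryAreas` and `setField` are hypothetical, and the dotted field names simply follow the notation used by the Condition node.

```javascript
// Hypothetical memory areas, named after the list above.
const memoryAreas = { global: {}, context: {}, message: {}, volatile: {}, local: {} };

// Write `value` at a dotted path such as "global.user.name",
// creating intermediate objects on the way.
function setField(areas, fieldName, value) {
  const [area, ...path] = fieldName.split(".");
  const known = Object.prototype.hasOwnProperty.call(areas, area);
  if (!known || path.length === 0) {
    throw new Error(`unknown memory field: ${fieldName}`);
  }
  let target = areas[area];
  for (const key of path.slice(0, -1)) {
    if (typeof target[key] !== "object" || target[key] === null) {
      target[key] = {};
    }
    target = target[key];
  }
  target[path[path.length - 1]] = value;
}

setField(memoryAreas, "context.age", 21);
setField(memoryAreas, "global.user.name", "Dana");

console.log(memoryAreas.context.age);      // 21
console.log(memoryAreas.global.user.name); // Dana
```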

Delivering the Message

The last piece still missing in the flow is the message that goes back to the user. How does one construct it? For that, take a look at the properties of the “time to vote!” GeneralMessage:

"prompt": [
  "Congrats! at <%=context.age%> you can vote",
  "At <%=context.age%> you are old and wise, you can vote!"
],

I’m sure you’ve noticed the <%= %> notation. This is a well-known web development technique called “templating”. The <%= tells the framework to evaluate the expression against an object containing all memory areas, so we could also use <%=global.fieldName%> and others. Interestingly, we are not limited to expressions, but can use code, too. For example, adding <% if (context.age>=100) { %> you are one of our eldest voters <% } %> or <% if (context.age>=100) print('you are one of our eldest voters') %> would print that sentence for users aged 100 or more.

Templating is available for many actions and node types. Look at the node’s description to see if it asks specifically for “dot notation” or a “memory field”. If it doesn’t, you can use templating instead.
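To see what evaluating <%= %> against the memory areas amounts to, here is a deliberately tiny sketch. It is not the engine Servo uses (which is not named here); the syntax resembles that of template libraries such as Lodash's _.template or EJS. The hypothetical `renderTemplate` handles only <%= expression %>, not <% code %> blocks, and because it evaluates arbitrary JavaScript it should only ever see trusted templates.

```javascript
// Replace each <%= expression %> with the expression's value. The memory areas
// are passed to the expression as named parameters (context, global, ...).
function renderTemplate(template, memories) {
  const names = Object.keys(memories);
  const values = names.map((name) => memories[name]);
  return template.replace(/<%=([\s\S]+?)%>/g, (_match, expression) => {
    const evaluate = new Function(...names, `return (${expression});`);
    return String(evaluate(...values));
  });
}

const memories = {
  context: { age: 21 },
  global: { botName: "Servo" }, // hypothetical field
};

const rendered = renderTemplate(
  "Congrats! at <%=context.age%> you can vote",
  memories
);
console.log(rendered); // Congrats! at 21 you can vote

const greeting = renderTemplate("<%= global.botName %> says hi", memories);
console.log(greeting); // Servo says hi
```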

Debugging

As you have probably seen, Servo comes equipped with a built-in debugger. If you are a developer, it’s pretty straightforward: you can set breakpoints (on leaf nodes only at the time of writing), run, step, and view the different memory areas discussed above.

Debugger stops at a breakpoint

Two important remarks: