Using Lang.ai for Unsupervised Intent Induction

Our technology enables the automatic analysis of your chatbot data. In a previous post, we discussed the advantages our unsupervised technology brings to chatbot development. For instance, continuing with the customer service chatbot example from that post, it lets you analyze and explore the different intents expressed by your clients, identify them in new user utterances, and resolve them appropriately. Let’s see how it works.

First, we aim to detect the intents in user utterances, as well as their related aspects (e.g., objects or entities): “Who did what to whom” (and perhaps also “when and where”). Such structural knowledge (i.e., intents plus their related aspects), usually referred to as frames, is an explicit prerequisite for many NLP tasks, Natural Language Understanding among them.

Frames have usually been constructed manually by domain experts or linguists, which constrains their application to a small number of specific, well-defined domains. Imagine trying to manually construct these frames for a dialogue system (e.g., Siri) so that they cover all potential conversations: it would be crazy, wouldn’t it? Therefore, the identification of such frames has to be automated. This identification has mainly been tackled along two different lines (Jauhar and Hovy, 2017):

Modeling the frame structure by means of the selectional preference of predicates for certain arguments (Seaghdha, 2010); e.g., giving a high probability to the word “pasta” occurring as an argument of the word “eat”.

Inducing the frames by clustering predicates and arguments in a joint framework (Lang and Lapata, 2011; Titov and Klementiev, 2012), which associates predicates such as “eat”, “consume”, and “devour” with a joint clustering of arguments such as “pasta”, “chicken”, and “burger”.
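To make the first line of work more concrete, here is a minimal sketch of selectional preference estimated by plain relative frequency. It is only an illustration: the cited work relies on richer probabilistic models, and the (predicate, argument) pairs and the function name `selectional_preferences` are assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical (predicate, argument) pairs, e.g. as a parser might produce them.
pairs = [
    ("eat", "pasta"), ("eat", "pasta"), ("eat", "chicken"),
    ("eat", "burger"), ("book", "flight"), ("book", "table"),
]

def selectional_preferences(pairs):
    """Estimate P(argument | predicate) by relative frequency."""
    counts = defaultdict(Counter)
    for predicate, argument in pairs:
        counts[predicate][argument] += 1
    return {
        predicate: {arg: n / sum(args.values()) for arg, n in args.items()}
        for predicate, args in counts.items()
    }

prefs = selectional_preferences(pairs)
print(prefs["eat"]["pasta"])            # share of "eat" arguments that are "pasta"
print(prefs["eat"].get("flight", 0.0))  # unseen argument, so no preference
```

With this toy data, “pasta” gets the highest probability as an argument of “eat”, while “flight” gets none, which is the kind of preference the first approach tries to capture.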

In the particular field of chatbots, this induction is also key to analyzing user requests and managing the conversation. However, the way users speak to conversational interfaces makes this step slightly different from traditional natural language discourse analysis. For instance, in a dialogue system users directly request specific actions (“I want to book a flight”, “turn off the lights”, “call my mom”), usually with a single intent per utterance; when there are more, they tend to be joined by simple coordinators (and, but, or, …). In addition, when users realize they are talking to a non-human interface, they tend to simplify the message and use more direct requests, avoiding complex conversations.
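As a rough illustration of the point about coordinators, the sketch below splits an utterance into candidate requests wherever a simple coordinator appears. This is a naive assumption for the sake of the example, not a description of our pipeline: it would also split noun phrases such as “salt and pepper”, and the names `COORDINATORS` and `split_requests` are hypothetical.

```python
import re

# Simple coordinators that typically join several requests in one utterance.
COORDINATORS = ("and", "but", "or")
_SPLIT = re.compile(
    r"\s*,?\s+\b(?:" + "|".join(COORDINATORS) + r")\b\s+",
    re.IGNORECASE,
)

def split_requests(utterance):
    """Naively split an utterance into candidate single-intent requests."""
    parts = _SPLIT.split(utterance.strip())
    # Drop surrounding punctuation and any empty fragments.
    return [p.strip(" .!?") for p in parts if p.strip(" .!?")]

print(split_requests("Turn off the lights and call my mom"))
print(split_requests("I have a problem with my WiFi"))
```

The first call yields two candidate requests, while the second utterance contains no coordinator and is returned as a single request.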

Quote from Shakespeare’s Hamlet: “Neither a borrower nor a lender be; For loan oft loses both itself and friend, and borrowing dulls the edge of husbandry.”

Typical chatbot request: “I have a problem with my WiFi”

In this sense, we relax the frame definition and base our representation on action-driven intents, which seem better suited to understanding user utterances. We thus focus on two main roles: the verb (what the user is requesting) and the object (the target of the user request):

I want to book [verb] a flight [object]
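The verb-object representation can be sketched as a small data structure plus a toy extractor. Everything here is a simplifying assumption: the word lists `ACTION_VERBS` and `SKIP_WORDS`, the `Intent` class, and the `extract_intent` function are hypothetical, and a real system would rely on a proper tagger or parser rather than hand-written lexicons.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mini-lexicons, for illustration only.
ACTION_VERBS = {"book", "call", "turn", "cancel", "order"}
SKIP_WORDS = {"a", "an", "the", "my", "your", "off", "on", "up"}

@dataclass(frozen=True)
class Intent:
    verb: str  # what the user is requesting
    obj: str   # the target of the request

def extract_intent(utterance: str) -> Optional[Intent]:
    """Return the first (verb, object) pair found, or None if there is none."""
    tokens = utterance.lower().strip(" .!?").split()
    for i, token in enumerate(tokens):
        if token in ACTION_VERBS:
            # Take the first following word that is not a determiner or particle.
            rest = [t for t in tokens[i + 1:] if t not in SKIP_WORDS]
            if rest:
                return Intent(verb=token, obj=rest[0])
    return None

print(extract_intent("I want to book a flight"))
print(extract_intent("Turn off the lights"))
```

For “I want to book a flight”, the sketch returns the pair (book, flight), matching the annotation above.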

We base this process on a preliminary shallow-parsing step that detects a set of initial verb-object candidates, which are then modeled by means of the words appearing together with them. This follows the Distributional Hypothesis, according to which words that occur in the same contexts tend to have similar meanings.

“You shall know a word by the company it keeps” (Firth, 1957)
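A minimal sketch of the Distributional Hypothesis, assuming simple co-occurrence counts within a fixed window and cosine similarity: the toy corpus, the window size, and the function names `context_vectors` and `cosine` are assumptions for illustration, not our actual model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus of user requests.
corpus = [
    "i want to book a flight",
    "i want to reserve a flight",
    "please book a table",
    "please reserve a table",
    "turn off the lights",
]

def context_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

vectors = context_vectors(corpus)
print(cosine(vectors["book"], vectors["reserve"]))  # shared contexts
print(cosine(vectors["book"], vectors["lights"]))   # no shared contexts
```

In this toy corpus, “book” and “reserve” appear in the same contexts and therefore get a similarity close to 1, whereas “book” and “lights” share no context words and get a similarity of 0.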

For instance, given the user request: I want a reservation in “Fancy Restaurant” tomorrow at night, this would be its dependency parse (i.e., an analysis of the grammatical structure of the sentence):