I have mentioned in many of my articles and videos that I recommend using a context lifespan of 1 in all your intents. I have collected all the related ideas into this single article, and also updated it with my current view as of June 2020. This article is based on a few different articles I had written previously on the topic of context lifespan.
Optimal context lifespan in Dialogflow
Note: this is a somewhat advanced topic, and certainly a bit opinionated. I wouldn’t recommend beginners get into this article until they have built at least a toy bot and experienced all the features in Dialogflow.
If you are building bots using Dialogflow, you are probably aware of contexts. They are used to maintain state. For the specific purpose of state management, I find their implementation quite fascinating simply because they have this concept of “lifespan”.
Lifespan of a context
What is the lifespan of a context, you ask? It is the number of “steps” for which a context is alive. As your user interacts with your bot, the remaining lifespan ticks down by 1 per interaction until it hits zero, at which point the context becomes inactive.
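To make the countdown concrete, here is a minimal Python sketch. It is a toy model of the description above, not Dialogflow's actual implementation; the function name `tick`, the dictionary representation, and the context name `awaiting_quantity` are assumptions made up for illustration.

```python
def tick(active_contexts):
    """Simulate one completed interaction: every remaining lifespan drops
    by 1, and a context that reaches zero is dropped (it is now inactive)."""
    return {name: life - 1 for name, life in active_contexts.items() if life > 1}

# Hypothetical context, set with the default lifespan of 5.
contexts = {"awaiting_quantity": 5}
history = []
for _ in range(5):
    contexts = tick(contexts)
    history.append(contexts)

print(history)
# [{'awaiting_quantity': 4}, {'awaiting_quantity': 3}, {'awaiting_quantity': 2}, {'awaiting_quantity': 1}, {}]
```

In this model the context is available for five user messages and is gone once the fifth interaction completes.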
While the default value is set at 5, I suggest you immediately change it to the optimal value.
The optimal value
And the optimal value, in my view, is 1.
But let us first start with the problem of leaving the context lifespan at 5.
Conversations tend to meander
What I mean is, the bot asks the user a very pointed question, such as “OK, so how many red roses would you like to buy?”
To which the user says, “I am not sure. How much per rose?” Now, if you didn’t expect the user to ask this question, you have a ticking lifespan clock, and you had better get the answer you want out of the user before the remaining lifespan goes to zero.
I don’t suggest designing bots with a strange constraint of trying to get the user to the right answer within a given number of tries. The user is not playing a game of hangman with your chatbot, and the conversation will turn weird very fast if you enforce such constraints.
The machine learning is good, but not perfect
In other words, sometimes the user does give you a variant of the expected answer, but it is not recognized as a defined intent. I would propose this is actually a worse outcome than when the user actually typed in something unrelated. Why? Simply because whatever you are going to say to “recover” from this error, is very likely to further confuse the user who had typed in a perfectly reasonable answer already.
Don’t forget the first step in voice to text conversion
This is related to my previous point, of course. Remember, there is a small probability that the user’s words were incorrectly transcribed into text. In that case, you run into the same issue if you suggest that the user may have been on the wrong track while keeping a lifespan clock ticking.
Pre-existing domains can add unexpected complexity
I talked a little about this in my article on building a cricket stats chatbot with API.AI. The issue here is that there are pre-existing domains for which Dialogflow already populates entities. A good example is the name Steve Smith, who is a popular cricket player with a very common last name. Dialogflow identifies Smith as a common name, but doesn’t do a good job of mapping it to a user-defined entity called “Steven Smith”. It seems to think they are separate: it extracts Smith as a predefined entity and leaves the word Steve hanging in the sentence, basically mapped to nothing.
Maybe you argue it is doing the right thing. I don’t think so, but even if it were, this is the kind of silent failure which is sure to mess up your context lifespan.
The implicit state diagram stays deterministic
Using contexts, it is possible to translate any state-diagram-based conversation flow into a chatbot. However, the state diagram becomes much harder to reason about if you use lifespans greater than 1, because now your diagram could effectively be in two different states at once. Again, don’t make your chatbot any harder to reason about than it already is.
Wait. So have I not made the case for higher lifespans here, given that they provide a better chance of getting back into the conversation?
This is exactly what a reader asked me recently.
Doesn’t a larger context lifespan help when conversation goes off track?
I got this question on my YouTube channel:
This is a really good point, and yes, the ability to “come back on track” is the reason that the Dialogflow team has chosen a default value greater than 1. In other words, if the conversation accidentally goes “off track”, doesn’t a higher context lifespan help?
My opinion, based on having helped many clients build bots which work in a predictable manner, is that this benefit isn’t worth the cost.
An example using followup intents
You can take a look at this video to see an example of the problem caused by context lifespan of 2, which is the default in followup intents.
How high should the context lifespan be?
Suppose you think the context lifespan should be greater than 1. The next question is: how high should it be?
2? 5? 50?
The trouble is, the higher your lifespan is, the more candidate intents you will have at every step in the conversation. This can lead to unpredictable behavior (an example of which you saw in the video above).
How many steps will you allow the user to be able to recover and get back on track?
Suppose you want to bring the conversation back on track. How many steps are you going to allow for the user?
In the question that the reader has asked, for example, if the user says “umm” or adds a typo, unless they can correct themselves in the next message itself, the lifespan of 2 isn’t going to be of much help either.
Well, I just want to handle the common case….
Now you might say:
“Well, I just want to handle the common case. So surely a lifespan of 5 (default value) would be fine?”
The problem is, this extra lifespan will keep your intent which already fired as a candidate for 4 more steps in the conversation. Which means, the more complex your conversation, the more you need to account for the earlier intents which fired, making it a much harder chatbot to design.
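The accumulation of candidates can be sketched with a small simulation. This is a simplified model under stated assumptions: the four-intent linear flow, the intent names, and the context names c1 to c3 are hypothetical, and intents without an input context (which are always candidates) are left out to keep the output short.

```python
# Hypothetical linear flow: (intent, input context, output context).
FLOW = [
    ("ask_flower", None, "c1"),
    ("ask_quantity", "c1", "c2"),
    ("ask_address", "c2", "c3"),
    ("confirm_order", "c3", None),
]

def candidates_per_turn(lifespan):
    """Walk the happy path and record which context-guarded intents
    are candidates when each user message arrives."""
    active = {}  # context name -> remaining lifespan
    trace = []
    for _intent, _needs, sets in FLOW:
        trace.append(sorted(i for i, needs, _ in FLOW if needs in active))
        # The turn completes: lifespans tick down and expired contexts drop out.
        active = {c: n - 1 for c, n in active.items() if n > 1}
        if sets:
            active[sets] = lifespan
    return trace

print(candidates_per_turn(1))
# [[], ['ask_quantity'], ['ask_address'], ['confirm_order']]
print(candidates_per_turn(5))
# [[], ['ask_quantity'], ['ask_address', 'ask_quantity'], ['ask_address', 'ask_quantity', 'confirm_order']]
```

With a lifespan of 1 there is exactly one context-guarded candidate per turn. With a lifespan of 5, every intent that already fired is still in the running, so the list keeps growing as the conversation gets longer.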
OK, so how about giving the user just one more chance? That is definitely reasonable?
Yes, it is, and you should do that. But you don’t need to set the lifespan to 2 to be able to do that. There is a better way.
Design a context-based fallback intent, and give some hints to the user so they are better able to correct themselves. Generally, you will find that this in fact gives you much more predictability in your overall conversation design.
Learning from mistakes
Unless you can build a perfect chatbot, or you have perfectly reasonable users, you will notice that when you first roll out your bot, people will say things to it which you are not yet handling properly. My recommendation, in that case, is to retry once, and then exit the conversation (gracefully, of course) and say “We will get better next time. Please try later” or something to that effect.
When you follow this pattern (that is, context lifespan = 1 and a single retry, and then exiting with an appropriate “unsuccessful” message to the user), you will actually be able to narrow down on the issue and resolve it pretty fast.
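One way to picture this pattern is as a tiny state table. The sketch below is only an illustration: the context names (`awaiting_quantity`, `awaiting_quantity_retry`, `awaiting_address`) and the reply texts other than the exit message are hypothetical, and each context is assumed to be set with a lifespan of 1 so that exactly one of them is active at a time.

```python
# Key: (active context, did the expected intent match?)
# Value: (bot reply, context to set next, or None to end the flow)
TRANSITIONS = {
    ("awaiting_quantity", True): ("Got it.", "awaiting_address"),
    ("awaiting_quantity", False): (
        "Sorry, how many roses? Please type a number, like 12.",
        "awaiting_quantity_retry",
    ),
    ("awaiting_quantity_retry", True): ("Got it.", "awaiting_address"),
    ("awaiting_quantity_retry", False): (
        "We will get better next time. Please try later.",
        None,
    ),
}

def step(context, matched):
    """Return the reply and the single context that is active next."""
    return TRANSITIONS[(context, matched)]

def run(answers_matched, context="awaiting_quantity"):
    """Feed match results until the question is answered or abandoned."""
    replies = []
    for matched in answers_matched:
        reply, context = step(context, matched)
        replies.append(reply)
        if context in (None, "awaiting_address"):
            break
    return replies, context

print(run([False, True])[1])   # awaiting_address
print(run([False, False])[1])  # None
```

The two `False` rows correspond to context-based fallback intents. Because only one context is ever active, a failed conversation points directly at the step where the user got stuck.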
In contrast, when you have higher context lifespans two things will happen:
1 Intents that shouldn’t fire will sometimes fire and confuse the user into providing more unexpected responses.
2 You will carry the burden of managing all the (unnecessarily) active contexts to understand why the wrong intent fired, which makes it harder to diagnose the specific issue the user is facing and slows down your bot training workflow.
Identifying intents which are candidates for selection
So let us take a closer look at how to identify intents which are candidates for selection.
Update Nov 2020: I have created a free tool which helps you identify candidate intents. You can download it here.
Suppose your team decides you are spending way too much time clicking into Dialogflow intents to find what contexts are declared inside of them. So you decide to do the following: you add an intent number at the beginning. Then you also add the context names inside brackets (parentheses) immediately after, and then you write the intent name. You will use empty brackets to indicate that there is no input context. This allows you to see the full picture at a glance. Please don’t actually use this convention. 🙂 It is meant to be a funny illustration.
For example, a very simple example would look like this:
In the agent above, we have two contexts (c1 and c2) and four intents.
How contexts affect the intent
In case you were not 100% clear, an intent can fire only if all its input contexts are active. (By fire, I mean it will be selected as the intent to handle a given user input).
For example, unless both c1 and c2 are active, intent 4 will not fire.
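The rule can be written as a subset check. In this sketch the mapping of intent numbers to input contexts is inferred from the summary table later in the article; the names `INTENTS` and `can_fire` are made up for illustration.

```python
# Intent number -> the input contexts that intent declares.
INTENTS = {1: set(), 2: {"c1"}, 3: {"c2"}, 4: {"c1", "c2"}}

def can_fire(intent_number, active_contexts):
    """An intent is eligible only when all of its input contexts are active."""
    return INTENTS[intent_number] <= set(active_contexts)

print(can_fire(4, {"c1"}))        # False: c2 is missing
print(can_fire(4, {"c1", "c2"}))  # True
```

Being eligible only makes an intent a candidate. Which candidate actually handles the message still depends on how well the user's words match its training phrases.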
What is an active context?
Another important consideration is what exactly we mean when we say a ‘context is active’. Here is my working definition: a context is active if it was set as the output of a previous intent (let us ignore REST API based activations), and the lifespan has not become zero by the time the user types their message.
For example, suppose an intent which just fired set the context c1 with a lifespan of 1. Now the user types a message, and intent 2 gets fired. If intent 2 does not set c1 again as an output context, the lifespan of c1 has now become 0 (because one turn of request-response was completed). At this point, context c1 is not active.
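Here is the same example as a short sketch, based on the working definition above rather than on Dialogflow's internals. The helper name `complete_turn` is hypothetical.

```python
def complete_turn(active, output_contexts=None):
    """Finish one request-response turn: age every active context by one,
    drop the expired ones, then apply the fired intent's output contexts."""
    aged = {name: life - 1 for name, life in active.items() if life > 1}
    aged.update(output_contexts or {})
    return aged

active = {"c1": 1}  # set by the previous intent with a lifespan of 1

# Intent 2 fires and does not set c1 as an output context.
print(complete_turn(active))             # {}

# Intent 2 fires and sets c1 again with a lifespan of 1.
print(complete_turn(active, {"c1": 1}))  # {'c1': 1}
```

In the first case c1 is no longer active when the next message arrives; in the second, the fired intent has explicitly kept it alive for one more turn.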
Once again, if this wasn’t super clear, watch the video I mentioned before the beginning of this section.
Based on what we have seen till now, you can make the following statements:
The context powerset determines the candidate selection list
Are you familiar with the notion of the powerset from mathematics? If you have a set S, the power-set of S is the set of all subsets, including the set S as well as the empty set.
To give an example, suppose context c1 and c2 are both active. Now the context powerset is:
[(), (c1), (c2), (c1, c2)]
In other words, an intent which has any of these in the prefix of its name is a candidate for selection.
As you can see, if both contexts c1 and c2 are active, this means all the intents are now candidates for selection. If three contexts are active, say c1, c2 and c3, then the number of subsets is “2 to the power of 3” = 8. In that case, these are the context combinations that will trigger an intent: [(), (c1), (c2), (c3), (c1,c2), (c2,c3), (c1,c3), (c1,c2,c3)]. Generally speaking, if N contexts are active, you will have 2^N such combinations.
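The powerset is easy to compute with Python's standard `itertools` module. The function name `context_powerset` is just a label chosen for this sketch.

```python
from itertools import chain, combinations

def context_powerset(active_contexts):
    """Return every subset of the active contexts, smallest first."""
    items = sorted(active_contexts)
    return list(
        chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    )

print(context_powerset({"c1", "c2"}))
# [(), ('c1',), ('c2',), ('c1', 'c2')]
print(len(context_powerset({"c1", "c2", "c3"})))
# 8
```

Each extra active context doubles the number of input-context combinations that can enable an intent, which is why keeping the set of active contexts small pays off so quickly.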
An intent with empty context is always a selection candidate
Another thing to notice is that since the empty set is always a subset of every other set, including itself, an intent with empty context is always a selection candidate.
It is quite obvious that intent 1 is a selection candidate if no contexts are active. But what people sometimes miss is that even if the context c1 is active, intent 1 is still a selection candidate.
The intent blackhole
If you have been following along, you might see the problem with an intent which is defined like this:
So what we have here is an intent (number 5) with no input context, and exactly one userSays phrase which has nothing except the wildcard entity. Creating such an intent is not a good idea.
What you have created is basically a perennial selection candidate. Because remember, even if a context has been set, since intent 5 has no input context, it is always a candidate for selection. And since it has nothing more than a wildcard entity, it will accept any and all input.
Dialogflow is able to give you answers for slight variants of a user’s expected input because it gives the user some “margin” to go wrong. Margin is not a standard or official term, but I have quoted it because it is an important concept. Usually, the margin is based on the threshold you set in the agent’s settings. If you set the ML threshold very high (say 0.99), almost nothing will match except exact phrases matching what is already in the userSays. Usually, though, we want lower ML thresholds because we want to allow the user a higher margin of error. After all, that is what makes the chatbot look very smart: being able to “pick up” on what the user said even if they don’t exactly match what is already defined in the intent.
When you create this wildcard-based intent, every time the user’s message misses its “margin of error” with respect to the other intents, it will invoke this intent. Because Dialogflow thinks, “Heck, why not?” It is like an intent blackhole which consumes everything which comes its way 🙂
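The effect can be illustrated with a deliberately simplified toy model. This is not how Dialogflow scores intents internally; the confidence numbers, intent names, and the function `pick_intent` are all hypothetical, and the only behavior modeled is the one described above: anything below the threshold falls into the wildcard intent if one exists.

```python
def pick_intent(scores, threshold, has_wildcard_intent):
    """scores: made-up match confidence (0 to 1) for each regular intent."""
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best
    # Nothing cleared the threshold: the wildcard intent swallows the input
    # instead of letting the fallback intent handle it.
    return "wildcard_intent_5" if has_wildcard_intent else "fallback"

scores = {"order_roses": 0.62, "ask_price": 0.41}
print(pick_intent(scores, threshold=0.5, has_wildcard_intent=True))   # order_roses
print(pick_intent(scores, threshold=0.7, has_wildcard_intent=True))   # wildcard_intent_5
print(pick_intent(scores, threshold=0.7, has_wildcard_intent=False))  # fallback
```

Without the wildcard intent, a near miss reaches the fallback intent, where you can re-prompt the user. With it, the near miss is silently consumed.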
Here is a quick summary of which active contexts will enable which intent as a selection candidate. (I am ignoring intent 5 for this).
Set of active input contexts (enables) -> Set of Intents (the Selection candidates)
() -> (1)
(c1) -> (1, 2)
(c2) -> (1, 3)
(c1,c2) -> (1, 2, 3, 4)
The context lifespan is like a “hidden feature” in Dialogflow
The context lifespan you set for an output context is like a “hidden” feature. I mention this because many people who come to me for help with fixing their Dialogflow chatbot don’t use a lifespan of 1. To make things worse, quite often they don’t see the connection between using the default lifespan and the unpredictable behavior of their chatbot.
In this article, I will point out some benefits of rigorously following a context lifespan of 1.
Why use a context lifespan of 1
So if you have been following along till now, you will see that a context lifespan of 1 provides many benefits as you are designing your chatbot.
1 Understand your chatbot’s behavior completely
When you understand your chatbot’s candidate, target and surplus intents at each step in the conversation, you are going to have a much better understanding of your chatbot, period.
2 Build more complex dialogs
Sure, you cannot yet build a chatbot that can actually talk like a human (unless you wish to end up with a Tay).
But you CAN build fairly complex dialogs if you are able to better guide the conversation along a given path.
3 Better fallback handling
4 Better input validation
Closely tied to the previous point, you can validate the input on your backend and do a better job of guiding the conversation if you strictly use a context lifespan of 1.
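As a rough sketch of backend validation, the function below builds a webhook response that either moves the conversation forward or re-prompts, always with a lifespan of 1. The field names (`session`, `queryResult.parameters`, `fulfillmentText`, `outputContexts`, `lifespanCount`) follow the Dialogflow ES webhook JSON format as I understand it, so check them against the official reference. The parameter name `quantity`, the context names, and the 1 to 100 range are assumptions invented for this example.

```python
def validate_quantity(request_json):
    """Build a webhook response that either advances the conversation
    or re-prompts, always with an output context lifespan of 1."""
    session = request_json["session"]
    raw = request_json["queryResult"]["parameters"].get("quantity")
    # Hypothetical rule: accept whole numbers from 1 to 100.
    valid = isinstance(raw, (int, float)) and 1 <= raw <= 100 and raw == int(raw)
    if valid:
        text = f"OK, {int(raw)} roses. Where should we deliver them?"
        next_context = "awaiting_address"
    else:
        text = "Please enter a whole number between 1 and 100."
        next_context = "awaiting_quantity"  # same question, one more turn
    return {
        "fulfillmentText": text,
        "outputContexts": [
            {"name": f"{session}/contexts/{next_context}", "lifespanCount": 1}
        ],
    }
```

Because the response sets exactly one context with a lifespan of 1, the backend decides which single question the bot is asking next, regardless of what the intent's static output contexts were.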
5 Create a library of conversation patterns
Understandably, design patterns are sometimes ridiculed.
But when a subject is fairly new (such as building chatbot dialogs), having a library of patterns you can understand and reuse (synergy:management::reuse:programming) is quite helpful.
You can also create a library of these conversation patterns without using the context lifespan of 1. But you will have a hard time managing the behavior.
Example of patterns:
- input validation
- reprompt for input
- slot filling (the concept, not the feature)
- getting multiple inputs from the user
6 Create chatbot building blocks
Once you actually define a few conversation patterns, you will be able to combine them (once again, very hard to do if you don’t enforce the context lifespan = 1 constraint).
What does this mean?
It means you can start programmatically generating your agent’s intents. For example, creating a chatbot which should take a set of user inputs and can also do intelligent input validation and error re-prompting can be done using predefined templates. (Would you be interested in a course about the topic? Let me know in the comments below. If I get at least a handful of comments, I will get to it sooner).
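To give a flavor of what a template could look like, here is a small sketch that generates intent definitions for collecting a list of inputs, with one retry per input. The dictionary layout is a made-up intermediate representation, not Dialogflow's export or API format, and the naming scheme (`awaiting_<field>`, `<field>.capture`, and so on) is hypothetical.

```python
def build_input_chain(fields):
    """Generate intent definitions that collect each field in order,
    with one retry per field and every output context at lifespan 1."""
    def ctx(name):
        return {"name": name, "lifespan": 1}

    intents = [{"name": "start", "input": [], "output": [ctx(f"awaiting_{fields[0]}")]}]
    for i, field in enumerate(fields):
        first, retry = f"awaiting_{field}", f"awaiting_{field}_retry"
        on_success = [ctx(f"awaiting_{fields[i + 1]}")] if i + 1 < len(fields) else []
        intents += [
            {"name": f"{field}.capture", "input": [first], "output": on_success},
            {"name": f"{field}.fallback", "input": [first], "output": [ctx(retry)]},
            {"name": f"{field}.retry.capture", "input": [retry], "output": on_success},
            {"name": f"{field}.retry.fallback", "input": [retry], "output": []},
        ]
    return intents

for intent in build_input_chain(["flower", "quantity"]):
    print(intent["name"], intent["input"], [c["name"] for c in intent["output"]])
```

Every generated intent has at most one input context and at most one output context, so the candidate intents at each step can be read straight off the generated list. A separate script would still be needed to turn these definitions into real agent intents.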
I also got the following input from a fellow Dialogflow freelancer who has worked on many bots for his clients. He once mentioned that he prefers a context lifespan of 2, but that he also agrees with me. So I asked him to explain. Here is what he said.
Lifespans are tricky. But very useful once you know what you are doing, when you already have experience and real world operations’ data to deal with.
Yes: in general I think a lifespan of 2 is a better deal than 1, because with 1 you lose context immediately: you have to think in more intents to prevent losses. With 2 you have the risk of some wrong match in the next step, but this situation occurs less frequently, so the trade off is better. And you can design the conversation in order to reduce the risk of two close groups of words.
But I insist: one needs to have some experience to use a lifespan of more than 1. So: if I, like you often do, talk to newbies and less experienced conversation designers, I start with 1. If/when your audience is more advanced, then you can think about better use of lifespan.
So: I don’t really disagree with you 😉
Here is my simple conclusion: when in doubt, set your context lifespan to 1. Once you fully understand which intents become candidates at each step in the conversation in your bot, consider increasing the value according to your requirements.