Dialogflow ES Automated Conversation Testing

You don’t have much control over what people say to your bot. But you can set up conversation tests that check you didn’t unexpectedly break something in your chatbot.

How to break your Dialogflow bot

Here are some ways you can break your Dialogflow ES bot.

Add a new training phrase into an intent

Action: You want to handle a new phrase in an existing intent.

Problem: You have a very similar phrase already declared as part of another intent. As a result, the old phrase now gets mapped to the intent you changed, potentially breaking some functionality.
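One way to catch this before it bites is to scan your training phrases for near-duplicates that live in different intents. The sketch below is only a rough approximation: it uses Python's `difflib` character similarity, which is not how Dialogflow scores phrases, and the intent names, phrases and the 0.8 cutoff are all made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_across_intents(training_phrases, min_ratio=0.8):
    """Return (intent_a, phrase_a, intent_b, phrase_b) for look-alike phrases
    that belong to different intents.

    training_phrases maps an intent name to its list of training phrases.
    The similarity is plain character overlap, not Dialogflow's own matching.
    """
    flat = [
        (intent, phrase)
        for intent, phrases in training_phrases.items()
        for phrase in phrases
    ]
    clashes = []
    for (intent_a, a), (intent_b, b) in combinations(flat, 2):
        if intent_a == intent_b:
            continue  # similar phrases inside one intent are fine
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= min_ratio:
            clashes.append((intent_a, a, intent_b, b))
    return clashes


# Hypothetical agent export: intent name -> training phrases.
PHRASES = {
    "store.hours": ["what are your hours"],
    "store.opening_hours": ["what are your opening hours"],
    "order.cancel": ["cancel my order"],
}

for clash in similar_across_intents(PHRASES):
    print(clash)
```

Any pair this prints is worth a manual look before you save the new training phrase.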

Update an existing intent

Action: You modify the user’s phrase in an existing intent.

Problem: A user phrase which used to get mapped to the current intent is now triggering the fallback intent.

Approve a phrase in the training tab

Action: You see an unmapped phrase in the training tab and believe it should be mapped to an existing intent. So you select the intent and click on the Approve button.

Increase the ML threshold

Action: To get a tighter mapping, you slightly increase the ML threshold.

Problem: You didn’t know this before, but many of your phrases were already being mapped very close to the ML threshold score. After you increase it, the old phrases are getting mapped to the fallback.
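If you record the intent detection confidence for each of your test phrases, you can list the ones that sit just above the threshold before you raise it. This is a minimal sketch: the threshold of 0.3, the margin of 0.1 and the third phrase with its score are assumptions for illustration, and the helper name is my own.

```python
def near_threshold(results, threshold, margin=0.1):
    """Return (phrase, confidence) pairs that currently match an intent
    but score less than `margin` above the ML threshold."""
    return [
        (phrase, confidence)
        for phrase, confidence in results
        if threshold <= confidence < threshold + margin
    ]


# (test phrase, intent detection confidence) pairs collected from earlier runs.
# The last entry is a made-up example of a weak match.
results = [
    ("tell me some stuff about you", 1.0),
    ("tell me some new stuff about you", 0.78),
    ("any idea who you are", 0.34),
]

# Assumed current ML threshold of 0.3; use the value from your agent settings.
at_risk = near_threshold(results, threshold=0.3, margin=0.1)
print(at_risk)
```

Phrases in `at_risk` are the ones most likely to fall through to the fallback intent after even a small increase.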

Decrease the ML threshold

Action: To get a more generous mapping (or because you don’t want to spend more time adding new user phrases), you lower the ML threshold.

Problem: Even junk user inputs that match very poorly start getting mapped to your intents. As a result, phrases get matched even if an entity is not present. This leads to low-quality input data getting into your system.

Dialogflow updates their platform

Now clearly, this isn’t something you can control.

But there are also occasions when your chatbot, which was humming along very well, suddenly breaks. You go to the forum and notice that people are complaining. It turns out Dialogflow made some updates to their service that break the old functionality in some way.

Automated Conversation Testing

In my Improving Dialogflow ES accuracy course, I explain how you can set up Automated Conversation Testing using Python.

Using pytest and PyCharm, you can very easily set up automated conversation testing for your Dialogflow ES agent. This allows you to quickly check if your existing intents are still working as expected.
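To give a rough idea of what such a test can look like, here is a minimal pytest-style sketch. It is not the code from the course. It assumes the `google-cloud-dialogflow` Python client for the live call, and the project ID, session ID, intent names and test phrases are all placeholders. A fake detector stands in for the agent so the sketch runs without credentials.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Detection(NamedTuple):
    intent: str        # display name of the matched intent
    confidence: float  # intent detection confidence, between 0 and 1


def detect_with_dialogflow(text: str, project_id: str = "my-project-id",
                           session_id: str = "conversation-test",
                           language_code: str = "en") -> Detection:
    """Send one phrase to a live agent. Needs google-cloud-dialogflow and
    credentials, so the import is deferred until the function is called."""
    from google.cloud import dialogflow

    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)
    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code=language_code)
    )
    response = client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    result = response.query_result
    return Detection(result.intent.display_name,
                     result.intent_detection_confidence)


def find_regressions(detect: Callable[[str], Detection],
                     cases: Iterable[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (phrase, expected, actual) for each phrase that hit the wrong intent."""
    failures = []
    for phrase, expected in cases:
        actual = detect(phrase).intent
        if actual != expected:
            failures.append((phrase, expected, actual))
    return failures


# Hypothetical test script: (test phrase, expected intent display name).
CASES = [
    ("tell me some stuff about blah you", "about.agent"),
    ("what are your opening blah hours", "store.hours"),
]


def fake_detect(text: str) -> Detection:
    """Offline stand-in for the agent so the sketch runs without credentials."""
    intent = "about.agent" if "about" in text else "store.hours"
    return Detection(intent, 0.81)


def test_intents_still_match():
    # Swap fake_detect for detect_with_dialogflow to test the live agent.
    assert find_regressions(fake_detect, CASES) == []
```

Keeping the comparison logic separate from the API call means you can run the same test script against a stub while developing and against the real agent before you publish a change.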

Generating test scripts

The larger your bot, the more tedious and difficult it can be to generate suitable test phrases for your test script. This is a big reason why people don’t create automated conversation tests in Dialogflow ES.

You can use the simple “filler word” trick to get around this problem. I also go over this entire system in much more detail in the course.

Dialogflow ES Conversation Testing
Autogenerated YouTube subtitles

Dialogflow ES conversation testing: A simple trick for generating unique test phrases

Full article: https://botflo.com/dialogflow-es-automated-conversation-testing/

0:00:00 | So we have already seen how you can use the existing training tab phrases, that is, the phrases users have provided when they interacted with your bot, as test phrases for your intents. Now, the thing I also mentioned at the beginning is that for full test coverage, what you want is to have at least one phrase per intent, right? But the problem is you

0:00:30 | don't always have enough coverage, right? You can't just rely on having users interact with your bot and taking the test phrases from there. Sometimes you may have intents which have never been used. And also, when you're starting to build your bot, of course not many people have interacted with it, or maybe nobody has, so you don't have any example test phrases you can use. In those situations, what you can do is use a very neat trick that I found. I think that

0:01:00 | it's a very interesting one, because you're able to use Dialogflow's intent mapping system to your advantage. So let me go back to this phrase that we tried before, "tell me some stuff about you". You might remember that when I typed out the whole phrase, the intent detection confidence in the response came to be 1. Okay, so what I'm going to do now is I'm

0:01:30 | going to say, instead of "some stuff", I'm just going to say "tell me some new stuff about you". So I made a small change to the phrase, and then if you go and look at the diagnostic info, you will notice that the intent detection confidence has come to 0.78. It's not 1 anymore. Okay, this is close enough to the original phrase, but it's different enough that the intent detection confidence says

0:02:00 | that it's similar but it's not the same, which is exactly what we want. So now I'm going to make the difference even bigger. I'm going to say "tell me some amazing new stuff about you", and it's still mapped to the same intent, which is what we want. But if you go and look at the score, you can see that it's come down even more. Okay, so this is giving you some hints, right? What

0:02:30 | it's telling you is that you can add some additional words into your existing training phrases, and that will automatically cause the intent detection confidence score to go down. It won't be 1 anymore. Now the problem with this is that it's kind of hard to generate useful, or rather appropriate, words which are grammatically correct and which

0:03:00 | make sense. It's kind of hard to do that, right? By the way, just to make it clear, I'm going to call these filler words. Okay, so these are filler words: I'm adding them into the test phrase as fillers so that the intent detection score is less than 1. So I have two problems with this approach till now. The first one is that it's kind of hard to generate appropriate, grammatically correct filler words, and

0:03:30 | the second thing is that the intent detection confidence score is actually a bit too low. It's gone all the way down to 0.74, which is quite far from 1. So what you can do instead is use this very nice trick. What I'm going to suggest is to type "tell me some stuff about", and then instead of having a real word, just use the word "blah", which is the ultimate filler word, I guess. And then

0:04:00 | you press Enter. You notice that this time it did map to that intent, but the intent detection confidence score is higher. In fact, it's the highest number we have seen till now: it's 0.81. Okay, and I think that this is about the range you can get, something like 0.81 to, I would say, 0.85 or so. I don't think it can go much higher. You can go and test this out; I'm not 100% sure about that, but I can settle for this. But the very

0:04:30 | neat thing about this is that, even though this test phrase makes no sense, right, if somebody were to read this phrase they'd be like, "What exactly is this person trying to do?", the filler word we are using is not something that you would expect to be in a test phrase, or in a training phrase, or in a user utterance, or any of those things, right? So this is actually a word which you can almost be confident will never appear in your agent anywhere, right? So it's

0:05:00 | a very good, unique filler word which will not even clash with the other words you're using in your bot. So the advantage of this, first of all, is that it only reduces the intent detection confidence score by a little bit, not by a lot. And the second advantage is that the word is a proper filler word, in the sense that it will not clash with other words you have in your agent. Okay, so my recommendation

0:05:30 | would be to take these training phrases and just put this word "blah", or maybe another suitable filler word, right before the last word in the sentence. You will notice that Dialogflow is able to identify the intent correctly, and at the same time it doesn't cause a big reduction in the confidence score, which is exactly what we want. So you do this for all the

0:06:00 | phrases for which you don't have natural test phrases, and just doing this alone will give you sufficient test phrases to check if your bot is still working as expected. That is, it's going to give you sufficient test phrases to be able to generate the test script for your bot.
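The recommendation in the transcript (put a filler word such as "blah" right before the last word of a training phrase) is easy to automate. Here is a minimal sketch; the function name and the handling of one-word and empty phrases are my own assumptions.

```python
def add_filler(phrase: str, filler: str = "blah") -> str:
    """Insert `filler` right before the last word of a training phrase
    to turn it into a slightly different test phrase."""
    words = phrase.split()
    if not words:
        return phrase  # nothing to modify in an empty phrase
    words.insert(len(words) - 1, filler)
    return " ".join(words)


training_phrases = ["tell me some stuff about you"]
test_phrases = [add_filler(p) for p in training_phrases]
print(test_phrases)  # ['tell me some stuff about blah you']
```

You would still send each generated phrase to your agent once and confirm it maps to the expected intent before adding it to the test script.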
