02 5 ways Dialogflow is better than ChatGPT

In the previous lesson, I explained how “Extractive Question Answering”, which is more suited to ChatGPT, differs from “Intent-based Question Answering”, which is more suited to Dialogflow.

In this lesson, I explain how this makes Dialogflow a better choice for certain tasks.

In my view:

a) Dialogflow is much more suitable for non-programmers

b) Dialogflow is much more suitable for multi-turn conversation flows

c) Dialogflow is much more suitable for scenarios where convenience is more important than optimal benchmark performance

Here are 5 ways Dialogflow is better than ChatGPT:

Zero hallucination

Since the bot creator defines the exact response provided by the Dialogflow bot, you can be sure that whenever a given intent is matched, the user sees exactly that response. This provides a lot more control over the exact verbiage seen by the bot user.
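To make this concrete, here is a minimal sketch of the idea in plain Python. It is not Dialogflow's implementation; the intent names and response texts are hypothetical and only illustrate that a matched intent always returns the same author-written text.

```python
# Hypothetical intent names and responses, for illustration only.
RESPONSES = {
    "ask_vs_show_hn": "Ask HN is for questions and other text submissions, "
                      "while Show HN is for sharing personal work.",
    "fallback": "Sorry, I don't have an answer for that.",
}


def respond(matched_intent: str) -> str:
    """Return the exact text the bot creator wrote for the matched intent."""
    # No text is generated here: the reply is a plain lookup, so the same
    # intent always produces the same wording.
    return RESPONSES.get(matched_intent, RESPONSES["fallback"])
```

Because the reply is looked up rather than generated, there is nothing for the bot to rephrase or embellish.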

One consequence of non-deterministic GPT behavior is hallucination.

While GPT versions are getting much better at avoiding hallucination, the previous example where it combined the words “personal” and “stories” into a single phrase “personal stories” is still an example of word rearrangement which entirely changes the meaning of the response.

A second type of hallucination in GPT is “out-of-scope” answers.

It is possible for GPT to provide answers which are not even in the prompt you sent.

For example, here is a response it sent for the question “How is Ask HN different from Show HN?”

Ask HN is for questions and other text submissions, while Show HN is for sharing personal work. Show HN has special rules, such as no links to commercial products or services.

To the best of my knowledge, this additional special rule about “no links to commercial products” is not true, and isn’t mentioned in the prompt or even in the links on the FAQ page (here are the actual special rules – there is no mention of the word “commercial”). I think GPT just made up this rule based on seeing a lot of similar phrases during its web crawling!

There is no reason to be concerned about hallucination in Dialogflow because the response is completely specified by the bot creator.

Entity extraction

Sometimes you have custom entities in your Dialogflow bot, and the current support for entities in ChatGPT is quite rudimentary.

OpenAI suggests you do entity extraction by designing a suitable prompt.

{"prompt":"<any text, for example news article>\n\n###\n\n", "completion":" <list of entities, separated by a newline> END"}

I expect this approach to be unpredictable and quite poor for a while, because this is a good example of reinventing the wheel using an inferior approach.
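For readers who want to see what one record in that prompt format looks like in practice, here is a small sketch. It assumes the separator in the template is two newlines, `###`, and two more newlines, and that ` END` is the stop marker. The helper name `make_record` and the sample text are my own, not part of any OpenAI library.

```python
import json
from typing import List

SEPARATOR = "\n\n###\n\n"  # assumed marker for the end of the prompt
STOP = " END"              # assumed marker for the end of the completion


def make_record(text: str, entities: List[str]) -> str:
    """Build one JSON line in the prompt/completion shape shown above."""
    record = {
        "prompt": text + SEPARATOR,
        # The completion starts with a space and lists one entity per line.
        "completion": " " + "\n".join(entities) + STOP,
    }
    return json.dumps(record)


# Hypothetical example sentence and entities.
print(make_record("Acme hired Jane Doe in Paris.", ["Acme", "Jane Doe", "Paris"]))
```

Note that every entity you care about has to appear in examples like this one. There is no separate place to define or edit the entity list.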

There are already very well understood approaches to defining and populating entities in Dialogflow, and as a bonus it is also exposed via a well defined API (meaning it is not just easy for non-programmers to work with, it is also easy for programmers to automate).
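As a rough illustration of why an explicit entity definition is easier to reason about, here is a plain-Python sketch of an entity with reference values and synonyms. This is not Dialogflow's API or its matching algorithm. The entity, its values, and the `extract_entity` helper are all hypothetical.

```python
import re
from typing import Dict, List, Optional

# Hypothetical "size" entity: reference value -> synonyms.
SIZE_ENTITY: Dict[str, List[str]] = {
    "small": ["small", "little", "tiny"],
    "large": ["large", "big", "huge"],
}


def extract_entity(utterance: str, entity: Dict[str, List[str]]) -> Optional[str]:
    """Return the reference value whose synonym appears as a whole word."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for value, synonyms in entity.items():
        if any(synonym in words for synonym in synonyms):
            return value
    return None  # no synonym found in the utterance
```

Adding a new synonym is a one-line change to the data, which is the kind of update you can also automate when the entity lives behind an API.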

When you try to force-fit entity definition and entity extraction into an LLM prompt, and also end up losing the ability to edit/update entities using an API, it is a good example of LLM Maximalism. I suggest avoiding LLM Maximalism at this early stage when these GPT features are still new and somewhat immature.

State management

You can use explicit contexts to manage state in Dialogflow ES, but Dialogflow CX takes this much further and in fact allows you to design the entire conversation flow as a state machine.

The suggested approach for state management in ChatGPT is to send the entire conversation back and forth (appending the most recent user response) right into the prompt.
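Here is a minimal sketch of what that looks like, assuming a simple "Speaker: text" transcript layout. The layout and the `build_prompt` helper are my own illustration, not a format required by OpenAI.

```python
from typing import List, Tuple


def build_prompt(history: List[Tuple[str, str]], user_message: str) -> str:
    """Rebuild the full prompt from every earlier turn plus the new message."""
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"User: {user_message}")
    lines.append("Bot:")  # the model is expected to continue from here
    return "\n".join(lines)


# The prompt grows with every turn because all earlier turns are resent.
history = [("User", "Hi"), ("Bot", "Hello! How can I help?")]
print(build_prompt(history, "What is Show HN?"))
```

All of the conversation state lives inside this ever-growing string, so the token count rises with each turn.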

Besides consuming a lot of tokens and using the prompt for a task it may not be well suited to (another example of LLM Maximalism), this is also simply much harder to do, because of the recommended human vetting and the large number of examples needed (a few thousand).

The contrast with a state-machine based bot framework like Dialogflow CX becomes quite apparent once you realize that designing an actual state machine allows you to:

a) use far fewer training phrases per state transition

b) understand the entire state machine (i.e. conversation flow) at design time, because you have an inbuilt, interactive state diagram

c) easily test the state machine using the inbuilt simulator

d) easily modify the state machine with a few mouse clicks by opening the detailed view

In other words, until you spend at least a few days designing the same conversation flow in Dialogflow CX, you will probably not realize how much more efficient it is to use Dialogflow CX for state management in chatbots.
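If you have not worked with state machines before, the following sketch shows the general idea in plain Python. It is not how Dialogflow CX is implemented. The state names and intent names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical conversation flow: (current state, matched intent) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "order_pizza"): "ask_size",
    ("ask_size", "give_size"): "ask_topping",
    ("ask_topping", "give_topping"): "confirm",
    ("confirm", "yes"): "done",
    ("confirm", "no"): "ask_size",
}


def next_state(state: str, intent: str) -> str:
    """Follow a defined transition, or stay put if none is defined."""
    return TRANSITIONS.get((state, intent), state)


def run(intents: List[str], state: str = "start") -> str:
    """Replay a sequence of matched intents and return the final state."""
    for intent in intents:
        state = next_state(state, intent)
    return state
```

The whole flow is visible in one table, and each state only needs to recognize the few intents that are valid at that point, which is why fewer training phrases are needed per transition.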

Accuracy Evaluation

How do you evaluate the accuracy of your GPT bot?

This is actually trickier than you might think, and requires some careful thought.

Calculating the accuracy of your bot requires some tooling or custom development work even for a mature bot framework like Dialogflow.

Evaluating the accuracy of a GPT bot will require a lot more custom development work.
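For an intent-based bot, the simplest accuracy measure is the fraction of test utterances for which the bot matched the intent you expected. Here is a sketch of that calculation, assuming you have already collected the expected and predicted intent names as two lists. The function name and the sample labels are hypothetical.

```python
from typing import List


def intent_accuracy(expected: List[str], predicted: List[str]) -> float:
    """Fraction of test utterances where the predicted intent was correct."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("need at least one test utterance")
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)


# Hypothetical labels for four test utterances; one prediction is wrong.
print(intent_accuracy(["faq", "order", "faq", "hours"],
                      ["faq", "order", "order", "hours"]))
```

This works because a Dialogflow response can be compared to a known intent name. A free-text GPT answer has no such label to compare against, which is part of why evaluating it takes more custom work.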

Tune ML model using custom training phrases inside a GUI

In both Dialogflow ES and Dialogflow CX, you can improve your chatbot’s ML model using your own training phrases, using a simple graphical user interface. This is very helpful for non-programmers.

While you can still do something similar in GPT if you are a programmer, turning it into an actual GUI which a non-programmer can use will involve a lot of custom development work.

Summary

GPT is still very new and the underlying technology is a bit too immature to be used for multi-turn conversation flows inside chatbots.

I think it is still very good for use cases which involve extractive question-answering, but you would be better off choosing Dialogflow if your use case requires intent-based question answering.


