ChatGPT vs Dialogflow CX

Recently OpenAI released ChatGPT, which is actually a very clever chatbot.

And people started comparing it with existing chatbot platforms like Dialogflow.

I have been working with Dialogflow for over 5 years. I will explain my views on this topic in this chapter.

But given that ChatGPT is very clever, I decided to ask it (after all – why not? 🙂 ) whether we can use ChatGPT as a Dialogflow replacement.

TLDR: That’s actually a pretty good answer.

ChatGPT vs Dialogflow

I recently created a course that takes a specific dataset (the Hacker News FAQ page) and compares the GPT API with Dialogflow on it. It turned out to be a good way to compare ChatGPT and Dialogflow.

Feature Comparison (Dialogflow is much more powerful for certain tasks):

ChatGPT vs Dialogflow Feature Comparison

Dialogflow is sometimes more accurate than ChatGPT:

Comparing the accuracy of ChatGPT and Dialogflow

The use cases of ChatGPT and Dialogflow are very different:

Extractive Question Answering (GPT) vs Intent-based Question Answering (Dialogflow)

Trying to replicate Dialogflow features using ChatGPT (that is, the GPT API) is not a good idea:

“Writing an OpenAI chatbot is hard! 2 months in and still unsuccessful”

Hallucination

ChatGPT tends to hallucinate.

That’s what this person is talking about.

Which leads to another question – is it a good idea to use ChatGPT without some kind of supervision/verification?

Intent Detection

You might be thinking that the goal of an FAQ chatbot is to answer the user’s question. That is partly true.

But the main goal when designing a bot is to first understand what the user asked, so that you can perform a suitable action.

Suppose you are building your own version of Google Home.

Using ChatGPT to figure out what the user said is only partially helpful. You need to know the specific task the user wants done (e.g. turn on the light), and then you need to actually perform that action based on the user’s intent.

ChatGPT does not provide a way to get this information in a structured form. Chatbot frameworks, including Dialogflow CX, provide a set of API methods that return the matched intent.
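To make the point concrete, here is a minimal sketch of why a structured intent matters. It assumes the NLU layer hands back a plain dictionary with an intent name and parameters. The dictionary shape, the intent names (light.on, light.off) and the handler functions are all hypothetical and are not the actual Dialogflow response format.

```python
from typing import Callable, Dict


def turn_on_light(params: dict) -> str:
    # In a real device this would call the hardware; here we only describe it.
    room = params.get("room", "living room")
    return f"Turning on the light in the {room}."


def turn_off_light(params: dict) -> str:
    room = params.get("room", "living room")
    return f"Turning off the light in the {room}."


def fallback(params: dict) -> str:
    return "Sorry, I did not understand that."


# Hypothetical intent names mapped to the action each one should trigger.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "light.on": turn_on_light,
    "light.off": turn_off_light,
}


def dispatch(nlu_result: dict) -> str:
    """Pick the handler for the detected intent, or fall back if none matches."""
    handler = HANDLERS.get(nlu_result.get("intent"), fallback)
    return handler(nlu_result.get("parameters", {}))


print(dispatch({"intent": "light.on", "parameters": {"room": "kitchen"}}))
```

Without a reliable intent name to look up, there is nothing to key the dispatch table on, and you are left parsing free-form text.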

Entity extraction

This is actually a crucial task that ChatGPT cannot do.

Very often, you not only need to identify the user’s intent, but also extract the relevant parameter values.

ChatGPT does not provide a way to extract entities. Dialogflow not only provides a way to extract entities, it offers a couple of additional benefits:

a) it offers a wildcard entity to capture free-form user input

b) it uses the entity provided in the user utterance for its intent detection

Obviously, ChatGPT does not even have the concept of entity extraction, so it cannot really compete with Dialogflow on this front.
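As a rough illustration of what entity extraction gives you, here is a toy sketch for a pizza bot. It only does whole-word synonym lookup, which is far simpler than what Dialogflow actually does. The entity types, values and function names are made up for this example.

```python
import re
from typing import Dict, List, Optional

# Hypothetical entity types: each canonical value lists the synonyms that map to it.
ENTITY_TYPES: Dict[str, Dict[str, List[str]]] = {
    "size": {
        "small": ["small", "personal"],
        "large": ["large", "big", "family size"],
    },
    "topping": {
        "mushroom": ["mushroom", "mushrooms"],
        "pepperoni": ["pepperoni", "salami"],
    },
}


def match_entity(utterance: str, synonyms_by_value: Dict[str, List[str]]) -> Optional[str]:
    """Return the canonical value whose synonym appears as a whole word or phrase."""
    text = utterance.lower()
    for value, synonyms in synonyms_by_value.items():
        for synonym in synonyms:
            if re.search(rf"\b{re.escape(synonym)}\b", text):
                return value
    return None


def extract_parameters(utterance: str) -> Dict[str, str]:
    """Collect one canonical value per entity type found in the utterance."""
    params: Dict[str, str] = {}
    for name, synonyms_by_value in ENTITY_TYPES.items():
        value = match_entity(utterance, synonyms_by_value)
        if value is not None:
            params[name] = value
    return params


print(extract_parameters("I'd like a big pizza with mushrooms"))
```

The useful part is the output: canonical parameter values (large, mushroom) that your order-handling code can use directly, whatever wording the user chose.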

State Management

Keeping track of the current state of the conversation is an important part of building a conversational chatbot that is expected to answer follow-up questions.

While ChatGPT does keep track of state internally, it does not expose the current state as an API method, so you cannot really design your conversation flow around it.

Dialogflow CX already exposes the current state, which makes it much easier to design your conversation flow using CX.
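Here is a small sketch of what “exposing the state” buys you. It models a conversation as named pages with intent-driven transitions, loosely in the spirit of CX pages. The page names, intent names and the Session class are hypothetical, and this is not how CX is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical flow: for each page, which intent moves the conversation to which page.
FLOW: Dict[str, Dict[str, str]] = {
    "start": {"order.pizza": "ask_size"},
    "ask_size": {"provide.size": "ask_address"},
    "ask_address": {"provide.address": "confirm"},
}


@dataclass
class Session:
    page: str = "start"
    history: List[str] = field(default_factory=list)

    def handle(self, intent: str) -> str:
        """Move to the next page if the intent is valid here; otherwise stay put."""
        next_page = FLOW.get(self.page, {}).get(intent)
        if next_page is not None:
            self.history.append(self.page)
            self.page = next_page
        return self.page


session = Session()
session.handle("order.pizza")
session.handle("provide.size")
print(session.page, session.history)
```

Because the current page is an ordinary value you can read, your code can decide what is allowed next, which is exactly what you cannot do when the state is hidden inside the model.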

A well-known limitation of ChatGPT

A well-known limitation of ChatGPT is that it gives confident but wrong answers. This can be extremely misleading. Sometimes your bot should be able to say it does not know the answer.

This is important for two reasons.

The first is that a confident wrong answer might convince the user that the answer is correct when it isn’t.

The second is that if you, the bot maker, do not know when your bot gives a wrong answer, you have no way of knowing which intents to add to fix the problem.

For example, the Training feature in Dialogflow is more or less built around the expectation that the bot will sometimes fail to find an answer (and so invoke the Fallback intent), and that a human will then review those cases and improve the bot.
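The idea of “fail visibly, then review” can be sketched in a few lines. This is a simplified illustration, not Dialogflow’s Training feature. The threshold value, the function name and the review queue are assumptions made for the example.

```python
from typing import Dict, List, Optional

FALLBACK_REPLY = "Sorry, I don't know the answer to that yet."


def respond(
    user_text: str,
    intent: Optional[str],
    confidence: float,
    answers: Dict[str, str],
    review_queue: List[str],
    threshold: float = 0.7,  # assumed cut-off; tune it for your own bot
) -> str:
    """Answer only when the match is confident; otherwise fall back and log it."""
    if intent in answers and confidence >= threshold:
        return answers[intent]
    # Keep the unanswered utterance so a human can later add or fix an intent.
    review_queue.append(user_text)
    return FALLBACK_REPLY


answers = {"faq.karma": "Karma is roughly upvotes minus downvotes."}
queue: List[str] = []
print(respond("What is karma?", "faq.karma", 0.92, answers, queue))
print(respond("Can I pay with crypto?", None, 0.0, answers, queue))
print(queue)
```

The review queue is the important part: every fallback leaves a record of a question the bot could not handle, which is what lets you improve it over time.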

Integrating ChatGPT and Dialogflow

A student of mine recently asked if it was possible to integrate ChatGPT and Dialogflow.

While I think Dialogflow is better than GPT for building complex bots, it is worth noting that you can use the GPT API to improve Dialogflow accuracy.

That is, instead of choosing one or the other, you can combine them to get better results.

For example, suppose we define the following:

Here is how you can use GPT to handle these four cases – True Positives, True Negatives, False Negatives and False Positives.

Using ChatGPT for a website bot

Here are some more questions I got about ChatGPT:

I use Dialogflow, but I don’t know how to use ChatGPT as a chatbot for a website or WhatsApp. How do I develop a chatbot in ChatGPT that answers specific questions for a specific industry, and simple questions for a specific client, like prices, locations, etc.? That would be super useful.

I would love to learn how to customize ChatGPT responses for a specific business or website.

How would, for example, a pizza bot be implemented with ChatGPT versus DF NLU?

What all these questions have in common is that they ask whether ChatGPT can be used for a specific use case.

My recommendation is that you should continue using Dialogflow (ES or CX) or maybe some other chatbot framework until ChatGPT provides entity extraction as well as more primitives to design the flow of your chatbot.

Is it a good idea to use ChatGPT without supervision?

There was an interesting thread recently on Hacker News in which a commenter claimed that the AI search provided by Bing (which uses ChatGPT) had made some defamatory statements.

This is called the hallucination problem, and I have written about it before.

That is why I don’t recommend using ChatGPT without supervision and cross verification.

