The non-programmer’s guide to integrating GPT into Dialogflow

A while back I wrote an article about integrating Dialogflow and GPT.

While most of the stuff I wrote in that article is still valid, the field of Large Language Models (LLMs) is evolving rapidly – and sometimes to its own detriment.

One important development is the recent release of PaLM 2, Google’s counterpart to GPT.

Is PaLM 2 better than GPT-4?

Broadly speaking, it is not.

But PaLM 2 is better than GPT-4 for at least two tasks: generating video transcript subheadings and automatically clustering sentences into intents.

Intent detection largely determines how much improvement you can expect when you integrate any LLM into Dialogflow. So for this particular task you should evaluate PaLM 2 as well, rather than only GPT-4.

Real time responses

Let us first consider the example of incorporating a GPT response into your Dialogflow bot using webhooks.

A lot of people want to use GPT as the fallback for their Dialogflow chatbot, because

a) sometimes it can come up with a really good answer for questions which are out-of-scope for your bot

b) constructing good fallback flows in Dialogflow usually requires a lot of work
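To make the fallback idea concrete, here is a minimal sketch of what such a webhook could do for a Dialogflow ES bot. It reads `queryResult.queryText` from the webhook request and returns a `fulfillmentText`. The function names `handle_es_webhook` and `fake_llm` are made up for illustration, the intent name assumes you kept the default "Default Fallback Intent", and the stand-in function would need to be replaced by a real GPT or PaLM 2 call.

```python
from typing import Callable


def handle_es_webhook(request_json: dict, ask_llm: Callable[[str], str]) -> dict:
    """Build a Dialogflow ES webhook response for the fallback intent."""
    query_result = request_json.get("queryResult", {})
    user_text = query_result.get("queryText", "")
    intent_name = query_result.get("intent", {}).get("displayName", "")

    # Only hand the question to the LLM when Dialogflow itself had no match.
    # Assumption: the bot still uses the default fallback intent name.
    if intent_name != "Default Fallback Intent":
        return {}  # nothing added by the webhook for other intents

    return {"fulfillmentText": ask_llm(user_text)}


def fake_llm(question: str) -> str:
    # Hypothetical stand-in for a real GPT or PaLM 2 call.
    return f"Here is my best guess about: {question}"


request = {
    "queryResult": {
        "queryText": "What is the capital of France?",
        "intent": {"displayName": "Default Fallback Intent"},
    }
}
response = handle_es_webhook(request, fake_llm)
print(response["fulfillmentText"])
```

In a real bot this function would sit behind an HTTPS endpoint that Dialogflow calls, and the fallback intent would need fulfillment enabled.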

Dialogflow ES vs Dialogflow CX

This brings us to the next question: are you using Dialogflow ES or Dialogflow CX?

Dialogflow ES gives the webhook only 5 seconds to respond. A GPT model can easily take longer than that, so its latency can be too high to integrate GPT into your ES bot.

Dialogflow CX allows a webhook timeout of up to 30 seconds, so it is much easier to integrate your GPT model into your CX bot.
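One way to live with these limits is to give the LLM call a time budget and fall back to a canned reply if it runs out. The sketch below shows that idea using only the Python standard library. The budget values (a little under each platform limit), the fallback text, and the helper names are my own assumptions, and `slow_llm` simply simulates a slow model.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

ES_BUDGET_SECONDS = 4.0   # assumed margin under the 5 second ES limit
CX_BUDGET_SECONDS = 25.0  # assumed margin under the 30 second CX limit
FALLBACK_TEXT = "Sorry, I could not find an answer in time."

# A shared pool, so a slow call does not block the webhook from replying.
_executor = ThreadPoolExecutor(max_workers=4)


def answer_within(budget_seconds: float,
                  ask_llm: Callable[[str], str],
                  question: str) -> str:
    """Return the LLM answer, or a canned reply if the budget runs out."""
    future = _executor.submit(ask_llm, question)
    try:
        return future.result(timeout=budget_seconds)
    except FutureTimeout:
        # The LLM call keeps running in the background; we just stop waiting.
        return FALLBACK_TEXT


def slow_llm(question: str) -> str:
    time.sleep(0.3)  # simulates a model that takes too long
    return f"late answer to: {question}"


def fast_llm(question: str) -> str:
    return f"echo: {question}"


print(answer_within(0.05, slow_llm, "What is Dialogflow?"))
print(answer_within(1.0, fast_llm, "hi"))
```

In a real webhook you would pass `ES_BUDGET_SECONDS` or `CX_BUDGET_SECONDS` depending on which product your bot runs on.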

Online vs offline

Since PaLM 2 can cluster sentences into intents, there is a second way to use it to improve the accuracy of your Dialogflow bot: offline. Here, offline means the analysis is not done in real time but afterwards, based on the conversation history.

In other words, you can send your conversation history to PaLM 2 and ask it to evaluate if your Dialogflow bot mapped the correct intent.


One of the most important differences between a pure Dialogflow bot and LLMs like GPT and PaLM 2 is that the LLMs can also draw on their knowledge of the world outside your bot to construct suitable responses. In other words, it is quite reasonable to regard an LLM as a kind of "independent external grader" when you are using it to improve the accuracy of your bot.
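The offline check can be sketched as a simple loop over the conversation history: ask the grader which intent it would pick for each utterance and keep the turns where it disagrees with Dialogflow. The record fields (`utterance`, `matched_intent`), the intent names, and the keyword-based `keyword_grader` are all made up for illustration; in practice the grader would be a call to PaLM 2 or GPT-4.

```python
from typing import Callable, Dict, List


def find_disagreements(history: List[Dict[str, str]],
                       grade_intent: Callable[[str], str]) -> List[Dict[str, str]]:
    """Return the turns where the grader picks a different intent."""
    disagreements = []
    for turn in history:
        suggested = grade_intent(turn["utterance"])
        if suggested != turn["matched_intent"]:
            # Keep the original turn and record what the grader suggested.
            disagreements.append({**turn, "llm_intent": suggested})
    return disagreements


def keyword_grader(utterance: str) -> str:
    # Hypothetical stand-in for an LLM call, so the sketch runs offline.
    text = utterance.lower()
    if "refund" in text:
        return "refund.request"
    if "hours" in text or "open" in text:
        return "store.hours"
    return "unknown"


history = [
    {"utterance": "I want my money back, please refund me",
     "matched_intent": "store.hours"},
    {"utterance": "What are your opening hours?",
     "matched_intent": "store.hours"},
]

for turn in find_disagreements(history, keyword_grader):
    print(turn["utterance"], "->", turn["llm_intent"])
```

The output of a run like this is a review list for a human, not an automatic correction: a disagreement only tells you where to look.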

Fine-tuned model vs ad hoc classification

There is actually a fourth aspect of this integration that is technically feasible.

You can just directly ask an LLM to classify the user utterance into one of many possible “intents”.

In fact, if you are building a bot which does exactly one thing, namely classifying the user’s utterance to direct them to a specific department on your website, I would even recommend this approach. This is also very easy to do using Generative Dialogflow CX.
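For a single-purpose bot, the ad hoc approach could look roughly like this: build a prompt that lists the departments, then map the model's reply back to one of them. The department names, the prompt wording, and the helper names are assumptions for illustration, and the actual call to the LLM is left out.

```python
from typing import Dict


def build_prompt(utterance: str, departments: Dict[str, str]) -> str:
    """List every department in the prompt and ask for exactly one name."""
    lines = ["Classify the user message into exactly one of these departments:"]
    for name, description in departments.items():
        lines.append(f"- {name}: {description}")
    lines.append("Reply with the department name only.")
    lines.append(f"User message: {utterance}")
    return "\n".join(lines)


def parse_reply(reply: str, departments: Dict[str, str],
                default: str = "other") -> str:
    """Map the model's free-text reply back to a known department."""
    cleaned = reply.strip().strip(".").lower()
    for name in departments:
        if name.lower() == cleaned:
            return name
    return default  # anything unexpected is treated as unclassified


departments = {
    "billing": "invoices, payments and refunds",
    "sales": "pricing and new purchases",
    "support": "technical problems with the product",
}

print(build_prompt("I was charged twice this month", departments))
print(parse_reply(" Billing.\n", departments))
```

Note that every department adds a line to the prompt, which is manageable for a handful of departments.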

But as your chatbot gets larger and handles more and more intents, sending an ad hoc query to your LLM does not scale very well because

a) the size of your prompt text will grow rapidly

b) there are quite a lot of things you can do using Dialogflow’s built-in features which are just too much work to reconstruct using only a text prompt.

So a fine-tuned model is better for this purpose.

The summary tables you see below are for fine-tuned models which are based on your Dialogflow bot’s training phrases (that is, it must closely mirror the existing structure of your chatbot).
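To give a rough idea of what "based on your training phrases" means in practice, the sketch below turns a mapping of intent names to training phrases into JSON Lines, one example per line. The key names `input_text` and `output_text` are placeholders; each tuning service defines its own schema, so check its documentation before uploading anything. The intents and phrases are invented.

```python
import json
from typing import Dict, List


def to_jsonl(training_phrases: Dict[str, List[str]],
             input_key: str = "input_text",
             output_key: str = "output_text") -> str:
    """Emit one JSON object per training phrase, labeled with its intent."""
    rows = []
    for intent_name, phrases in training_phrases.items():
        for phrase in phrases:
            # Assumption: the tuning service accepts input/output text pairs.
            rows.append(json.dumps({input_key: phrase, output_key: intent_name}))
    return "\n".join(rows)


training_phrases = {
    "store.hours": ["When do you open?", "Are you open on Sunday?"],
    "refund.request": ["I want a refund"],
}

print(to_jsonl(training_phrases))
```

Keeping the intent names identical to the ones in your Dialogflow agent is what lets the fine-tuned model mirror the existing structure of the bot.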

Here is a quick summary:

Dialogflow ES chatbot

| Mode | GPT-4 | PaLM 2 |
| --- | --- | --- |
| Online | Latency too high, not recommended | PaLM 2 fine-tuning has just been released; I will be doing some latency tests over the next few weeks |
| Offline | Can be useful, but PaLM 2 will likely be a better choice | Will likely be a better choice than GPT-4 |

Dialogflow CX chatbot

| Mode | GPT-4 | PaLM 2 |
| --- | --- | --- |
| Online | Possible | Possible, but using generative Dialogflow CX will be a better choice |
| Offline | Can be useful, but PaLM 2 will likely be more accurate | Will likely be a better choice than GPT-4 |

I have pointed out why PaLM 2 will be a better option than GPT-4 based on intent clustering. Even so, since PaLM 2 is not generally available at the moment, using GPT-4 offline to improve the accuracy of your Dialogflow bot is probably a good way to get started for now.

About this website

BotFlo [1] was created by Aravind Mohanoor as a website which provided training and tools for non-programmers who were [2] building Dialogflow chatbots.

This website has now expanded into other topics in Natural Language Processing, including the recent Large Language Models (GPT etc.) with a special focus on helping non-programmers identify and use the right tool for their specific NLP task.

[1] BotFlo was previously called MiningBusinessData. That is why you see that name in many videos.

[2] And still are building Dialogflow chatbots. Dialogflow ES first evolved into Dialogflow CX, and Dialogflow CX itself evolved to add Generative AI features in mid-2023.