Basics of Retrieval Augmented Generation (RAG)
Notes for No Code RAG course
Why everyone who is interested in RAG Search should learn about Discourse AI
Someone made this comment on the Discourse Meta forum (the official Discourse forum for discussing Discourse itself)
I personally work with many people who can’t really use any AI in their work (ChatGPT, Claude, Perplexity, etc) because of privacy and other concerns. They also don’t have and can’t afford dedicated AI engineers to build a custom solution, nor are they technical individuals themselves. So I see a very large market gap here for small and medium businesses/organizations, where Discourse AI is the perfect solution. Link
I agree, and I think everyone who is interested in AI-powered search should learn about Discourse AI
Friendly for non-technical users
While it is possible to hire a programmer to do custom development work to improve Discourse AI (for example, creating special tools), using the core features does not require a background in programming
Discourse AI provides a lot of features out of the box
Discourse AI provides access to nearly every major feature made possible by LLMs. This means a non-technical user can learn a lot about the topic through practical use cases and see the results for themselves
Polished user experience
Since Discourse itself has been around for more than 10 years as forum software, the forum features work very well and provide a polished user experience
Fallback experience is still very good
This is very important when you are using AI. Even if the AI sometimes gives a poor answer, it helps if the user interface and the overall user experience provide good fallback options. Since Discourse is first and foremost forum software, you can simply use its existing functionality and add AI on top of it for a seamless experience that still works even when the AI doesn't do a good job.
Four types of LLM questions
There are usually four ways to ask an LLM questions about your own data
1 World knowledge
Ask a question based on the LLM’s existing world knowledge and hope it knows enough about your domain (almost never sufficient)

2 Prompt based
Paste the FAQ at the top of the prompt and ask questions based on the pasted text
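A toy sketch of this approach, where the FAQ contents and the prompt wording are invented for illustration and no real LLM API is called:

```python
# Prompt-based approach: paste the FAQ at the top of the prompt and
# append the question below it. The FAQ text here is made up.

FAQ = """Q: How do I reset my password?
A: Click "Forgot password" on the login page.

Q: How do I contact support?
A: Email support@example.com"""

def build_prompt(question: str) -> str:
    """Paste the FAQ ahead of the question so the LLM answers from it."""
    return (
        "Answer the question using only the FAQ below.\n\n"
        f"{FAQ}\n\n"
        f"Question: {question}\nAnswer:"
    )

# This string would be sent to an LLM as a single prompt.
prompt = build_prompt("How can I reset my password?")
```

The obvious limitation is that the whole FAQ must fit in the prompt, which is why larger knowledge bases need retrieval instead.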


3 Fine tuning
You fine-tune the LLM on your own data and then ask it questions. This is usually quite hard to do and still often provides only average results

4 Retrieval Augmented Generation
Search your entire knowledge base for relevant documents (retrieve), then concatenate the sections of those documents that relate to the question into your prompt (augment), and ask the LLM to generate a response based on them.
RAG often works better than the previous three methods and is the focus of this course.
What is Retrieval Augmented Generation?
This is the basic idea for a forum like this one
- Search all your posts for content related to the question
- Find all relevant topics and posts (replies)
- Concatenate the relevant information into a single large prompt
- Use the concatenated text as the “context” and append the user’s question at the end of the prompt
- The LLM will generate a response to the user’s question after reading the full prompt
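The steps above can be sketched in a few lines of Python. The posts, the retrieval method, and the prompt template are all stand-ins: real systems (including Discourse AI) retrieve with embeddings, while plain keyword overlap is used here only to keep the example self-contained:

```python
import re

# A tiny stand-in for a forum's posts.
POSTS = [
    "You can change your forum username in your profile preferences.",
    "Dark mode can be enabled under Interface settings.",
    "Trust levels unlock new forum abilities as you participate.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question: str, posts: list[str], k: int = 2) -> list[str]:
    """Rank posts by how many words they share with the question (retrieve)."""
    q = tokenize(question)
    ranked = sorted(posts, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]

def augment(question: str, relevant: list[str]) -> str:
    """Concatenate the relevant posts into one large prompt (augment)."""
    context = "\n".join(relevant)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

question = "How do I change my username?"
prompt = augment(question, retrieve(question, POSTS))
# A real system would now send `prompt` to an LLM to generate the answer.
```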
The 9 components of DIY RAG Search
This is an infographic created by Google for their Vertex AI search

Things to note
- implementing functional RAG search involves many moving parts
- many of them require Python programming knowledge
Any implementation of RAG search will have to find a way to implement all nine of these components
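As an illustration of one such component, here is a minimal sketch of vector similarity search, which typically powers the “retrieve” step. The embeddings below are invented two-dimensional vectors; a real implementation would produce them with an embedding model and store them in a vector database:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 2-D document embeddings, for illustration only.
docs = {
    "password reset guide": [0.9, 0.1],
    "billing overview": [0.1, 0.9],
}

# Pretend embedding of the query "how do I reset my password".
query_embedding = [0.8, 0.2]

# The document whose embedding points in the closest direction wins.
best = max(docs, key=lambda name: cosine(query_embedding, docs[name]))
```

Each of the other components (chunking documents, building the prompt, calling the model, ranking results, and so on) needs similar glue code, which is why Discourse AI's packaged approach appeals to non-programmers.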