BotFlo

LLM eval: A case study using multiple Gemini models

In my Udemy course, I build a system that uses four different Gemini models as an example of LLM eval, where an LLM acts as the judge of whether or not an answer is correct.

Basic idea:

This system can be extended to any dataset and to any LLM that you want to benchmark.
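To make the LLM-as-judge idea concrete, here is a minimal sketch in Python. It is not the code from the course: the names `Example`, `build_judge_prompt`, `parse_verdict`, and `evaluate`, and the prompt wording, are all assumptions made for illustration. A "model" is treated as any function that takes a prompt string and returns a response string, so a wrapper around a Gemini call (or any other LLM) could be plugged in without changing the loop.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Assumption: a "model" is any function from prompt text to response text.
# In practice this would wrap an API call; that wrapper is not shown here.
Model = Callable[[str], str]


@dataclass
class Example:
    question: str
    reference: str  # the expected (ground-truth) answer


def build_judge_prompt(example: Example, candidate_answer: str) -> str:
    """Hypothetical prompt format asking the judge for a one-word verdict."""
    return (
        "You are grading an answer.\n"
        f"Question: {example.question}\n"
        f"Reference answer: {example.reference}\n"
        f"Candidate answer: {candidate_answer}\n"
        "Reply with exactly one word: CORRECT or INCORRECT."
    )


def parse_verdict(judge_reply: str) -> bool:
    """Count the reply as correct only if it starts with CORRECT."""
    return judge_reply.strip().upper().startswith("CORRECT")


def evaluate(candidate: Model, judge: Model, dataset: Iterable[Example]) -> float:
    """Return the fraction of examples the judge marks as correct."""
    verdicts = []
    for example in dataset:
        answer = candidate(example.question)
        reply = judge(build_judge_prompt(example, answer))
        verdicts.append(parse_verdict(reply))
    return sum(verdicts) / len(verdicts) if verdicts else 0.0
```

Because the candidate and the judge are both plain callables, benchmarking a different LLM or swapping the dataset only means passing different arguments to `evaluate`. Real judge replies can be messier than a single word, so `parse_verdict` would likely need to be more forgiving in practice.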
