Prompt Engineering for Structured Outputs

Lesson notes for my Prompt Engineering for Structured Outputs course

Generating the comparison dataset

For each LLM, you need 100 valid API responses All the responses should have valid JSON Do a retry until each of the 100 reports contains valid JSON

Read More Generating the comparison dataset
Semantic Validation

Code

Read More Semantic Validation
Syntactic validation

Code

Read More Syntactic validation
Code Walkthrough – Validate API response

I used Claude to generate this code walkthrough from the Python script. This script validates the API response and marks contains_valid_json as True or False. You can download the Python script (.py file) from my Measuring LLM Accuracy course. The script is part of a bigger project and cannot be run without pulling in code…

Read More Code Walkthrough – Validate API response
Code Walkthrough – Save API response

I used Claude to generate this code walkthrough from the Python script. This code saves the response from the LLM API to the local folder. It is also used for doing retries in case previous calls to the API did not produce valid JSON. You can download the Python script (.py file) from my Measuring…

Read More Code Walkthrough – Save API response
Easily separate AI hype from real progress

Using LLMs to extract structured data will allow you to easily separate AI hyper from genuine progress It allows you to use a systematic process (like the one I explain in this course) to get an intuition for how well AI is able to do certain tasks When you see how often AI can fail…

Read More Easily separate AI hype from real progress
Agentic AI

Some people refer to agents as “models using tools in a loop” Understanding how structured data extraction works will be an important part of learning about agentic AI since it is often the extracted structured data that is sent to the tool

Read More Agentic AI
Keep on top of recent developments in “reasoning”

You can run this script for each LLM on OpenRouter and get a good idea of how things are evolving in terms of LLM reasoning I use this approach in this course to evaluate the ability of many different LLMs to extract structured data (so you can just get this course if you don’t want…

Read More Keep on top of recent developments in “reasoning”
A really good way to test LLM capabilities

Keep a list of inputs and a predefined schema, and ask the LLM to extract the structured data and measure how well it does For a sufficiently complex schema (which you will use in this course), and reasonably long text, you will be surprised at how often even the best LLMs fail to produce complete…

Read More A really good way to test LLM capabilities
How to use Pydantic with OpenRouter

Here is a simple code example to get started System instruction Input text Pydantic Schema

Read More How to use Pydantic with OpenRouter
Pydantic Schema

This is the Pydantic schema used in the course. As you can see, this is a very complex schema which is used to extract structured data from a VAERS report. The complexity of this schema acts as a very good test for the quality of an LLM.

Read More Pydantic Schema
System Instruction

This is the system instruction class This provides one file where you can modify your system instructions to see which instruction gives you the best results

Read More System Instruction