GPT-5 vs Gemini Pro 2.5 for Pydantic structured output using OpenRouter
I sent 100 requests each to GPT-5 and Gemini Pro 2.5, asking them to extract information from clinical narratives into a very complex Pydantic schema. I used the following metrics to measure how well they adhered to the schema: is_pure_json, contains_valid_json, and is_valid_schema. Sometimes the LLMs will not even produce valid JSON as a substring…
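
For anyone who wants to score responses the same way, here is a minimal sketch of how the three checks could be computed. This is not the code used for the test above, and the definitions are assumptions: is_pure_json means the whole response parses as JSON, contains_valid_json means some substring starting at a { or [ parses as JSON, and is_valid_schema means the first such object passes a validator. The helper extract_json and the validate parameter are hypothetical names; with Pydantic v2 you would pass something like YourModel.model_validate as the validator.

```python
import json
from typing import Any, Callable, Optional


def is_pure_json(text: str) -> bool:
    """True if the entire response parses as JSON with nothing around it."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def extract_json(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in text, or None.

    Hypothetical helper: tries to decode starting at every '{' or '['.
    """
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            obj, _end = decoder.raw_decode(text, i)
            return obj
        except json.JSONDecodeError:
            continue
    return None


def contains_valid_json(text: str) -> bool:
    """True if valid JSON appears anywhere in the response as a substring."""
    return extract_json(text) is not None


def is_valid_schema(text: str, validate: Callable[[Any], Any]) -> bool:
    """True if the embedded JSON passes the supplied validator.

    The validator is assumed to raise ValueError or TypeError on failure
    (Pydantic v2's ValidationError is a ValueError subclass).
    """
    obj = extract_json(text)
    if obj is None:
        return False
    try:
        validate(obj)
        return True
    except (ValueError, TypeError):
        return False
```

Under these definitions the metrics are nested: a response that passes is_pure_json also passes contains_valid_json, and is_valid_schema can only pass when contains_valid_json does. A response wrapped in a markdown fence or preceded by "Here is the JSON:" would fail the first check but could still pass the other two.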