First look – Cogito V2 Preview Llama 109B
Cogito V2 Preview Llama 109B is less expensive than Deep Cogito
Created Sep 2, 2025
32,767 context
$0.18/M input tokens
$0.59/M output tokens
And in fact, it was able to provide schema compatible JSON for 84 out of the 99 test VAERS reports. One file did not return a response, and I had to stop the program after a while.
Audit complete. Found 99 JSON files, 84 with valid JSON, updated 84 files.
However it has a major limitation. It does not provide citations for almost any of the data it extracts.
Description from OpenRouter website:
An instruction-tuned, hybrid-reasoning Mixture-of-Experts model built on Llama-4-Scout-17B-16E. Cogito v2 can answer directly or engage an extended “thinking” phase, with alignment guided by Iterated Distillation & Amplification (IDA). It targets coding, STEM, instruction following, and general helpfulness, with stronger multilingual, tool-calling, and reasoning performance than size-equivalent baselines. The model supports long-context use (up to 10M tokens) and standard Transformers workflows. Users can control the reasoning behaviour with the
reasoning
enabled
boolean. Learn more in our docs