First look – Cogito V2 Preview Llama 109B

Cogito V2 Preview Llama 109B is less expensive than Deep Cogito

Created Sep 2, 2025
32,767 context
$0.18/M input tokens
$0.59/M output tokens

And in fact, it was able to provide schema compatible JSON for 84 out of the 99 test VAERS reports. One file did not return a response, and I had to stop the program after a while.

Audit complete. Found 99 JSON files, 84 with valid JSON, updated 84 files.

However it has a major limitation. It does not provide citations for almost any of the data it extracts.

Description from OpenRouter website:

An instruction-tuned, hybrid-reasoning Mixture-of-Experts model built on Llama-4-Scout-17B-16E. Cogito v2 can answer directly or engage an extended “thinking” phase, with alignment guided by Iterated Distillation & Amplification (IDA). It targets coding, STEM, instruction following, and general helpfulness, with stronger multilingual, tool-calling, and reasoning performance than size-equivalent baselines. The model supports long-context use (up to 10M tokens) and standard Transformers workflows. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs

Leave a Reply