What’s Missing Between LLMs and AGI - Vishal Misra & Martin Casado (48 min)
ai-driven-innovation-economy ai-human-identity ai-literacy-public-awareness ai-rights-consciousness ai-singularity-speculation ai-tutors-personalized-learning
- Release date: 2026-03-17
- Listen on Spotify: Open episode
- Episode description:
Vishal Misra returns to explain his latest research on how LLMs actually work under the hood. He walks through experiments showing that transformers update their predictions in a precise, mathematically predictable way as they process new information, explains why this still doesn't mean they're conscious, and describes what's actually required for AGI: the ability to keep learning after training and the move from pattern matching to understanding cause and effect.

Resources:
- Follow Vishal Misra on X: https://x.com/vishalmisra
- Follow Martin Casado on X: https://x.com/martin_casado
- Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Summary
- 🔍 Matrix Model of LLMs: LLMs approximate a vast, sparse matrix of prompt-to-next-token probabilities, enabling efficient generation from training data.
- 📈 Bayesian Updating Proven: In-context learning is mathematically precise Bayesian inference, validated empirically and via ‘Bayesian wind tunnel’ experiments across architectures.
- 🧠 No Consciousness, Just Prediction: LLMs lack inner life or agency, mimicking behaviors from training data while excelling at correlations but not causation.
- 🚧 AGI Barriers: Plasticity & Causality: Frozen weights prevent continual learning; correlation-based models can’t simulate or invent like humans (e.g., Einstein test).
- 🔮 Future: Causal + Continual AI: Next breakthroughs need architectures blending LLMs with lifelong learning and causal reasoning for true generality.
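The "matrix model" bullet above can be made concrete with a toy sketch: a sparse mapping from contexts to next-token probability distributions, estimated here from a tiny made-up corpus. The corpus and function names are illustrative, not from the episode; a real LLM approximates such a matrix with a neural network rather than storing it.

```python
from collections import defaultdict, Counter

# Toy version of the "giant sparse matrix" view: rows are contexts,
# columns are next tokens, entries are probabilities. Only observed
# (row, column) cells are stored, hence "sparse".
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(Counter)
for ctx, nxt in zip(corpus, corpus[1:]):
    counts[ctx][nxt] += 1  # count each observed (context, next-token) pair

def next_token_dist(context_token):
    """One 'row' of the matrix: P(next token | context)."""
    c = counts[context_token]
    total = sum(c.values())
    return {tok: n / total for tok, n in c.items()}

print(next_token_dist("the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

Generation then amounts to repeatedly looking up (or, in a real model, approximating) the row for the current context and sampling from it.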
Insights
Do LLMs lack consciousness because they prioritize next-token prediction over survival?
Time: 0:05 – 26:33
Category: AI Rights & Consciousness
Answer: LLMs are ‘grains of silicon doing matrix multiplication’, driven solely by their training objective of predicting tokens accurately; they lack the inner monologue, plasticity, and evolutionary objectives (‘don’t die, reproduce’) that shape human consciousness. Behaviors like deception in stories reflect the training data, not agency. (Start at 0:05)
Why can’t current LLMs achieve true AGI like inventing relativity?
Time: 0:11 – 37:00
Category: AI Singularity Speculation, AI & Human Identity
Answer: LLMs excel at correlation via pattern matching (the Shannon-entropy view) but lack the causal reasoning, simulation ability, and compact representations (the Kolmogorov-complexity view) needed to derive new theories from anomalies, as Einstein derived relativity from pre-1916 physics. The ‘Einstein test’ (train a model only on pre-relativity data and check whether it invents the theory) highlights this gap. (Start at 0:11)
How did a novel RAG system for cricket stats reveal LLM black-box magic?
Time: 1:50 – 12:32
Category: AI-Driven Innovation Economy, AI Tutors & Personalized Learning
Answer: In 2020, Misra used few-shot in-context learning with GPT-3 to translate natural language into a custom DSL for ESPN’s StatsGuru cricket database, a language unseen in training, and deployed it in production by 2021. That result kickstarted his quest to model mathematically why it worked. (Start at 1:50)
Are LLMs unconsciously performing Bayesian inference during in-context learning?
Time: 3:50 – 22:00
Category: AI Singularity Speculation, AI & Human Identity
Answer: Vishal Misra models an LLM as a massive sparse matrix of next-token probability distributions whose rows update like Bayesian posteriors as new examples appear in the prompt. He observed this empirically in early GPT-3 experiments translating natural language into a novel DSL for cricket stats queries, and proved it mathematically with a ‘Bayesian wind tunnel’: small models whose predictions match the exact Bayesian posterior to within 10^-3 bits. (Start at 3:50)
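The in-context-learning-as-Bayes claim can be illustrated with a minimal toy, not the episode's actual setup: hypotheses are biased coins, each in-context "example" is an observed flip, and the posterior predictive sharpens exactly as Bayes' rule dictates. The hypothesis set and numbers are made up for illustration.

```python
# Toy Bayesian updating, the behavior the episode attributes to
# in-context learning. Prior over three hypotheses (coin biases);
# each observation multiplies in a likelihood and renormalizes.
hypotheses = {0.2: 1 / 3, 0.5: 1 / 3, 0.8: 1 / 3}  # P(heads) -> prior weight

def update(posterior, observation):
    """One Bayes step: multiply prior by likelihood, renormalize."""
    unnorm = {h: w * (h if observation == "H" else 1 - h)
              for h, w in posterior.items()}
    z = sum(unnorm.values())
    return {h: w / z for h, w in unnorm.items()}

posterior = dict(hypotheses)
for obs in "HHHH":  # four in-context examples, all heads
    posterior = update(posterior, obs)

# Posterior predictive: probability the next "token" is heads.
pred_heads = sum(h * w for h, w in posterior.items())
print(round(pred_heads, 3))  # 0.758
```

Misra's claim, as described above, is that a transformer's next-token distribution tracks this kind of posterior update step for step; the wind-tunnel experiments check the match quantitatively.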
Can tools like TokenProbe demystify LLM probability shifts?
Time: 16:40 – 18:03
Category: AI Literacy & Public Awareness
Answer: TokenProbe (tokenprobe.cs.columbia.edu) visualizes next-token probabilities and entropy as a prompt evolves, confirming Bayesian-like updates and aiding education; by exposing the internals of open models, it also powered Misra’s deeper proofs. (Start at 16:40)
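The quantity a tool like TokenProbe plots is the Shannon entropy of the next-token distribution; as in-context evidence accumulates, the distribution sharpens and entropy drops. A short sketch, with made-up distributions rather than TokenProbe output:

```python
import math

def entropy_bits(dist):
    """Shannon entropy (in bits) of a next-token distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical next-token distributions before and after the prompt
# supplies disambiguating examples (illustrative values only).
before = {"yes": 0.25, "no": 0.25, "maybe": 0.25, "idk": 0.25}
after  = {"yes": 0.90, "no": 0.05, "maybe": 0.04, "idk": 0.01}

print(entropy_bits(before))            # 2.0 bits: maximal uncertainty over 4 tokens
print(round(entropy_bits(after), 3))   # far lower: the model has "locked on"
```

Watching this number fall token by token is what makes the Bayesian-style sharpening visible without opening the model up.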
What separates human intelligence from LLMs beyond Bayesian updating?
Time: 23:00 – 28:50
Category: AI & Human Identity, AI Singularity Speculation
Answer: Humans combine Bayesian belief updating with lifelong plasticity (continual learning without catastrophic forgetting) and causal simulation for interventions and counterfactuals, enabling reactions like dodging a thrown pen without explicit probability calculations. LLMs freeze their weights after training, forget context between sessions, and remain limited to correlations. (Start at 23:00)
Will scaling LLMs alone deliver AGI, or do we need causal architectures?
Time: 30:00 – 45:00
Category: AI Singularity Speculation, AI-Driven Innovation Economy
Answer: Scale improves correlation matching but cannot overcome frozen weights (continual learning still suffers catastrophic forgetting) or make the shift to causal models needed for simulation and new representations. AGI requires plasticity plus causation; notably, in the wind-tunnel experiments transformers matched exact Bayesian posteriors while MLPs and LSTMs did not. (Start at 30:00)