Google’s New Gemini Omni AI Is Too Much. All The Good Stuff From Google I/O. (28 min)

ai-human-identity

ai-in-everyday-life

Release date: 2026-05-20

Episode description:

Google I/O 2026 just dropped Gemini Omni, a world-model AI that simulates physics, edits video, and might be the biggest leap since Seedance 2. But it's not perfect. Gavin and Kevin break down everything from Google I/O 2026, including the launch of Gemini Omni (Google's new world model), Gemini 3.5 Flash benchmarks against GPT-5.5 and Opus 4.7, the Gemini Spark personal agent, AskYouTube, Docs Live, new AI glasses, the first search box redesign in 25 years, and the shocking news that Andrej Karpathy is joining Anthropic.

SHOW LINKS:
- Google I/O 2026 Full Keynote: https://www.youtube.com/live/wYSncx9zLIU?si=Nb881MfGTlf1Q0II
- Gemini Omni physics demos from Google DeepMind: https://x.com/GoogleDeepMind/status/2056786449312493669?s=20
- Gemini Omni's incredible London knowledge (via fofrAI): https://x.com/fofrAI/status/2056789242274259242?s=20
- Sundar Pichai and Demis Hassabis on Omni video editing: https://x.com/sundarpichai/status/2056524502746747048?s=20
- Gavin's hands-on Gemini Omni experiments: https://x.com/gavinpurcell/status/2056762427879182692?s=20
- Gemini Omni's character cameo feature (less impressive): https://x.com/gavinpurcell/status/2056772793539481830?s=20
- Gemini Omni volleyball fail: https://x.com/flavioAd/status/2056771223359549645?s=20
- Google's new Content Credentials Verification: https://x.com/Google/status/2056787498676658576?s=20
- Genie 3 IRL (Google's world model now simulates real streets with Street View): https://techcrunch.com/2026/05/19/googles-genie-world-model-can-now-simulate-real-streets-with-street-view/
- Bilawal Sidhu on Genie 3 IRL: https://x.com/bilawalsidhu/status/2056804315721843024?s=20
- Gemini 3.5 Flash launches (official announcement): https://x.com/GeminiApp/status/2056788115893993701?s=20
- Gemini Spark, Google's new personal coding agent: https://x.com/Google/status/2056791134295273554?s=20
- Google's new AI glasses: https://x.com/backlon/status/2056807059707036050?s=20
- Andrej Karpathy joins Anthropic to focus on recursive self-learning: https://www.axios.com/2026/05/19/anthropic-openai-karpathy-andrej-claude

Summary

🎬 Video Editing with Consistent Characters: Gemini Omni enables seamless video-to-video editing, swapping characters and backgrounds while keeping voice, face paint, and beard consistent. The hosts see this as a huge leap in generative video AI.
🏃 Speed Over Raw Intelligence: Gemini 3.5 Flash is not the smartest model but is up to 4x faster than competitors, suggesting that for daily tasks and agent swarms, speed may matter more than peak performance.
🤖 Persistent Agents That Run for Days: Google’s Spark lets users launch long-running AI agents that operate for anywhere from hours to weeks across cloud services, connecting to email, Drive, and calendar to autonomously complete tasks.
📹 AI Video Physics Still Falls Short: Despite impressive editing, Omni can still generate physically implausible results (e.g., volleyballs flying backward) and can’t maintain character consistency in very long generations.
🔍 Ask YouTube May Disrupt Tutorial Creators: Google’s upcoming AI-powered Ask YouTube feature will surface and summarize video chapters, potentially reducing creator view counts by letting users get answers without watching full videos.

Insights

Can video generation models now edit real-world footage with character consistency across multiple changes?
- Time: 04:26 – 06:12
- Answer: The podcast demonstrates Gemini Omni’s video-to-video editing capability. Starting with a scientist explaining flatulence, they swap the character to a Viking, then a bee costume, and add a TikTok-dancing wife in the background. The character’s face paint, beard, and voice remain consistent, and the background changes automatically to fit the new character. This shows a leap beyond simple generation into holodeck-like editing where the model understands the context and maintains consistency across edits.
Will video generation models replace the need for manual motion graphics and explainer videos?
- Time: 09:45 – 10:03
- Answer: The hosts see Omni as a game-changer for education and explainer content. Rather than creating motion graphics from scratch, users can prompt the model to create a claymation or comic-style video explaining a complex concept. The built-in styles and seamless character/background swapping suggest this could significantly reduce production time and cost for educational videos.
Why do AI video models still struggle with basic physics and object consistency?
Are watermarks and content credentials enough to prevent AI-generated deepfakes?
Will AI reshape how we search and consume video content, making traditional tutorials obsolete?
Is speed becoming more important than raw intelligence in large language models?
Are we moving from one-shot AI interactions to persistent agents that run over days or weeks?
Does talent movement from OpenAI to Anthropic signal a shift in frontier model development?