Google Is Cooking Again. The I/O Leaks Are Wild. (31 min)
ai-in-everyday-life
- Release date: 2026-05-15
- Episode description:
Thanks to @HPInc & Intel for sponsoring us! More on the ZBook Fury https://bit.ly/4uapNHs
Google I/O is next week and the AI leaks are pouring out: a new Spark agent, Veo 4 Omni, Gemini 3.2 Flash that's reportedly 20x cheaper than GPT-5.5.
This week on AI For Humans, Google is cooking again and the I/O leaks are stacking up. We dig into Google Spark, a new Gemini agent that may have access to your entire digital life. Veo 4 Omni model leaks suggest deeper reasoning and character consistency, and the model gets math right. Gemini 3.2 Flash is rumored to deliver 90% of GPT-5.5's capability at a fraction of the cost and dramatically faster speeds. There's a new GoogleBook with Gemini built in. And Google is reinventing the mouse cursor, the input device that's been largely unchanged since 1968, with voice AI.
Plus, Thinking Machines dropped voice interactivity demos that feel a lot like ChatGPT Voice from two years ago. OpenAI is reportedly already working on GPT-5.6, and Sam Altman is giving away two free months of Codex to companies to drive adoption. Gavin's been experimenting with local open-source LLMs and shares his setup.
AND…we get into the data center sickness conversation: infrasound from data centers may be causing cortisol spikes in nearby communities. Figure 03's package sorting livestream proved the robot is autonomous after skeptics accused it of being teleoperated. Unitree dropped a transformable robot.
AI KEEPING US UP AT NIGHT. NO MATTER. WE COOK.
// Show Links //
- Google Spark: Gemini's Agent With Access To Your Life https://x.com/kimmonismus/status/2054855742247584231?s=20
- Veo 4 Omni Model Leaks: Gets Math Right https://x.com/TomLikesRobots/status/2053845600051798065?s=20
- More Veo 4 Omni Examples https://x.com/testingcatalog/status/2053718756799467735?s=20
- Omni Model Added To Gemini Web Build https://x.com/testingcatalog/status/2054196983523393857?s=20
- Gemini 3.2 Flash At 90% Of GPT-5.5 For Way Less https://x.com/kimmonismus/status/2054887891222802633?s=20
- New GoogleBook With Gemini Built In https://x.com/Google/status/2054270454467121187?s=20
- Google DeepMind: Rethinking The Mouse Cursor With Voice AI https://deepmind.google/blog/ai-pointer
- Thinking Machines Voice Interactivity Demos https://thinkingmachines.ai/blog/interaction-models/
- Sam Altman: Two Months Of Free Codex For Companies https://x.com/sama/status/2054626219858293128?s=20
- Data Center Sickness: Ben Jordan's Video On Infrasound https://youtu.be/_bP80DEAbuo
- Figure 03 Package Sorting Livestream https://www.youtube.com/live/luU57hMhkak?si=KZHwUdYUwY4SIRUp
- Brett Adcock: Figure 03 Was Not Teleoperated https://x.com/adcock_brett/status/2054737974710169840?s=20
- Unitree Transformable Robot https://x.com/UnitreeRobotics/status/2054067819634159622?s=20
Summary
- 🤖 Google Spark Agent: A rumored AI agent that integrates deeply with Google’s ecosystem (Gmail, Calendar, Docs) to provide seamless, personalized assistance without complex setups.
- 🎥 Unified Multimodal Models: Google’s rumored Veo 4 Omni reflects the trend of combining video, reasoning, and audio in one model, moving toward all-in-one content creation tools.
- ⚡ Fast & Cheap AI Models: Gemini 3.2 Flash is reported to be 20x cheaper than GPT-5.5 and 90% as capable, potentially becoming the affordable, fast ‘daily driver’ for AI tasks.
- 🖱️ Context-Aware Mouse Cursor: Google DeepMind is developing an AI-enhanced cursor that offers context-specific suggestions and actions, representing AI integration at the OS level.
- 🏭 Real-World AI Impact: From humanoid robots autonomously sorting packages for 24 hours to hidden health risks of data centers, the episode shows the tangible benefits and challenges of AI infrastructure.
Insights
- How will personalized AI agents like Google Spark reshape our daily productivity?
- Time: 2:23 – 4:06
- Answer: Google Spark is rumored to be an AI agent that integrates deeply with Google services like Gmail, Calendar, and Docs. The podcast suggests that, unlike current third-party plugins, Spark would be native and seamless, which could make it much easier for non-technical users to automate tasks and access their data without complex setups. This could normalize AI assistance in everyday digital life.
- Are all-in-one multimodal models the future of content creation?
- Time: 5:23 – 8:36
- Answer: Google’s rumored Veo 4 Omni model combines video generation, reasoning, and audio in one system. The podcast highlights how this unified approach could eliminate the need for separate tools, making AI-assisted content creation more fluid. Even if its individual capabilities don’t yet surpass specialized models like Seedance 2, the integration itself is a significant step forward.
- Can a fast and cheap AI model become your ‘daily driver’?
- Will context-aware AI mouse cursors redefine human-computer interaction?
- How do bidirectional voice models create more natural AI conversations?
- Could local AI models solve privacy concerns with cloud-based assistants?
- What are the hidden health risks of AI infrastructure like data centers?
- Are advanced humanoid robots ready for autonomous factory work?