Mira Murati's Thinking Machines Lab has released its first AI model, positioning it as a direct challenge to OpenAI's GPT-4o Realtime and Google's Gemini Live. The startup argues that existing voice AI systems fundamentally misunderstand how humans interact with language.
The core difference is architectural. Thinking Machines' model ingests audio, video, and text simultaneously, in 200-millisecond chunks, rather than waiting for complete utterances before responding. This incremental, multimodal processing creates what the team calls true interactivity. OpenAI's voice mode, despite its real-time marketing, still operates on a question-and-answer framework that forces users into discrete conversational turns. Murati's model abandons that constraint entirely.
The timing matters. Voice interfaces represent the next major shift in how users access AI. OpenAI's GPT-4o Realtime impressed early users with latency under one second, but Thinking Machines identifies a deeper problem. Even fast responses don't capture natural conversation. Humans interrupt, overlap speech, and respond to incomplete thoughts. They don't wait for questions to end before beginning answers.
Thinking Machines' approach mirrors how humans actually listen and think. By processing audio streams in 200-millisecond blocks, the system can generate output while the user still speaks. This eliminates the artificial pause between input and response that currently defines voice AI interactions.
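To make that loop concrete, here is a minimal sketch of incremental chunked processing. Everything in it is assumed for illustration: the `StubModel` class and its `ingest` and `poll_output` methods are invented for this sketch, not Thinking Machines' actual interface.

```python
from collections import deque

CHUNK_MS = 200  # block size the article attributes to the model

class StubModel:
    """Stand-in for the real model; this interface is invented for the sketch."""
    def __init__(self):
        self._ready = deque()

    def ingest(self, chunk):
        # A real model would update internal state on every block;
        # the stub just queues a toy reply once it "hears" a question mark.
        if chunk.endswith("?"):
            self._ready.append("<reply audio>")

    def poll_output(self):
        # Drain whatever output is ready *now*, even mid-utterance.
        while self._ready:
            yield self._ready.popleft()

def stream_conversation(model, mic_chunks, play):
    for chunk in mic_chunks:             # each chunk covers CHUNK_MS of speech
        model.ingest(chunk)              # state updates as audio arrives
        for out in model.poll_output():
            play(out)                    # reply can overlap the user's speech

# Toy run: the reply appears before the last chunk has even been heard.
stream_conversation(StubModel(),
                    ["how", "are", "you?", "today"],
                    play=lambda out: print("model:", out))
```

The point of the sketch is the ordering: output can be emitted after any 200-millisecond block, not only after an end-of-utterance signal.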
The model processes all three modalities simultaneously, not sequentially. This parallel architecture differs fundamentally from pipeline systems that transcribe audio to text, reason over that text, and then synthesize speech: integration happens inside a single model from the start. The distinction reads roughly like the sketch below.
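This is a structural sketch only; every function and class in it is invented to contrast the two designs, not drawn from any vendor's API.

```python
# Cascade design: each stage blocks on the previous one, so latency
# accumulates and non-textual cues are lost at the transcription step.
def transcribe(audio):  return "user said: " + audio        # stub ASR
def respond(text):      return "reply to [" + text + "]"    # stub LLM
def synthesize(text):   return "<speech: " + text + ">"     # stub TTS

def cascade_respond(audio):
    return synthesize(respond(transcribe(audio)))

# Fused design: one model consumes all modalities in a single step,
# so nothing waits on an intermediate text transcript.
class FusedModel:
    def step(self, audio=None, video=None, text=None):
        present = [name for name, value in
                   (("audio", audio), ("video", video), ("text", text))
                   if value is not None]
        return "<speech conditioned on " + "+".join(present) + ">"

print(cascade_respond("hello"))                      # three chained stages
print(FusedModel().step(audio="hello", video="f0"))  # one integrated pass
```

The cascade's weakness is not just additive latency: anything that cannot survive the trip through text, such as tone or the timing of an interruption, never reaches the reasoning stage at all.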
Competition in voice AI is intensifying. Google and OpenAI both control integrated ecosystems that give them distribution advantages. Thinking Machines enters as a pure model play, betting that interactivity quality alone drives adoption. The startup must convince developers and users to choose its model without the distribution channels its rivals already control.