
Thinking Machines Unveils Full-Duplex AI That Listens While It Talks
Mira Murati's Thinking Machines Lab introduces interaction models — AI that processes input and generates responses simultaneously, responding in 0.40 seconds like natural human conversation.
What Are Interaction Models?
Thinking Machines Lab, founded by former OpenAI CTO Mira Murati, announced a new concept called "interaction models" on May 11, 2026. Unlike conventional turn-based AI models, this system processes your input and generates a response at the same time, making conversations feel like a phone call rather than a text chain.
The technical term is "full duplex." Their model, TML-Interaction-Small, responds in just 0.40 seconds, roughly matching the speed of natural human conversation and significantly faster than comparable models from OpenAI and Google.
Why This Matters for AI Conversation
Current AI models work in a strict turn-taking pattern: you speak, it listens, then it responds, and you listen. This creates awkward pauses and makes real-time collaboration feel stilted. Full-duplex AI breaks this pattern entirely.
Imagine an AI assistant that can notice you're about to interrupt and pause mid-sentence, or one that can react to your facial expressions during a video call. That's the world interaction models are building toward.
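To make the contrast concrete, here is a toy Python sketch of the two patterns described above. It is an illustration only, not Thinking Machines' architecture: the function names, the one-word-per-tick timing, and the use of None to mean "user is silent" are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Frames = List[Optional[str]]  # one entry per tick; None means the user is silent (assumed convention)


def half_duplex(reply_words: List[str], user_frames: Frames) -> Tuple[List[str], List[str]]:
    """Turn-taking: finish the whole reply first, then look at what the user said."""
    spoken = list(reply_words)  # input is ignored until the reply is done
    heard = [frame for frame in user_frames if frame is not None]
    return spoken, heard


def full_duplex(reply_words: List[str], user_frames: Frames) -> Tuple[List[str], List[str]]:
    """Simultaneous: on every tick, check the input before emitting the next word."""
    spoken: List[str] = []
    heard: List[str] = []
    # zip stops at the shorter sequence, so pad user_frames with None if needed.
    for word, frame in zip(reply_words, user_frames):
        if frame is not None:
            heard.append(frame)  # the user started talking mid-reply
            break                # yield the floor instead of talking over them
        spoken.append(word)
    return spoken, heard


reply = ["Your", "order", "ships", "on", "Friday"]
frames = [None, None, "wait", None, None]  # the user cuts in on the third tick

print(half_duplex(reply, frames))
print(full_duplex(reply, frames))
```

In the turn-taking version the assistant says all five words before it registers the interruption. In the simultaneous version it stops after two words, because listening happens on the same tick as speaking.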
Current Status and Availability
This is research-stage technology, not a consumer product. Thinking Machines plans to release a limited research preview in the coming months, with a wider release planned for later in 2026. The reported benchmarks are impressive, but real-world performance remains to be tested.
The underlying idea — that interactivity should be native to a model architecture rather than bolted on afterward — represents a genuine paradigm shift in how we think about AI communication.
How This Could Change Business Communication
For businesses, full-duplex AI could transform customer service, sales calls, and internal meetings. AI agents could handle real-time negotiations, detect customer frustration mid-conversation, and adjust their approach instantly. The latency reduction alone makes AI-powered phone support dramatically more natural.
Common Questions (FAQ)
Q: When can I try Thinking Machines' interaction model? A: A limited research preview is coming in the next few months, with wider access planned for late 2026.
Q: How is this different from real-time voice features in ChatGPT? A: ChatGPT's voice mode still processes in turns. Full duplex means simultaneous listening and speaking, like a real phone call.
Q: Will this work for enterprise applications? A: The company hasn't announced enterprise pricing or availability yet, but the low-latency design suits real-time business uses such as phone support.