ElevenLabs founder and CEO Mati Staniszewski says voice is the next major interface for AI — the way people will increasingly interact with machines as models move beyond text and screens.
Speaking at Web Summit in Doha, Staniszewski told TechCrunch that voice models like those developed by ElevenLabs have recently moved beyond merely mimicking human speech, including emotion and intonation, to working alongside the reasoning capabilities of large language models. The result, he says, is a change in how people interact with technology.
In the next year, he said, “hopefully all the phones will be back in our pockets, and we can immerse ourselves in the real world around us, with the voice as the mechanism that controls the technology.”
That vision helped drive ElevenLabs’ $500 million raise this week at an $11 billion valuation, and it reflects a growing bet across the AI industry. OpenAI and Google are both making audio a central focus of their next-generation models, while Apple appears to be quietly building adjacent voice technology and remains active through acquisitions like Q.ai. As AI spreads to wearables, cars, and other new hardware, control becomes less about tapping a screen and more about speaking, making voice a key battleground for the next phase of AI development.
Iconiq Capital general partner Seth Pierrepont echoed that view on stage at Web Summit, arguing that while screens will continue to be important for gaming and entertainment, traditional input methods like keyboards are beginning to feel “outdated.”
And as AI systems become more agentic, Pierrepont says, those interactions will also change, with models gaining the guardrails, integrations, and context needed to respond to users’ less explicit requests.
Staniszewski pointed to that agentic shift as one of the biggest changes he sees coming. Instead of requiring users to spell out each instruction, he said, future voice systems will increasingly rely on persistent memory and context built up over time, making interactions feel more natural and demanding less effort from the user.
That evolution, he added, will affect the way voice models are deployed. While high-quality audio models generally live in the cloud, Staniszewski said ElevenLabs is pursuing a hybrid approach that mixes cloud and on-device processing — a move aimed at supporting new hardware, including headphones and other wearables, where sound is a constant companion rather than a feature you choose to engage with.
ElevenLabs has partnered with Meta to bring voice technology to products including Instagram and Horizon Worlds, the company’s virtual reality platform. Staniszewski said he would also be open to working with Meta on its Ray-Ban smart glasses as voice-driven interfaces evolve into new form factors.
But as voice becomes more persistent and embedded in everyday hardware, it raises serious questions about privacy, surveillance, and the amount of personal data that voice-based systems will store as they move deeper into users’ daily lives, issues that companies like Google have already faced scrutiny over.

