Film & Gaming Studios Using Emotionally Expressive AI-Powered Voice Technology
Text-to-speech technology was invented in 1968, marking the first time a computer was made to read words aloud in a robotic voice. But what if that digitized voice could sound more human? What if it could express emotions? What if it could be used in movies and video games?
That’s exactly what the team behind Sonantic set out to accomplish.
Since its founding just two years ago, the UK-based startup has developed a groundbreaking approach to text-to-speech that separates its voices from standard robotic ones by making them sound genuinely human. Creating that “believability” factor is at the core of Sonantic’s voice platform.
Last spring, Sonantic shared its beta version with the world: “Faith, the First AI That Can Cry.” One year later, the company has officially launched the technology, which can now express the full spectrum of human emotions — from joy and surprise to anger and sadness.
To demonstrate its voice-on-demand technology, Sonantic has released a demo video highlighting its partnership with Obsidian, an AAA gaming studio and subsidiary of Xbox Game Studios.
“Sonantic’s voice-on-demand technology is unlike anything I’ve ever seen in my career,” commented Obsidian Entertainment Audio Director Justin E. Bell, whose production timelines and associated costs have been slashed through this new capability. “Working in game development, we could send a script through Sonantic’s API — and what we would get back is no longer just robotic dialogue: it is human conversation. This technology can empower our creative process and ultimately help us to tell our story.”
Sonantic partners with experienced actors to create its voice models. Clients can choose from existing voice models or work with Sonantic to build custom voices for unique characters. Project scripts are then uploaded to Sonantic’s platform, where a client’s audio team can choose from a variety of high-fidelity speech synthesis options, including pitch, pacing, projection, and an array of emotions.
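To make the workflow concrete, the sketch below shows what assembling such a synthesis request might look like in code. Sonantic has not published a public API reference, so the function name, field names (`voice_model`, `emotion`, `pitch`, `pacing`, `projection`), and payload shape here are purely illustrative assumptions, not documented behavior.

```python
# Hypothetical sketch: building a per-line synthesis request for a
# voice-on-demand platform. All field names are illustrative assumptions.

def build_tts_request(line: str, voice_model: str, *,
                      emotion: str = "neutral",
                      pitch: float = 1.0,       # relative pitch multiplier (assumed)
                      pacing: float = 1.0,      # relative speaking rate (assumed)
                      projection: float = 0.5   # vocal intensity, 0..1 (assumed)
                      ) -> dict:
    """Assemble one synthesis request for a single script line."""
    if not 0.0 <= projection <= 1.0:
        raise ValueError("projection must be between 0 and 1")
    return {
        "text": line,
        "voice_model": voice_model,
        "performance": {
            "emotion": emotion,
            "pitch": pitch,
            "pacing": pacing,
            "projection": projection,
        },
    }

# A script could then be rendered line by line, varying the emotion
# per line of dialogue:
script = [
    ("We can't hold this position much longer.", "anger"),
    ("Then we fall back. Together.", "sadness"),
]
payloads = [
    build_tts_request(text, "custom-character-01", emotion=emotion)
    for text, emotion in script
]
```

The point of the sketch is the shape of the interaction: the audio team supplies plain script text plus performance directions, and the platform returns finished dialogue audio rather than raw robotic speech.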
Gaming and film studios are not the only beneficiaries of Sonantic’s platform. Actors can maximize both their time and talent by turning their voices into a scalable asset. Sonantic’s revenue-share model empowers actors to generate passive income every time their voice model is used for a client’s project, spanning development, pre-production, production and post-production.
Backed by the likes of EQT Ventures, Krafton Ventures and Kevin Lin (co-founder of Twitch), Sonantic counts 200+ top-tier film and gaming studios as clients, with 1,000 more on a waitlist. In addition to Obsidian Entertainment, Sonantic has partnered with Splash Damage, Sumo Digital, 4A Games, Embark, Red Thread Games and Fatshark, amongst others.
Sonantic’s co-founders Zeena Qureshi (CEO) and John Flynn (CTO) are committed to ensuring that both studios and actors benefit from their technology, which they liken to the audio version of CGI animation. They predict that in five years’ time, artificial voices will be the new normal for the entertainment industry.
“Hearing that cry was an incredible moment for our team last year. Today, we’re proud of this major milestone for our company: AI-powered voice models that can express the full spectrum of human emotion,” said Flynn.
Qureshi added, “We’re excited to offer studios the opportunity to produce work in an unprecedented timeframe, and for our actors to maximize their time, talent and earning potential.”