SEOUL, Korea, Feb. 22, 2022 /PRNewswire-PRWeb/ -- Neosapience, the startup behind Typecast, an AI-powered virtual actor service that creates synthetic voices and videos, today announced two new offerings that will advance the world of content development. It has introduced the first commercially available virtual actors that sing and rap in English. It also launched AI-powered video actors that connect Neosapience’s industry-leading artificial voice technology with distinct aspects of an individual’s appearance and demeanor to create a highly realistic virtual representation. These offerings elevate the potential for virtual actors by providing depth and richness that have been missing from today’s landscape. The news comes as the company disclosed it has raised $21.5 million in Series B funding.
AI-powered voice and video technologies are in high demand as audio and video replace text-based media, and content providers from YouTubers to Netflix and Disney+ strive to make their content accessible to global audiences. But mass production of audio and video remains inefficient in terms of time and cost, because audio dubbing and video animation are still extremely labor-intensive tasks, and current avatars and virtual humans require real human actors to perform. The cloud-based Typecast service changes all of this by turning any space into a studio. All users need is a laptop and a script or song, with no expensive equipment required, which opens up tremendous possibilities for content creators all over the world. Typecast is also ideally suited for anyone planning to jump into the metaverse.
A New Vision: AI-Powered Virtual Actors for Everyone
Founded by former Qualcomm engineers, Typecast was built around the idea that AI should not only recognize and generate voices but also incorporate emotional expression. Conveying emotion while fully synthesizing speech is very difficult from a technical standpoint. Typecast has become a global leader in this arena due to its ability to generate truly natural-sounding speech.
Given just a small sample of a person’s voice, the Typecast platform learns the voice features and generates speech with the same voice identity, even in a different language. A user simply chooses a voice actor or actors and uploads text. Typecast does the rest, enabling anyone to control artificial beings with a text script. This provides a unique way for human actors to productize their voices as intellectual property and for consumers to make highly viewed content with valuable voices.
To expand these capabilities to singing and rap compositions, which can be even more complex, Typecast leverages voice synthesis, a type of generative modeling task in which a model is trained to imitate a human voice, similar to text-to-speech. Singing voice synthesis (SVS) is a subset of voice synthesis in which a model generates a singing voice from a text and an accompanying score, such as a piano roll or a MIDI file. With SVS, even a terrible singer can become a good one. In fact, Neosapience researchers recently showed that AI can produce higher-quality audio and shared their findings in the academic paper “MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis.” This is a major achievement, given that the AI system must also incorporate and match rhythm as well as conversation style and range, plus adjust pitch as needed.
“We’ve pushed our technology to the point where virtual humans can really sing or rap with an authentic-sounding AI voice. This will open all kinds of new opportunities for content owners and creators, as well as voice actors,” said Neosapience Founder and CEO Taesu Kim. “With this technology, people can quickly, easily and cost-effectively create amazing roles for artificial humans, be they singers, announcers, lecturers or a host of other possibilities.”
Another Step Forward: AI-Powered Video Characters
Just as its voice technology captures emotion and intonation, Typecast’s AI-powered video actors can express emotion through facial expressions and, in the future, even gestures. This provides a much deeper level of personalization so that AI-powered actors can be perfectly cast for new types of experiences. They are also particularly well suited for the metaverse.
Virtual video actors are poised for rapid adoption because they do not have the space, cost and time constraints of human actors, and they can be easily designed according to the brand values of companies and individuals. Additionally, they do not age, and they can be dropped into any environment.
The problem with such AI-powered virtual actors to date, however, has been that the technology limits what creators are able to produce. Additionally, even basic representations have required full studios of equipment and people with a deep level of expertise, which can become cost-prohibitive. Typecast, however, delivers best-of-breed technology to anyone. Users’ laptops become the studio. They construct their actor in just a few clicks and select its voice from an array of options. Then they upload their text or script to the platform, where the selected actor comes to life in video form.
“Typecast’s various AI voice actors have helped content creators all over the world unleash their creativity, but many users have still required additional tools to edit videos after our product created the audio,” said Kim. “Applying our AI technology to cast AI video characters was the next logical step for our company. Now we can make it remarkably simple for anyone to execute their stories in new ways without needing any expensive equipment or a special studio. All they need is a script, a laptop and an idea.”
Virtual actors for video are available today for all Pro and Pro Plus plan subscribers at http://www.typecast.ai.
Neosapience is a startup founded in 2017 by former Qualcomm engineers and KAIST graduates. Its goal is to make AI-powered virtual actors available to everyone. The team has developed a core technology that synthesizes realistic, emotional voices and faces from a given text script.
Neosapience’s Typecast service builds on these technologies to help content creators make audio and video without casting human actors. This will lead to a revolution in the media entertainment industry as content creation becomes cheaper and easier. Learn more at http://www.neosapience.com.
Amber Moore, Moore Communications and Consulting, +1 (503) 943-9381