Startups like ElevenLabs have raised millions of dollars to build proprietary voice cloning technology. Now a new entrant, OpenVoice, offers voice cloning as open-source technology.
OpenVoice is the product of a collaboration between researchers at MIT, Tsinghua University, and MyShell, an AI startup. MyShell announced today on its official X account that it is releasing the OpenVoice algorithm as open source. The tool lets users clone a voice from a brief audio snippet and then adjust the vocal characteristics of the result.
The release of OpenVoice on January 2, 2024, was accompanied by a research paper explaining its development. The model can be tried in two places: the MyShell web app, which requires an account, and HuggingFace, which is publicly accessible without a login.
In my informal tests of the OpenVoice model on HuggingFace, it produced a clone of my own voice quickly and effectively. Users are not restricted to reading from a prescribed text; spontaneous speech is enough to create a clone. The system also lets users change the emotional tone of the cloned voice, for example making it sound cheerful or angry.
The OpenVoice development team consists of Zengyi Qin of MIT and MyShell, Wenliang Zhao and Xumin Yu of Tsinghua University, and Xin Sun of MyShell. In their published paper, they describe an approach built on two AI models: a text-to-speech model and a tone converter. The models were trained on thousands of audio samples across various languages and accents, and according to the team they produce highly nuanced voice clones with minimal computational resources.
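The two-model design can be illustrated with a toy sketch. This is not OpenVoice's code or API: every name below (`Speech`, `base_tts`, `extract_tone`, `tone_convert`, `clone_voice`) is hypothetical, and strings stand in for audio. It also assumes a particular division of labor, in which the text-to-speech stage sets the emotion and the tone converter changes only whose voice is heard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Speech:
    """Toy stand-in for an audio clip; real systems operate on waveforms."""
    text: str     # what is said
    emotion: str  # speaking style, assumed to be set by the first stage
    tone: str     # whose voice it sounds like


def base_tts(text: str, emotion: str = "neutral") -> Speech:
    """Stage 1 (hypothetical): speak the text in a fixed base speaker's voice."""
    return Speech(text=text, emotion=emotion, tone="base-speaker")


def extract_tone(reference_clip: str) -> str:
    """Hypothetical: derive a voice signature from a short reference clip."""
    return f"tone-of:{reference_clip}"


def tone_convert(speech: Speech, target_tone: str) -> Speech:
    """Stage 2 (hypothetical): replace only the tone; text and emotion are kept."""
    return Speech(text=speech.text, emotion=speech.emotion, tone=target_tone)


def clone_voice(text: str, reference_clip: str, emotion: str = "neutral") -> Speech:
    """Chain the two stages: synthesize first, then convert the tone."""
    return tone_convert(base_tts(text, emotion), extract_tone(reference_clip))


result = clone_voice("Hello there", "my_voice.wav", emotion="cheerful")
print(result)
```

Under this assumed split, changing the emotion touches only the first stage, and changing the reference clip touches only the second.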
MyShell was founded in Calgary, Alberta, in 2023 with a $5.6 million seed round and has since grown to more than 400,000 users. Backed by investors including INCE Capital and Folius Ventures, the startup describes itself as a decentralized hub for AI-native applications, which include AI characters, an animated GIF creator, and text-based RPGs.
Although OpenVoice is open source, MyShell still earns revenue from subscriptions to its web app, and it charges for bot promotions and AI training data.