Auto translate speech

8/7/2023

According to Meta, the model is the first AI-powered speech translation system for the unwritten language Hokkien, a Chinese language spoken in southeastern China and Taiwan and by many in the Chinese diaspora around the world.

The system allows Hokkien speakers to hold conversations with English speakers, a significant step toward breaking down the global language barrier and bringing people together wherever they are located, even in the metaverse.

This is a difficult task since, unlike Mandarin, English, and Spanish, which are both written and spoken, Hokkien is predominantly an oral language.

Meta AI's Mark Zuckerberg demonstrates the company's speech-to-speech AI translation model.

How AI can tackle speech-to-speech translation

Meta says that today's AI translation models focus on widely spoken written languages, and that more than 40% of primarily oral languages are not covered by such translation technologies.

The UST project builds on the progress Zuckerberg shared during the company's AI Inside the Lab event held back in February, about Meta AI's universal speech-to-speech translation research for languages that are uncommon online. That event focused on using such immersive AI technologies for building the metaverse.

To build UST, Meta AI focused on overcoming three critical translation system challenges. It addressed data scarcity by acquiring more training data in more languages and finding new ways to leverage the data already available. It addressed the modeling challenges that arise as models grow to serve many more languages. And it sought new ways to evaluate and improve on its results.

Meta AI's research team worked on Hokkien as a case study for an end-to-end solution, from training data collection and modeling choices to benchmarking datasets. The team focused on creating human-annotated data, automatically mining data from large unlabeled speech datasets, and adopting pseudo-labeling to produce weakly supervised data.

How Meta's universal speech translator (UST) works

For the modeling, Meta AI applied recent advances in using self-supervised discrete representations as targets for prediction in speech-to-speech translation, and demonstrated the effectiveness of leveraging additional text supervision from Mandarin, a language similar to Hokkien, in model training.

"Our team first translated English or Hokkien speech to Mandarin text, and then translated it to Hokkien or English," said Juan Pino, researcher at Meta. "They then added the paired sentences to the data used to train the AI model."

The model uses speech-to-unit translation (S2UT) to convert input speech directly into a sequence of acoustic units, an approach Meta previously pioneered.

Meta AI says it will also release a speech-to-speech translation benchmark set to facilitate future research in this field.

William Falcon, AI researcher and CEO/cofounder of Lightning AI, said that artificial speech translation could play a significant role in the metaverse because it helps stimulate interactions and content creation.

"For interactions, it will enable people from around the world to communicate with each other more fluidly, making the social graph more interconnected. In addition, using artificial speech translation for content allows you to easily localize content for consumption in multiple languages," Falcon told VentureBeat.

Falcon believes that a confluence of factors, such as the pandemic having massively increased the amount of remote work and the reliance on remote working tools, has led to growth in this area.

"Soon, we can look forward to hosting podcasts, Reddit AMA, or Clubhouse-like experiences within the metaverse. These tools can benefit significantly from speech translation capabilities. Enabling those to be multicast in multiple languages expands the potential audience on a massive scale," he said.
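The Mandarin-pivot pseudo-labeling step that Pino describes can be sketched roughly as follows. This is an illustrative sketch only, not Meta's implementation: the translation functions are stand-in stubs (in the real system each would be a trained ASR/MT component), and the acoustic-unit format is invented for the example.

```python
# Hedged sketch of pivot-based pseudo-labeling: turn (English audio,
# English transcript) pairs into weakly supervised (English audio,
# Hokkien acoustic-unit sequence) pairs, using Mandarin text as pivot.
# All function bodies below are hypothetical stubs, not real models.

def english_text_to_mandarin(text: str) -> str:
    """Stub standing in for an English -> Mandarin text translation model."""
    return f"<zh:{text}>"

def mandarin_text_to_hokkien_units(text: str) -> list[str]:
    """Stub standing in for a Mandarin-text -> Hokkien unit model.

    Because Hokkien has no standard written form, the training target is
    a sequence of discrete acoustic units rather than text.
    """
    return [f"u{i}" for i, _ in enumerate(text.split())]

def pseudo_label(corpus: list[dict]) -> list[dict]:
    """Build weakly supervised speech-to-unit training pairs via the pivot."""
    pairs = []
    for example in corpus:
        mandarin = english_text_to_mandarin(example["transcript"])
        units = mandarin_text_to_hokkien_units(mandarin)
        pairs.append({"audio": example["audio"], "target_units": units})
    return pairs

corpus = [{"audio": "clip_001.wav", "transcript": "hello world"}]
print(pseudo_label(corpus))
# → [{'audio': 'clip_001.wav', 'target_units': ['u0', 'u1']}]
```

The resulting audio-to-unit pairs would then be added to the supervised training data, which is the "added the paired sentences" step in the quote above.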