The world of AI continues to evolve and expand, with new players entering the field every time we blink. We’ve known for some time that Meta wants to make its own language model like the one behind ChatGPT, but the company has done something a bit more exciting, at least in the broader scheme of things, with the reveal of its SeamlessM4T multimodal AI model.
To understand why the reveal is so exciting, let’s first look at what SeamlessM4T is. At its most basic level, SeamlessM4T is a multilingual, multimodal AI translation and transcription model. While we have seen translation models before, SeamlessM4T handles speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation, all in a single model.
It can recognize speech in almost 100 different languages, and speech-to-text translation is available for nearly 100 input and output languages. To put it bluntly, this model is an all-in-one translation tool that can bridge the gap between speakers of different languages. What’s even more exciting than the possibilities here is how Meta is releasing this model.
Unlike the models behind ChatGPT, GPT-3.5 and GPT-4, SeamlessM4T is publicly available: Meta has released it under a noncommercial research license (CC BY-NC 4.0), allowing researchers to pick up the code and adapt it to their own applications. This will allow hundreds, if not thousands, of AI researchers to take what Meta has built and improve it in different ways.
“Building a universal language translator, like the fictional Babel Fish in The Hitchhiker’s Guide to the Galaxy, is challenging because existing speech-to-speech and speech-to-text systems only cover a small fraction of the world’s languages,” Meta wrote in its announcement post. Because SeamlessM4T uses a single model instead of chaining several separate ones, Meta believes it will reduce errors and delays in translation, making it more effective.
The current state of translation tools is disappointing, especially considering how few languages they support. So if Meta’s SeamlessM4T is as strong as the company says, it could open new doors in how we communicate with people who speak different languages, making it easier to collaborate on important research and science going forward.