Meta has announced that it has built and open-sourced ‘No Language Left Behind’ NLLB-200, a single AI model that is the first to translate across 200 different languages with state-of-the-art results, including 55 African languages. Meta is using the modelling techniques and learnings from the project to improve and extend translations on Facebook, Instagram, and Wikipedia.
To develop high-quality machine translation for most of the world’s low-resource languages, Meta designed this single AI model with a focus on African languages. These languages are challenging for machine translation because AI models need large amounts of data to learn from, and little human-translated training data exists for them.
Meta worked with professional translators for each of these languages to develop a reliable benchmark that can automatically assess translation quality for many low-resource languages. Professional translators also carried out human evaluation, meaning that people who speak the languages natively assessed what the AI produced.
“It’s impressive how much AI is improving all of our services. We just open-sourced an AI model we built that can translate across 200 different languages — many of which aren’t supported by current translation systems. We call this project No Language Left Behind, and the AI modelling techniques we used are helping make high quality translations for languages spoken by billions of people around the world,” said Meta CEO Mark Zuckerberg in a post on his Facebook profile.
“The advances here will enable more than 25 billion translations every day across our apps. Communicating across languages is one superpower that AI provides, but as we keep advancing our AI work it’s improving everything we do — from showing the most interesting content on Facebook and Instagram, to recommending more relevant ads, to keeping our services safe for everyone,” he added.
“Africa is a continent with very high linguistic diversity, and language barriers exist day to day. We are pleased to announce that 55 African languages will be included in this machine translation research, making it a major breakthrough for our continent,” said Balkissa Ide Siddo, Public Policy Director for Africa, speaking about the launch of the AI model.
“In the future, imagine visiting your favourite Facebook group, coming across a post in Igbo or Luganda, and being able to understand it in your own language with just a click of a button – that’s where we hope research like this leads us. Highly accurate translations in more languages could also help to spot harmful content and misinformation, protect election integrity, and curb instances of online sexual exploitation and human trafficking.”
To confirm that the translations are high quality, Meta also created a new evaluation dataset, FLORES-200, and measured NLLB-200’s performance in each language. Results revealed that NLLB-200 exceeds the previous state of the art by an average of 44 percent.
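As a rough illustration of what an "average of 44 percent" could mean, the sketch below averages per-language relative gains over a previous system. This is only one plausible reading of the figure; the article does not say which metric or averaging method Meta used. The function names, language labels, and scores are all hypothetical and are not FLORES-200 results.

```python
from statistics import mean


def relative_improvement(new_score: float, old_score: float) -> float:
    """Return the percentage gain of new_score over old_score."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return (new_score - old_score) / old_score * 100.0


def average_improvement(scores: dict[str, tuple[float, float]]) -> float:
    """Average the per-language relative gains.

    `scores` maps a language label to (old_score, new_score).
    """
    return mean(
        relative_improvement(new, old) for old, new in scores.values()
    )


# Hypothetical quality scores, invented purely for illustration.
example_scores = {
    "lang_a": (10.0, 15.0),  # new system scores half again as high
    "lang_b": (20.0, 28.0),
    "lang_c": (25.0, 30.0),
}

if __name__ == "__main__":
    print(f"Average relative improvement: {average_improvement(example_scores):.1f}%")
```

Averaging per-language gains gives each language equal weight, so a large gain on a low-resource language counts as much as a gain on a widely spoken one. Computing the gain on the mean score instead would generally give a different number.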
Meta is open-sourcing the NLLB-200 model and publishing a set of research tools so that other researchers can extend this work to more languages and build more inclusive technologies. Meta AI is also providing up to $200,000 in grants to non-profit organizations for real-world applications of NLLB-200.
Meta has also published a demo of NLLB-200 showing how the model can translate stories from around the world, along with the accompanying research paper.