First Languages AI Reality

The First Languages AI Reality (FLAIR) initiative enables the next chapter in Indigenous language revitalization with artificial intelligence (AI) and immersive technology. 

Logo of the project and photo of an Indigenous mother talking with her child.

Background

More than 50% of the world’s languages are projected to become extinct or seriously endangered by 2100. The extinction of a language results in the irrecoverable loss of unique cultural, historical, and ecological knowledge. Since each language is a unique expression of the human experience of the world, the knowledge it encodes may be the key to answering fundamental questions of the future.

The majority of the languages under threat are Indigenous languages. More than 4,000 languages are spoken by Indigenous peoples worldwide, who represent less than 6% of the global population. It is estimated that one Indigenous language dies every two weeks. Languages are central to the identity of Indigenous peoples and to the preservation of their cultures, worldviews and visions, and they are an expression of self-determination.

Vision

We envision a world in which Indigenous communities have full self-determination and sovereignty over their language and culture. We imagine technology serving the revitalization and thriving of languages as a tool to connect community members, to celebrate their identity, and to transmit their culture and knowledge on the community’s own terms.

Project Description

The FLAIR initiative serves Indigenous communities in their efforts to revitalize their language and culture through technology.

We are building the foundations for Indigenous Voice AI in systems explicitly designed to respect data sovereignty and linguistic self-determination. Our foundational automatic speech recognition (ASR) research aims to develop a method for the rapid creation of custom models for endangered languages. These models can be used for language learning, audio transcription, voice-controlled technology and much more. Furthermore, Voice AI will position Indigenous communities to participate in the Metaverse in their heritage languages and facilitate intergenerational language transmission, a critical factor for language vitality. It will enable inclusive immersive experiences in which Indigenous youth can reconnect with their heritage in culturally meaningful exchanges and activities. 

We are proposing a multifaceted approach that aims to drastically reduce data requirements. Developing ASR for a new language typically requires hundreds of hours of audio data. For most Indigenous languages, collecting this much is infeasible because audio recordings are limited or non-existent and very few speakers remain. In many cases, barely a dozen native speakers are left, or none at all. Those who remain tend to be of advanced age, so large-scale data collection from them is not realistic. When recorded audio does exist, it is often either untranscribed or inaccessible. Thus, a method to decrease the number of hours of audio data required is critical to unlocking the potential of AI for low-resource languages.

The immediate focus of FLAIR is to validate solutions for a specific set of Indigenous languages in North America. Everything we learn and the tools we build will be shared publicly as open source, free to use. We will then scale up to deliver the resulting system for rapid ASR development to Indigenous communities worldwide, where it could help solve similar problems for the thousands of languages used by other under-resourced and underserved communities.


Why first languages AI can be a reality 

Watch FLAIR’s Technical Director, Michael Running Wolf, present his vision for the project at a TEDx event held in Boston.

Resources

Atlas of the World's Languages in Danger
The interactive version of the UNESCO Atlas of the World’s Languages in Danger lists approximately 2,500 languages threatened with extinction.
Could AI help save Indigenous languages?
After growing up with the sounds of Cheyenne, software engineer Michael Running Wolf is now harnessing AI to ensure the survival of this language and others.
The United Nations Permanent Forum on Indigenous Issues
Indigenous languages are not only methods of communication, but also extensive and complex systems of knowledge that have developed over millennia.

In the Media

New technology for Indigenous languages (ITC News)

Other Members

Gigi Davidson
Shankhalika Srikanth
David Huggins-Haines
Farhan Shaikh
Josée Poirier (Mila)
Faith Baca (Indigenous Student)
Ryan Conti (Indigenous Student)
Daniela Ramos Ojeda (Indigenous Student)
Belu Ticona (Indigenous Student)
Dave Malenfant (Alumni)
Kyran Romero (Alumni)

Partners
