Developing a first-of-its-kind conversational Question and Answer (Q&A) system for te reo is the focus of a 2020 Catalyst Strategic – New Zealand Singapore Data Science Research Programme.
The programme, Natural Language Processing for Q&A in Indigenous/Vernacular Languages, is one of four successful projects to have received NZD $3 million from the Ministry of Business, Innovation and Employment’s Catalyst: Strategic Fund.
The programme is being led by Massey University Professor of Artificial Intelligence Ruili Wang from the School of Natural and Computational Sciences, who will work alongside a wider team of researchers including Professor of Māori and Indigenous Education Huia Jahnke and Dr Darryn Joseph from the School of Māori Knowledge. They will work in partnership with the University of Waikato’s Associate Professor Te Taka Keegan and the University of Auckland’s Professor Michael Witbrock.
The aim is to develop an intelligent conversational Q&A system. The team will address fundamental technical challenges in the areas of natural language processing for conversational Q&A, machine translation and speech recognition/synthesis for te reo Māori.
“Recent advances in data science and deep learning for natural language processing offer exciting new possibilities; however, most of this success supports popular languages such as English or Chinese,” Professor Wang says.
“Because indigenous languages like te reo Māori, Fijian and Gaelic (Irish), to name a few, have fewer resources, many of them are in danger of becoming extinct; however, there is increasing interest in preserving and revitalising these cultural treasures.”
He says there is currently no system that combines Māori speech recognition with the ability to respond in te reo, so this research will inform the development of such a system. Once developed, the system will be able to help Māori communities, government agencies and Aotearoa with efforts to continue revitalising and promoting te reo Māori.
“The system will provide a useful tool to be developed at the interface between mātauranga Māori as embedded in te reo Māori and offers a novel approach to language learning and revitalisation efforts.”
Professor Huia Jahnke says the Māori language remains endangered, so revitalisation efforts, restoration of traditional knowledge and the generation of new knowledge through te reo Māori are high priorities.
“The tool has the potential to be applied to real-world issues, for example in the culturally immersive kura kaupapa Māori system of education in Aotearoa, where resources remain limited but where the greatest gains in Māori education, social, cultural, economic and linguistic benefits are demonstrated.”
Massey University Provost Professor Giselle Byrnes says this funding is recognition of Massey’s leadership in partnering with international organisations to advance world-class research that makes a real difference.
“This project will significantly boost our research in artificial intelligence at Massey University. It also demonstrates that we are taking a leading role in Māori speech processing and natural language processing in New Zealand.”
“We consider that the outcomes of this programme will be of benefit and help to support the revitalisation of te reo Māori in Aotearoa.”
Massey University’s School of Natural and Computational Sciences and School of Māori Knowledge are both involved in the programme, alongside other contributors including TVNZ’s Te Karere programme, which will help with data collection for the project by providing clips and corresponding transcriptions.
The Singapore-based research will be led by the Institute for Infocomm Research at the National University of Singapore and has received SGD $1.2 million from the Singapore National Research Foundation to focus on Malay (the indigenous language in Singapore and Malaysia) and Singlish (the English-based creole spoken colloquially in Singapore).
Professor Wang says working with these three languages in particular will be fascinating.
“Linguistically, Malay is of the same language family as Māori (Malayo-Polynesian), and knowledge of Malay is not only important for preserving the nation-state’s heritage but also for it to remain relevant in that region. The New Zealand-Singapore collaboration provides a unique opportunity to study Malay and te reo concurrently as we develop the Q&A system for both indigenous languages.”
The project is due to begin in October 2020 and is expected to conclude in December 2023.