WHEN United States tech firm OpenAI rolled out Whisper, a speech recognition tool offering audio transcription and translation into English for dozens of languages including Maori, it set alarm bells ringing for many Indigenous New Zealanders.
Whisper, launched in September by the company behind the ChatGPT chatbot, was trained on 680,000 hours of audio from the web, including 1,381 hours of the Maori language.
Indigenous tech and culture experts say that while such technologies can help preserve and revive their languages, harvesting their data without consent risks abuse, distorting Indigenous culture, and depriving minorities of their rights.
"Data is like our land and natural resources," said Karaitiana Taiuru, a Maori ethicist and an honorary academic at the University of Auckland.
"If Indigenous peoples don't have sovereignty of their data, they will simply be re-colonised in this information society."
OpenAI said in a statement on its website that it collaborated "with industry leaders and policymakers to ensure that AI systems are developed in a trustworthy manner".
Generative artificial intelligence (AI), which learns from mass data sets typically scraped from the web to create text, images, videos and more, has found a wide range of applications, from marketing to education to law.
But alongside these uses, there are growing concerns about plagiarism, unethical sourcing of data, and cultural appropriation.
This is especially true for Indigenous communities, which have a long history of their culture being stolen and appropriated, said Michael Running Wolf, an AI ethicist and Native American who founded the non-profit Indigenous in AI.
"There is a huge commercial incentive to collect our language data for applications like voice AI and large language models. Some large datasets have Indigenous data with unexplained origins.
"Having Indigenous data sovereignty is critical as it allows communities to protect knowledge that is sacred or deeply sensitive, and which may have commercial value, from exploitation," Running Wolf told the Thomson Reuters Foundation.
Many Indigenous languages are under threat of disappearing, the United Nations has warned, taking with them cultures, knowledge and traditions.
In New Zealand, where Maori is enjoying a revival, the government aims to have one million basic speakers by 2040.
That means digital systems using Maori will be rolled out in increasing numbers, said Peter-Lucas Jones, chief executive of Te Hiku Media, a non-profit that runs Maori broadcasts and also archives and promotes the language.
"The development of tools that use generative AI can absolutely assist with the revitalisation and reclamation of Indigenous languages and cultures," said Jones.
But it was "concerning" to see a non-Maori organisation roll out a speech model using their language, he said.
"What we are seeing with these large AI models is that data is being scraped from the Internet with little regard for any bias that could be present in the data, let alone any associated intellectual property rights," he said.
Indigenous leaders were angered when Air New Zealand in 2019 sought to trademark a logo with the words "kia ora" — meaning "hello" or "good health" — highlighting tensions over attempts to co-opt Maori language and culture by outside groups.
Now, there are questions about intellectual property rights over data scraped from the web for use by AI, a legal grey area.
A group of visual artists sued AI artwork generation companies Stability AI, Midjourney and DeviantArt in January, alleging that the companies infringed copyright by creating images in the artists' style.
Stability AI has said its work is protected by the fair use doctrine that allows limited use of copyrighted material.
Critics warn Indigenous groups — who are generally not involved in the design or testing of AI systems — are at risk from bias that can be embedded within algorithms, while generative AI models may also spread incorrect information.
"There are real risks that generative technologies could teach false Indigenous histories and stories, create and re-create biases and make it impossible for Indigenous peoples to reclaim sovereignty of their data," said Maori ethicist Taiuru.
There is growing recognition of the need to protect Indigenous data and knowledge, with the World Trade Organisation outlining measures in 2006 to provide intellectual property protection for "traditional knowledge and folklore".
The writer is from the Reuters news agency.