KUALA LUMPUR: The Science, Technology and Innovation Ministry is exploring the possibility of developing a Malay-language large language model (LLM) through collaborations with local universities and companies.
Deputy Minister Datuk Mohammad Yusof Apdal said the LLM would be developed entirely using local datasets, enabling it to better reflect the nuances of the Malay language and culture.
He added that the ministry is also exploring the development of the LLM with local company Mesolitica Sdn Bhd (Mesolitica), which has expertise in developing its own LLM, the Malaysia Large Language Model (MaLLaM).
"Hence, MIMOS Bhd, an agency under the ministry, and Mesolitica are actively working together to enhance the capability and effectiveness of this model.
"Future collaborations will also involve Dewan Bahasa dan Pustaka (DBP) to enrich the Malay language further," he said during Minister's Question Time in the Dewan Rakyat today.
Yusof said this in response to a question from Lee Chean Chung (PH-Petaling Jaya) on the ministry's plans to develop its own LLM and the steps taken to ensure that local values and culture are not overlooked.
Commenting further, Yusof added that the LLM development will support the local AI ecosystem and reduce dependency on foreign AI technologies.
He also said the LLM can enhance decision-making processes and facilitate research across various sectors by adapting global AI knowledge to local needs.
He said, however, that the model involves significant costs and requires specialised equipment, which is often hosted in cloud-based high-performance computing (HPC) centres.
"Initial hardware investment costs can be reduced by utilising services from cloud providers such as Microsoft Azure, OpenAI and Amazon Web Services.
"However, there are concerns about data security, as data stored using these cloud services would be kept overseas and would be at risk of sensitive data breaches.
"Hence, the ministry, through MIMOS, is currently exploring the use of cost-effective computing infrastructure that can securely process and refine proprietary data to address the issue."