A new Arabic large language model developed in Abu Dhabi aims to bring one of the world’s most widely spoken languages into the AI mainstream. Jais, an open-source bilingual Arabic-English model, was built by Inception, a division of Abu Dhabi AI company G42, together with the Mohamed bin Zayed University of Artificial Intelligence and Silicon Valley-based Cerebras Systems.
According to the organizations, it is more accurate than other LLMs for Arabic currently in use, such as Falcon from Abu Dhabi’s Technology Innovation Institute, Llama 2 from Facebook parent Meta Platforms, and Bloom from machine learning platform Hugging Face’s BigScience Workshop.
Andrew Jackson, chief executive of Inception, said the launch of Jais is a further step in drawing the science and technology sectors’ attention to non-English LLMs, comparable to efforts under way in Japan and India.
The Ministry of Foreign Affairs, the Ministry of Industry and Advanced Technology, the Department of Health – Abu Dhabi, ADNOC, Etihad Airways, FAB, and e&, the technology conglomerate formerly known as Etisalat, are among the public and private organizations in the UAE that have signed on as Jais launch partners. Jais was trained on 116 billion Arabic tokens and 279 billion English tokens using Condor Galaxy, dubbed the “world’s largest AI supercomputer” and unveiled in July by G42 and Cerebras.
Tokens are the basic units of text or code that an LLM uses to process and generate language, and they serve as the model’s building blocks. According to WorldData, more than 400 million people speak Arabic, making it one of the most widely spoken languages in the world.
Mr. Jackson said Jais would help increase that number. “We’re in charge of a project to gather more Arabic data from offline sources. As a result, this has already begun in earnest, and this is the first strategy we’ll use to promote Arabic,” he said.
“We are also investigating fresh approaches to synthesize Arabic, translate existing Arabic into other languages, and enhance Arabic conversion. We have a long way to go, but I believe we need to be really hopeful and make significant progress.” Although businesses have long used AI, generative AI, made famous by Microsoft-backed OpenAI’s ChatGPT, has given the technology a huge boost.
Companies’ efforts would be aided by the availability of LLMs, especially as developers continue to advance the capabilities of AI. “Speed performance is important to developers,” Mr. Jackson said. “It allows data scientists and ML researchers to quickly bring up and iterate on different models. It also lets them bring new models to the community, into production, or to the market more quickly.”