Jais is a 13-billion-parameter bilingual model developed by G42’s Inception Institute in partnership with Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and Cerebras Systems. It was trained on the Condor Galaxy AI supercomputer on a dataset of 116 billion Arabic tokens and 279 billion English tokens, with the aim of bringing the value of generative AI to the Arab world.
Jais’ release marks a significant milestone for AI in the Arabic-speaking world. Developed in the UAE’s capital, Abu Dhabi, the model offers more than 400 million Arabic speakers the opportunity to harness the potential of generative AI. It is intended to facilitate and accelerate innovation, and it highlights Abu Dhabi’s position as a hub for AI, innovation, cultural preservation, and international collaboration.
Jais is a transformer-based large language model that incorporates several recent techniques. ALiBi position embeddings enable the model to extrapolate to inputs much longer than those seen in training, improving context handling and accuracy. SwiGLU activations and maximal update parameterization improve the model’s training efficiency and accuracy.
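For readers unfamiliar with these techniques, the following is a minimal Python sketch of ALiBi attention biases and a SwiGLU gate, based on their published descriptions. It is not taken from the Jais codebase: the function names (`alibi_slopes`, `alibi_bias`, `swiglu`) are hypothetical, the slope formula assumes a power-of-two number of attention heads, and the SwiGLU function omits the learned linear projections that would normally produce its two inputs.

```python
import math


def alibi_slopes(num_heads):
    """Per-head slopes as a geometric sequence: 2^(-8/n), 2^(-16/n), ..."""
    # Assumption: this sketch only handles power-of-two head counts.
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Causal bias added to attention scores for one head.

    Entry [i][j] is -slope * (i - j) when key j is at or before query i,
    and -inf for future positions, which masks them out after softmax.
    """
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]


def swish(x):
    """Swish (SiLU) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu(gate, value):
    """Element-wise SwiGLU gate: swish(gate) * value.

    In a real feed-forward layer, `gate` and `value` would each come from a
    separate learned linear projection of the same input.
    """
    return [swish(g) * v for g, v in zip(gate, value)]


if __name__ == "__main__":
    slopes = alibi_slopes(8)
    print(slopes[:2])
    print(alibi_bias(3, slopes[0]))
    print(swiglu([0.0, 1.0], [5.0, 2.0]))
```

Because the ALiBi penalty depends only on the distance between query and key, the same rule applies to positions beyond any length seen in training, which is what allows extrapolation to longer inputs.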
Jais’ training, fine-tuning, and evaluation were undertaken by a joint Inception/MBZUAI team on Condor Galaxy 1 (CG-1), the recently announced AI supercomputer co-developed by G42 and Cerebras Systems. The 13-billion-parameter open-source model was trained on a purpose-built dataset of 116 billion Arabic tokens designed to capture the complexity, nuance, and richness of Arabic. The dataset also included 279 billion English tokens, aimed at improving the model’s performance through cross-lingual transfer. Inception and MBZUAI will continue to expand and refine Jais as its user community grows.
“Our strategic partnership with G42 is already delivering pioneering results. A few weeks ago, we introduced the first multi-exaFLOP AI supercomputer, Condor Galaxy 1 (CG-1). Now, the partnership delivers another key breakthrough: the leading Arabic LLM for the open-source community,” said Andrew Feldman, co-founder and CEO of Cerebras Systems. “At Cerebras our passion is building groundbreaking technology. One of the great rewards is seeing the innovative ways it is used. Jais is a significant contribution to the international open-source community. It is also a testament to how incredibly easy CG-1 is to use and how it enables extremely rapid AI model development.”
Today, Inception sits at the intersection of the academic, business, and regulatory realms to unlock synergies, foster collaboration, and accelerate the commercialization of AI across industries.