A Japanese Language Model for Generative AI
A team of researchers from the Tokyo Institute of Technology, Fujitsu Ltd., and other institutions has developed a large language model (LLM) called Fugaku-LLM, designed for generative artificial intelligence (AI) applications. The model stands out for its focus on the Japanese language: 60% of its training data is Japanese text.
Fugaku-LLM was trained on the Fugaku supercomputer, which was jointly developed by Fujitsu and the government-backed research institute Riken. The supercomputer's high performance allowed the researchers to train the model on a massive dataset, giving it strong Japanese language capabilities.
One of the key advantages of Fugaku-LLM is its transparency and safety. Unlike Japanese language models built through continual learning, that is, by further training an existing model, Fugaku-LLM was trained from scratch on a curated dataset that excludes harmful content. Because the full training process is known, the researchers can account for what the model has learned, which supports its responsible use.
Another significant aspect of Fugaku-LLM is that it was trained on central processing units (CPUs) rather than graphics processing units (GPUs). This is a departure from the norm in language model development, where GPUs are typically preferred because they handle highly parallel workloads faster. The researchers optimized Fugaku's communication performance to train the model efficiently on CPUs, avoiding the constraints imposed by GPU scarcity.
Professor Rio Yokota of the Tokyo Institute of Technology emphasized the team's ability to overcome challenges and achieve self-reliance in developing Fugaku-LLM. The model's source code has been made publicly available on Fujitsu's website, further promoting transparency and collaboration in the field of AI research.
Fugaku-LLM is expected to play a significant role in advancing generative AI research tailored to the needs of Japan. Its high Japanese language proficiency and emphasis on safety and transparency make it a valuable tool for developing responsible and culturally relevant AI applications.