Tech war: China eyes supercomputers for building LLMs amid US sanctions on advanced chips


Leveraging supercomputing technology that China has developed over the past decade could help break the stranglehold of US-led restrictions on the mainland’s AI industry, according to Zhang Yunquan, a researcher at the Institute of Computing Technology under the Chinese Academy of Sciences (CAS), who was quoted in a report on Monday by state-backed tabloid Global Times.
Supercomputing systems designed for training large language models (LLMs) – the technology underpinning generative AI services like ChatGPT – are crucial to replacing power-hungry, data-centre computing clusters, which typically employ from 10,000 to 100,000 graphics processing units (GPUs) for such training, Zhang said at a recent conference, according to the report.
China’s quest to establish a viable, advanced computing platform to train LLMs and develop AI applications shows the urgency of becoming technologically self-sufficient on the mainland, as its AI progress remains hindered by limited GPU choices amid US sanctions that have prevented top GPU firm Nvidia from supplying its most cutting-edge chips to the country.
More enterprises are using data centres – secure, temperature-controlled facilities that house large-capacity servers and data-storage systems – to host or manage computing infrastructure for their artificial intelligence projects. Photo: Shutterstock

“I believe that [building] LLMs are not achieved by simply adding more chips,” CAS academician Chen Runsheng said at the same conference, according to the Global Times report. “They must learn, like the human brain, to lower energy consumption, while improving their efficiency.”

Chen called on China to work on fundamental research for intelligent computing of LLMs, combined with high-performance computing (HPC) technology, to achieve breakthroughs in computing power, the report said. HPC refers to the ability to process data and perform complex calculations at high speeds, which is accomplished by supercomputers containing thousands of compute nodes that work together to complete tasks.

Engineers work at the Wuhan Supercomputer Centre in central Hubei province on May 24, 2023. Photo: AFP

The LLMs developed on the mainland so far are based on models and algorithms developed in the US, without enough consideration of fundamental theories, according to Chen. “If we can make progress in fundamental theory, we will achieve groundbreaking and authentic innovation,” Chen said.


