KAIST researchers say new GPU tech doubles AI service speed


A GPU deployed with the Compute Express Link (CXL) developed by KAIST [KAIST]

Researchers from KAIST and chip design startup Panmnesia have more than doubled a GPU's memory processing speed, aiming to address the memory limitations of the advanced GPUs used to train AI.
 
The team, led by Professor Jung Myoung-soo, developed Compute Express Link (CXL) technology that can run AI services 2.36 times faster than existing GPU memory expansion techniques, KAIST said Monday.
 
CXL is a protocol that enables high-speed, high-capacity data transfer between processors, such as GPUs, and memory. Panmnesia's CXL-Enabled AI Accelerator was unveiled at CES 2024 in January, while its new CXL-GPU was unveiled last week.
 
The research will be presented at the upcoming USENIX Annual Technical Conference in Santa Clara, which begins Wednesday.
 
Companies commonly train AI systems across multiple GPUs to obtain the memory capacity required, significantly increasing the cost of developing any given model.
 
CXL, however, enables the GPU to access external memory in the same way it accesses internal memory, significantly increasing the capacity available in a more cost-effective manner. It essentially allows CPUs and GPUs to “share” memory without copying or moving data.
 
The KAIST team identified performance slowdowns in CXL-GPU devices when processing data from memory chips. It therefore developed a technology that allows the memory chips to autonomously read stored data without waiting for the main processor to complete the transaction, and to return data to the GPU more quickly.
 
“This technology can accelerate the market opening of CXL-GPU, significantly reducing the memory expansion costs for Big Tech companies operating large-scale AI services,” Jung said in a statement. 
 
The successful adoption of this technology, however, will depend on how easily Panmnesia's solution can be integrated into existing hardware and whether GPU developers are on board with the standard.

BY LEE JAE-LIM [lee.jaelim@joongang.co.kr]