NHN aims high with AI data center powered by Nvidia's H100 GPUs
Published: 25 Mar. 2024, 17:49
Updated: 25 Mar. 2024, 17:58
GWANGJU — Across 3,200 square meters (34,445 square feet) of space in Gwangju, some 1,000 Nvidia H100 graphics processing units (GPUs) are at work, rumbling and generating enough heat to catch in the throat of anyone passing through the racks.
NHN Cloud, one of Korea’s major cloud providers, built the two-story data center for advanced AI training together with the Gwangju city government and the Ministry of Science and ICT. The project is designed to give small- and mid-sized enterprises and startups a chance to use the hotly pursued processors without too great a financial burden, in a bid to foster the burgeoning industry. Companies that meet certain conditions can use the facility for free.
The facility began operating last November.
The company is jumping on the generative AI bandwagon that is driving the surging demand for data centers.
“NHN Cloud formed the largest GPU farm in Korea [from its three centers],” said the company's CEO, Kim Dong-hoon, at a press event on Thursday after giving a tour of the center. “You might wonder why this is important. If you take a look at recent articles from other global AI companies, what’s actually crucial to advancing the industry is whether you can actually build the environment for development. That is correlated to how many GPUs you have, and this resource allocation for GPUs is becoming a core aspect of AI development.”
The Gwangju city government poured 95 billion won ($70.7 million) into the center, which delivers up to 99.5 petaflops of computing performance, or the ability to perform 99.5 quadrillion floating-point operations per second. Up to 107 petabytes of data can be stored there.
Of the 99.5 petaflops, 77.3 petaflops are produced by Nvidia, 11.2 petaflops come from British chipmaker Graphcore, and 11 petaflops are from Korea’s AI chip designer, Sapeon.
Sapeon’s latest X330 neural processing unit (NPU) chips are scheduled to be deployed in the Gwangju facility from August.
With the center at the heart of the industrial complex, Gwangju aims to turn the area into a hub for high-tech industries, a goal that is also in line with the Korean government’s effort to decentralize data centers away from the capital area.
The press was able to get a peek into one of the two computing rooms where Nvidia’s H100, A100 and V100 GPUs were clustered.
Journalists were handed earplugs in case of sound sensitivity before entering. Inside, the black GPU racks roared with the combined noise of the machines and the cooling system at work. The corridor itself was only lukewarm, but passing between the rows of racks, a wave of heat blew past hard enough to throw back this journalist’s hair.
“GPUs take a lot more to operate; they cause more noise and heat compared to a typical data center,” explained an NHN Cloud spokesperson. “The H100 is the loudest.”
The room held 140 racks, each with a maximum power capacity of 15 kilowatts. The other room, which will be filled with Sapeon’s X330 chips, will hold 120 racks with a maximum power capacity of 8.8 kilowatts each.
“On average, domestic data centers can supply up to 4.8 kilowatts per rack, and that rises to 10 kilowatts for facilities opening between 2024 and 2025, according to a report from the Korea Data Center Efficiency Association,” said NHN Cloud Director Youn Yong-soo. “Ours far exceeds that average.”
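For a rough sense of scale, those per-rack figures imply the following total loads. The sketch below is back-of-the-envelope arithmetic based only on the rack counts and limits reported above; the totals are the writer’s own calculation, not numbers provided by NHN Cloud.

# Illustrative totals from the per-rack figures cited above (not NHN Cloud's own data).
H100_ROOM_RACKS, H100_ROOM_KW_PER_RACK = 140, 15.0    # first computing room
X330_ROOM_RACKS, X330_ROOM_KW_PER_RACK = 120, 8.8     # second room, to host Sapeon X330 chips
DOMESTIC_AVG_KW_PER_RACK = 4.8                        # industry average cited by Youn

h100_room_kw = H100_ROOM_RACKS * H100_ROOM_KW_PER_RACK    # 2,100 kW at maximum draw
x330_room_kw = X330_ROOM_RACKS * X330_ROOM_KW_PER_RACK    # 1,056 kW at maximum draw

print(f"First room peak load:  {h100_room_kw:,.0f} kW")
print(f"Second room peak load: {x330_room_kw:,.0f} kW")
print(f"H100 rack density vs. domestic average: {H100_ROOM_KW_PER_RACK / DOMESTIC_AVG_KW_PER_RACK:.1f}x")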
CEO Kim stressed the importance of maintenance control for data centers, an area where NHN Cloud has built up experience by operating its Pangyo and Pyeongchon branches in Gyeonggi.
“Why is this important? Well, AI hardware products still have some instability, and as you may have felt when you visited data centers, they generate a lot of heat and require a strong breeze for cooling. So there are not many companies out there that can provide stable services in this area.
“Although AI companies know a lot about algorithm development and services, when it comes to controlling and optimizing hardware, they are actually quite weak.”
The computing room is cooled by wall-type cooling units (WCUs), which supply cold air to the racks bidirectionally to cool the heat-generating servers. The room temperature is kept stable by separating the flows of hot and cold air so that they do not mix.
Five chillers each able to generate 1,000 kilowatts of cooling power are located on the center’s rooftop.
In case of a power outage, there are also four emergency generators that each have a capacity of 2,000 kilowatts, enough to supply the computing rooms for up to 27 hours without outside support.
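Combining the cooling and backup figures gives a rough sense of the margins involved. The sketch below is illustrative arithmetic from the numbers reported above, not from NHN Cloud’s engineering specifications.

# Illustrative check of the cooling and emergency-power figures cited above.
CHILLERS, CHILLER_KW = 5, 1_000            # rooftop chillers and cooling capacity of each
GENERATORS, GENERATOR_KW = 4, 2_000        # emergency generators and capacity of each
BACKUP_RUNTIME_HOURS = 27                  # stated runtime without outside support

cooling_kw = CHILLERS * CHILLER_KW               # 5,000 kW of cooling capacity
backup_kw = GENERATORS * GENERATOR_KW            # 8,000 kW of emergency generating capacity
backup_kwh = backup_kw * BACKUP_RUNTIME_HOURS    # about 216,000 kWh if run flat out for 27 hours

print(f"Total cooling capacity: {cooling_kw:,} kW")
print(f"Total backup capacity:  {backup_kw:,} kW")
print(f"Energy over 27 hours at full output: {backup_kwh:,} kWh")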
The Gwangju city government currently leases the center’s infrastructure, which allows companies using the facility to access the resources at lower cost. When that lease expires next year, NHN Cloud expects the facility to become more profitable.
“I anticipate that the infrastructure in Gwangju will be able to generate 50 billion won in annual revenue,” Kim said. “Currently, the lease price for an Amazon Web Services cluster consisting of eight H100s is 90 million won monthly. The average market price is half of that, and since we have 1,000 of them, that’s roughly how much revenue we think the center will generate.”
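Kim’s 50 billion won figure can be roughly reconstructed from the numbers he cites. The sketch below is the writer’s own arithmetic under stated assumptions, namely that the GPUs are rented in clusters of eight at about half the quoted AWS price and at less than full utilization; it is not NHN Cloud’s actual pricing model.

# Rough reconstruction of the 50-billion-won annual revenue estimate (illustrative assumptions only).
AWS_MONTHLY_PER_8_GPU_CLUSTER = 90_000_000                         # won, figure quoted by Kim
MARKET_MONTHLY_PER_CLUSTER = AWS_MONTHLY_PER_8_GPU_CLUSTER / 2     # "the average market price is half of that"
TOTAL_H100_GPUS = 1_000
GPUS_PER_CLUSTER = 8

clusters = TOTAL_H100_GPUS / GPUS_PER_CLUSTER                      # 125 rentable clusters
ceiling_annual_won = clusters * MARKET_MONTHLY_PER_CLUSTER * 12    # about 67.5 billion won at full utilization

# Kim's 50-billion-won estimate is roughly 74 percent of that ceiling,
# consistent with assuming less than full, year-round utilization.
print(f"Annual revenue at full utilization: {ceiling_annual_won / 1e9:.1f} billion won")
print(f"Implied utilization for 50 billion won: {50e9 / ceiling_annual_won:.0%}")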
BY LEE JAE-LIM [[email protected]]