Frontier, the world's largest supercomputer, uses 3,072 AMD GPUs to train a trillion-parameter LLM
Bit News: According to a January 13 report by New Zhiyuan, AMD's hardware and software stack can also train GPT-3.5-class large models.
Frontier, the world's largest supercomputer, housed at Oak Ridge National Laboratory, contains 37,888 MI250X GPUs and 9,472 EPYC 7A53 CPUs. Recently, researchers trained a GPT-3.5-scale model using only about 8% of those GPUs. Working on the ROCm software platform, they broke through many of the difficulties of distributed model training on AMD hardware and built a state-of-the-art distributed training algorithm and framework for large models on it.
This work provides a feasible technical framework for efficiently training LLMs on non-NVIDIA, non-CUDA platforms.
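Distributed training at this scale typically combines tensor, pipeline, and data parallelism, with the product of the three degrees equal to the total GPU count. The sketch below shows that arithmetic; the specific degrees chosen are illustrative assumptions, not the configuration reported by the Frontier team.

```python
# Hybrid-parallelism sketch: the total GPU count factors into
# tensor-parallel x pipeline-parallel x data-parallel degrees.
# The degrees below are illustrative assumptions only.

def parallel_layout(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> dict:
    """Derive the data-parallel degree from the total GPU count and the
    chosen tensor- and pipeline-parallel degrees."""
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel != 0:
        raise ValueError("world size must be divisible by tensor * pipeline degrees")
    return {
        "tensor_parallel": tensor_parallel,
        "pipeline_parallel": pipeline_parallel,
        "data_parallel": world_size // model_parallel,
    }

# Roughly 8% of Frontier's 37,888 MI250X GPUs, matching the headline's 3,072:
layout = parallel_layout(world_size=3072, tensor_parallel=8, pipeline_parallel=12)
print(layout)  # {'tensor_parallel': 8, 'pipeline_parallel': 12, 'data_parallel': 32}
```

Each GPU then holds one shard of one pipeline stage, and gradient all-reduce happens only across the 32 data-parallel replicas, which keeps communication volume manageable on a machine of this size.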
After the training run, the researchers summarized their experience of training large models on Frontier in a paper detailing the challenges they encountered and how they overcame them.