The AMD-powered Frontier supercomputer, equipped with Instinct MI250X GPUs, has achieved a 1-trillion-parameter LLM training run, rivaling GPT-4.
Frontier is the world's leading supercomputer and the only exascale machine currently in operation. It is powered by AMD's EPYC and Instinct hardware, which not only delivers the top HPC performance but also makes it the second most efficient supercomputer on the planet. A paper submitted to arXiv by researchers reveals that Frontier can train a model with one trillion parameters, helped by "hyperparameter tuning", setting a new industry benchmark.
This is 'just' 3k of the 37k MI250X on Frontier. To steal from Feynman, "There is plenty of room at the top!" -- I'm expecting lots of interesting work scaling out further to tens of thousands of nodes.
— Nicholas Malaya (@nicholasmalaya) January 5, 2024
Before we get to the crux, here is a quick recap of what the Frontier supercomputer holds. It has been designed from the ground up around AMD's 3rd Gen EPYC Trento CPUs and Instinct MI250X GPU accelerators. It is installed at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, where it is operated by the Department of Energy (DOE). It has achieved 1.194 Exaflop/s using 8,699,904 cores. Its HPE Cray EX architecture combines 3rd Gen AMD EPYC CPUs optimized for HPC and AI with AMD Instinct MI250X accelerators and a Slingshot-11 interconnect. Frontier has held on to the number one spot on the Top500.org list of supercomputers, showing its dominance.
The new records achieved by Frontier are a result of implementing effective strategies to train LLMs and use the onboard hardware most efficiently. The team
Read more on wccftech.com