AMD's Instinct MI300X AI accelerators have made their first appearance in MLPerf Inference v4.1, tested alongside next-gen EPYC "Turin" CPUs.
Today, AMD is sharing the first performance benchmarks of its latest data center and AI-centric hardware in MLPerf Inference v4.1, a suite of standardized workloads designed to showcase the potential of the latest and upcoming hardware from tech giants such as AMD, Intel & NVIDIA.
The red team is making its first MLPerf submissions with the Instinct MI300X accelerator since the chip was introduced, while also giving us a taste of the upcoming EPYC "Turin" CPUs, the 5th Gen server lineup based on the Zen 5 core architecture.
For the performance evaluation, AMD submitted four results to MLPerf v4.1, all running its Instinct MI300X AI accelerators in a Supermicro AS-8125GS-TNMR2 system: two under the Offline scenario and two under the Server scenario. One pair of tests was conducted with the 4th Gen EPYC "Genoa" CPUs and the other with the upcoming 5th Gen EPYC "Turin" CPUs.
Looking at the Llama2-70B results, AMD achieved 21,028 tokens/s in the Server scenario and 23,514 tokens/s in the Offline scenario with the EPYC "Genoa" CPUs, while the 5th Gen EPYC "Turin" CPUs paired with the same Instinct configuration deliver 22,021 tokens/s in Server and 24,110 tokens/s in Offline. That marks a 4.7% (Server) and 2.5% (Offline) improvement over the Genoa platform.
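Those uplift figures follow directly from the published tokens/s numbers; here is a minimal Python sketch of the calculation (the variable names are illustrative, not AMD's):

    # Published MLPerf Inference v4.1 Llama2-70B throughput (tokens/s)
    genoa = {"Server": 21_028, "Offline": 23_514}  # MI300X + 4th Gen EPYC "Genoa"
    turin = {"Server": 22_021, "Offline": 24_110}  # MI300X + 5th Gen EPYC "Turin"

    for scenario in genoa:
        uplift = (turin[scenario] / genoa[scenario] - 1) * 100
        print(f"{scenario}: +{uplift:.1f}% over Genoa")
    # Server: +4.7% over Genoa
    # Offline: +2.5% over Genoa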
Compared to the NVIDIA H100, the Genoa-based Instinct MI300X configuration is slightly slower in the Server scenario, and the gap grows in the Offline scenario. The Turin-based configuration does end up about 2% faster in the Server scenario but still lags in the Offline scenario.