AMD's launch event for its new MI300X AI platform was backed by some of the biggest names in the industry, including Microsoft, Meta, OpenAI and many more. Those three giants all said they planned to use the chip.
AMD's CEO Lisa Su bigged up MI300X, describing it as the most complex chip the company has ever produced. All told, it contains 153 billion transistors, significantly more than the 80 billion of Nvidia's all-conquering H100 GPU.
It's also built from no fewer than 12 chiplets on a mix of 5nm and 6nm nodes using what AMD says is the most advanced packaging in the world. The base layer of the chip is four big IO dies including Infinity Cache, PCIe Gen 5, HBM interfaces, and Infinity Fabric. On top of that are eight CDNA 3 accelerator chiplets or "XCDs" for a total of 1.3 petaflops of raw FP16 compute power.
Flanking those stacked dies on either side are eight HBM3 memory stacks for a total of 192GB of memory. So, yeah, this thing is a monster.
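If you fancy sanity-checking that capacity figure, the maths is simple enough. Here's a quick Python sketch, assuming 24GB-per-stack HBM3 parts (our assumption based on publicly available HBM3 specs; AMD only quotes the 192GB total):

    # Sanity check on MI300X memory capacity.
    # 24GB per stack is an assumption based on available HBM3 parts;
    # AMD's presentation only quotes the 192GB total.
    hbm3_stacks = 8
    gb_per_stack = 24

    print(f"Total HBM3: {hbm3_stacks * gb_per_stack}GB")  # -> Total HBM3: 192GB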
Overall, AMD claims MI300X has 2.4 times the memory capacity, 1.6 times the memory bandwidth and 1.3 times the raw compute power of Nvidia's H100. In its own AI training and inference benchmarks, AMD claims MI300X is around 1.2 times faster than the H100.
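Those multipliers are easy enough to reproduce from the spec sheets. A minimal sketch, assuming the commonly quoted H100 SXM figures of 80GB, 3.35TB/s and roughly 0.99 petaflops of dense FP16 (our numbers, not AMD's slide):

    # Rough check of AMD's claimed MI300X-vs-H100 multipliers,
    # using publicly listed specs (assumed here, not taken from AMD's deck).
    mi300x = {"memory_gb": 192, "bandwidth_tbs": 5.3, "fp16_pflops": 1.3}
    h100 = {"memory_gb": 80, "bandwidth_tbs": 3.35, "fp16_pflops": 0.99}  # SXM, dense FP16

    for key in mi300x:
        print(f"{key}: {mi300x[key] / h100[key]:.1f}x")
    # -> memory_gb: 2.4x, bandwidth_tbs: 1.6x, fp16_pflops: 1.3x

The multipliers fall out exactly as AMD claims, which suggests the headline comparisons are straight spec-sheet ratios rather than measured results.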
In reality, so long as MI300X is broadly competitive on the hardware side, the finer details of how it compares probably don't matter all that much, because software support will arguably matter more. Nvidia's CUDA platform is far better supported thus far than AMD's ROCm, so the latter has much to prove.
It's worth noting that Nvidia employs more software engineers than hardware engineers, which speaks volumes about where the company sees the value.
All that said, AMD did emphasise that its memory capacity advantage lets larger AI models run on fewer GPUs.