NVIDIA's Blackwell B200 GPUs introduce a brand-new architecture compared to Hopper, but they also consume almost twice as much power.
When NVIDIA CEO Jensen Huang announced Blackwell during the GTC 2024 keynote, the reveal was light on technical and architectural detail. Over the following days of GTC, NVIDIA shared slightly more information, though still without the deep technical dives we are all awaiting. The new details came from Jonah Alben (NVIDIA SVP & GPU Architect) and Ian Buck (NVIDIA VP of Hyperscale & HPC).
To start, we all knew that Blackwell would be a major architectural upgrade over Hopper, and it appears to be even more than that: Jonah stated that Blackwell uses a completely different micro-architecture than Hopper.
What we do know about Blackwell is that it packs the 2nd-generation Transformer Engine, which adds FP4 and FP6 compute formats. These formats, together with new software optimizations, are what make Blackwell the fastest AI chip of its kind on the planet, but that has taken a toll on standard FP64 compute, which has increased by only 32% versus Hopper. The reasoning is simple: Blackwell is an AI chip first, and that is its main target market. FP64 is not that important from an AI perspective, and the lower the precision, the faster the inferencing and training.
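To see why lower precision trades accuracy for speed, consider how few values an FP4 number can actually hold. The sketch below (illustrative only, not NVIDIA's implementation) assumes the E2M1 layout commonly cited for FP4: 1 sign bit, 2 exponent bits, and 1 mantissa bit, which yields just eight distinct magnitudes.

```python
# Illustrative sketch of FP4 (E2M1), assuming the commonly cited layout:
# 1 sign bit, 2 exponent bits (bias 1), 1 mantissa bit.

def fp4_e2m1_values():
    """Enumerate all non-negative magnitudes representable in FP4 (E2M1)."""
    bias = 1
    values = set()
    for exp in range(4):          # 2 exponent bits -> exponents 0..3
        for man in range(2):      # 1 mantissa bit -> 0 or 0.5
            if exp == 0:          # subnormal: 0.m * 2^(1 - bias)
                values.add(man * 0.5 * 2 ** (1 - bias))
            else:                 # normal: 1.m * 2^(exp - bias)
                values.add((1 + man * 0.5) * 2 ** (exp - bias))
    return sorted(values)

def quantize_fp4(x):
    """Round a float to the nearest representable FP4 value (sign-symmetric)."""
    grid = fp4_e2m1_values()
    mag = min(grid, key=lambda v: abs(abs(x) - v))
    return -mag if x < 0 else mag

print(fp4_e2m1_values())   # [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
print(quantize_fp4(2.7))   # 3.0
```

With only eight magnitudes per sign, an FP4 operand needs a quarter of the bits of FP16, so four times as many values fit in the same register and memory bandwidth, which is where the inference speedup comes from.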
Also, the reason for going the chiplet (MCM) route was the need to improve overall performance rather than to improve yields. It will be interesting to see how NVIDIA's first MCM approach works in the field, since we are talking about two GPU dies running on the same package. It's mentioned that CUDA does a fairly good job of handling the two GPUs.
Read more on wccftech.com