NVIDIA has officially unveiled its next-gen Blackwell GPU architecture, which delivers up to a 5x AI performance increase over Hopper H100 GPUs.
NVIDIA Blackwell GPUs Feature 5x Faster AI Performance Than Hopper H100, Leading The Charge of Next-Gen AI Computing
NVIDIA has gone official with the full details of its next-generation AI & Tensor Core GPU architecture, codenamed Blackwell. As expected, Blackwell is NVIDIA's first MCM (multi-chip module) design, incorporating two GPU dies in a single package.
- World’s Most Powerful Chip — Packed with 208 billion transistors, Blackwell-architecture GPUs are manufactured using a custom-built 4NP TSMC process, with two reticle-limit GPU dies connected by a 10 TB/second chip-to-chip link into a single, unified GPU.
- Second-Generation Transformer Engine — Fueled by new micro-tensor scaling support and NVIDIA’s advanced dynamic range management algorithms integrated into NVIDIA TensorRT™-LLM and NeMo Megatron frameworks, Blackwell will support double the compute and model sizes with new 4-bit floating point AI inference capabilities.
- Fifth-Generation NVLink — To accelerate performance for multitrillion-parameter and mixture-of-experts AI models, the latest iteration of NVIDIA NVLink® delivers groundbreaking 1.8TB/s bidirectional throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for the most complex LLMs.
- RAS Engine — Blackwell-powered GPUs include a dedicated engine for reliability, availability and serviceability. Additionally, the Blackwell architecture adds capabilities at the chip level to utilize AI-based preventative maintenance to run diagnostics and forecast reliability issues. This maximizes system uptime and improves resiliency for massive-scale AI deployments to run uninterrupted for weeks or even months at a time and to reduce operating costs.
- Secure AI — Advanced confidential computing capabilities protect AI models and customer data without compromising performance, with support for new native interface encryption protocols, which are critical for privacy-sensitive industries like healthcare and financial services.
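The "micro-tensor scaling" behind the second-generation Transformer Engine's 4-bit inference can be illustrated with a simple sketch: groups of tensor values share one scale factor, so that low-precision 4-bit values cover each block's dynamic range. The snippet below is a minimal illustration of that general idea using the FP4 (E2M1) magnitude grid, not NVIDIA's actual implementation; the function name and block size are hypothetical choices.

```python
import numpy as np

# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_blockwise_fp4(x, block=32):
    """Illustrative block-wise quantization: each block of `block` values
    shares one scale so its largest magnitude maps onto the top FP4 value.
    Returns the dequantized values for easy error inspection."""
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block                 # pad so length divides evenly
    blocks = np.pad(x, (0, pad)).reshape(-1, block)
    # One shared scale per block (micro-scaling idea).
    scales = np.max(np.abs(blocks), axis=1, keepdims=True) / FP4_GRID[-1]
    scales[scales == 0] = 1.0               # avoid divide-by-zero on all-zero blocks
    scaled = blocks / scales
    # Snap each magnitude to the nearest FP4 grid point, then restore the sign.
    mag = np.abs(scaled)
    idx = np.abs(mag[..., None] - FP4_GRID).argmin(axis=-1)
    q = np.sign(scaled) * FP4_GRID[idx]
    return (q * scales).reshape(-1)[: len(x)]

vals = np.array([0.01, -0.2, 0.5, 3.7])
print(quantize_blockwise_fp4(vals, block=4))
```

Because the scale is chosen per small block rather than per whole tensor, outliers in one block do not crush the precision of values elsewhere, which is what makes aggressive 4-bit inference viable for large models.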