Nscale has tested AMD's flagship Instinct MI300X AI accelerator using a GEMM tuning framework, achieving up to 7x faster performance.
[Press Release]: In Nscale's latest technical deep dive, we explore critical aspects of AI model optimization: throughput benchmarking, performance tuning, and latency reduction using GEMM (General Matrix Multiplication) tuning.
Maximizing the performance of GPU-accelerated tasks involves more than just raw speed. Optimizing GEMM ensures efficient processing, higher throughput, and the ability to handle complex models and datasets effectively.
In this blog, we will explore the benchmarking of vLLM throughput across multiple models and delve into the significant impact of GEMM tuning. Powerful libraries such as rocBLAS (the ROCm Basic Linear Algebra Subprograms library) and hipBLASLt (a lightweight BLAS library built on HIP, the Heterogeneous-compute Interface for Portability) are instrumental in this process.
These libraries provide optimized implementations of GEMM operations along with a range of tuning parameters, allowing developers to fine-tune their applications and unlock the full potential of their underlying hardware, ultimately maximizing vLLM performance.
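One commonly documented way to drive this kind of tuning for a vLLM workload on ROCm is PyTorch's TunableOp, which benchmarks candidate rocBLAS/hipBLASLt GEMM solutions per problem shape and records the winners for reuse. The sketch below is illustrative, not Nscale's exact procedure; the benchmark script and model placeholder are assumptions.

```shell
# Sketch: GEMM tuning for a vLLM run via PyTorch TunableOp (ROCm).
# The benchmark script and <your-model> placeholder are illustrative.

# 1) Tuning pass: search candidate GEMM solutions and record the winners.
export PYTORCH_TUNABLEOP_ENABLED=1                        # turn TunableOp on
export PYTORCH_TUNABLEOP_TUNING=1                         # allow the search phase
export PYTORCH_TUNABLEOP_FILENAME=tunableop_results.csv   # where winners are saved
python benchmark_throughput.py --model <your-model>

# 2) Deployment pass: reuse the recorded solutions without re-tuning.
export PYTORCH_TUNABLEOP_TUNING=0
python -m vllm.entrypoints.openai.api_server --model <your-model>
```

The two-pass split matters in practice: the search phase adds noticeable startup overhead, so tuning is typically done once offline and the results file shipped with the deployment.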
GEMM tuning is a powerful technique for enhancing the performance of matrix multiplication operations. It involves selecting the most appropriate algorithm for a given problem based on factors such as memory bandwidth, cache behavior, and compute capabilities.
By fine-tuning parameters and selecting optimal algorithms, we ensure the GEMM operation maximizes efficiency in using available computing resources. This translates to significant speed improvements for AI and machine learning models.
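The select-the-fastest-algorithm idea can be shown with a minimal sketch. Real tuners such as rocBLAS and hipBLASLt search hardware-specific kernels per problem shape; here the interchangeable "algorithms" are just NumPy routines, and the `tune_gemm` helper is hypothetical, not part of any library.

```python
import time
import numpy as np

def tune_gemm(m, n, k, candidates, trials=3):
    """Toy illustration of offline GEMM tuning: time each candidate
    implementation on the target (m, n, k) shape and keep the fastest."""
    rng = np.random.default_rng(0)
    a = rng.standard_normal((m, k), dtype=np.float32)
    b = rng.standard_normal((k, n), dtype=np.float32)

    best_name, best_time = None, float("inf")
    for name, fn in candidates.items():
        fn(a, b)  # warm-up so one-time costs don't skew the timing
        times = []
        for _ in range(trials):
            start = time.perf_counter()
            fn(a, b)
            times.append(time.perf_counter() - start)
        elapsed = min(times)  # best-of-N reduces scheduling noise
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name, best_time

candidates = {
    "dot": np.dot,
    "matmul": np.matmul,
    "einsum": lambda a, b: np.einsum("ik,kj->ij", a, b),
}
winner, seconds = tune_gemm(512, 512, 512, candidates)
```

A production tuner does the same thing at kernel granularity, caching the winning solution per shape so inference runs pay none of the search cost.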
Our analysis compared several key
Read more on wccftech.com