As Ferris astutely observed, life moves pretty fast in chatbot land. So, forget that ancient news about DeepSeek's cheap and powerful LLM tanking Nvidia's share price. Because here comes another Chinese tech giant, Alibaba, with its own new AI model that surpasses the lot. Well, it does according to Alibaba.
Qwen 2.5-Max, for it is he (she? they? take your pick), was released today, according to Reuters, and with it came some pretty bombastic claims.
"Qwen 2.5-Max outperforms...almost across the board GPT-4o, DeepSeek V3 and Llama-3.1-405B," Alibaba says. Notably, that's DeepSeek V3, not DeepSeek R1, which is the updated model that helped wipe $600 billion from Nvidia's market value in a day.
Still, those are among OpenAI and Meta's most advanced models. So, if Alibaba's claims are true, Qwen 2.5-Max is no slouch.
Reuters notes that the Alibaba release plays into a wider price war under way in China over access to AI models. It was an earlier DeepSeek model that moved Alibaba to announce massive 97% price cuts for access to its AI models.
For now, it's unclear how resource-intensive Qwen 2.5-Max is. The thing that really rattled the markets when it comes to DeepSeek R1 arguably isn't its outright performance, but rather claims that it was trained on just $6 million worth of slightly hobbled Nvidia H800 GPUs, a small fraction of the cost associated with the huge GPU arrays used by the likes of OpenAI and Meta to train their models.
It's also emerged that the full-precision DeepSeek R1 model can run on just $6,000 of PC hardware, that cost mostly being eaten up by lots of memory but without the need for a megabucks Nvidia GPU.
So, the mere fact that Alibaba has a competitive model isn't earth-shattering news. But the questions of the hardware used and the costs involved are intriguing.
There may also be an extent to which this first species of
Read more on pcgamer.com