Ilya Sutskever, co-founder of OpenAI, thinks existing approaches to scaling up large language models have plateaued. For significant future progress, AI labs will need to train smarter, not just bigger, and LLMs will need to think a little bit longer.
Speaking to Reuters, Sutskever explained that the pre-training phase of scaling up large language models, such as ChatGPT, is reaching its limits. Pre-training is the initial phase that processes huge quantities of uncategorized data to build language patterns and structures within the model.
Until recently, adding scale—in other words, increasing the amount of data available for training—was enough to produce a more powerful and capable model. But that's no longer the case; instead, exactly what you train the model on, and how, matters more.
“The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing,” Sutskever reckons. “Scaling the right thing matters more now than ever.”
The backdrop here is the increasingly apparent difficulty AI labs are having in making major advances beyond the power and performance of models like GPT-4, the model behind ChatGPT.
The short version of this narrative is that everyone now has access to the same or at least similar easily accessible training data through various online sources. It's no longer possible to get an edge simply by throwing more raw data at the problem. So, in very simple terms, training smarter not just bigger is what will now give AI outfits an edge.
Another enabler of LLM performance lies at the other end of the process, after the models are fully trained and being accessed by users: the stage known as inference.
Here, the idea is to use a multi-step approach to solving problems and queries in which the model can feed back into itself, leading to more human-like reasoning and decision-making.
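To make the idea concrete, here is a minimal, purely illustrative sketch of that feedback loop. The names `toy_model` and `multi_step_inference` are hypothetical stand-ins invented for this example—`toy_model` fakes an LLM call—not any real API; real reasoning models do something far more sophisticated at each step.

```python
# Hypothetical sketch: instead of answering in a single pass, the model's
# intermediate output is fed back in as context for the next step.

def toy_model(prompt: str) -> str:
    """Toy stand-in for an LLM call: appends one 'reasoning step' per call."""
    steps = prompt.count("Step")
    return prompt + f"\nStep {steps + 1}: refine the partial answer."

def multi_step_inference(question: str, n_steps: int = 3) -> str:
    context = question
    for _ in range(n_steps):
        # Feed the model's own output back in as the next prompt.
        context = toy_model(context)
    return context

result = multi_step_inference("Q: why is the sky blue?")
print(result)
```

The point of the loop is simply that each pass sees everything the model produced so far, so later steps can build on (or correct) earlier ones.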
Read more on pcgamer.com