For an industry built on predicting the future, artificial intelligence has proven remarkably bad at predicting its own limitations. In November 2024, OpenAI—the company that launched the AI boom with ChatGPT—discovered that its next-generation model Orion showed only modest improvements over its predecessor. As reported by The Information, the new model’s “increase in quality was far smaller compared with the jump between GPT-3 and GPT-4,” despite massive additional investment in computing power and training data.
This shouldn’t surprise anyone who has been paying attention to the gap between AI’s promises and its performance. The entire field has been operating on what amounts to a collective faith: that if you make AI models bigger and feed them more data, they will inevitably become more capable. OpenAI CEO Sam Altman exemplified this belief when he promised that his company’s next model would improve over GPT-4 by the same dramatic margin that GPT-4 had improved over GPT-3.
The new model’s improvements center mainly on marginally better reasoning and reduced hallucinations—cases where the AI model generates plausible-sounding but factually incorrect information with high confidence. While Orion shows enhanced abilities in tasks like mathematical reasoning and coding, these gains are incremental rather than revolutionary. The system still operates on the same principle of predicting likely responses based on training data, just with more parameters and better fine-tuning. The pattern matches what venture capitalist Ben Horowitz recently observed about the latest generation of AI models: despite increasing computing power, “we’re not getting the intelligent improvements at all.”
This plateau gives us a chance to step back and evaluate some of the industry’s underlying assumptions more objectively. The assumption that progress in artificial intelligence follows predictable “scaling laws” appears to be less a fundamental principle than a temporary phenomenon—one that may have captured a brief period of rapid advancement rather than an eternal truth. This realization raises important questions about the foundations of modern AI, with its hundred-billion-dollar valuations and ambitious promises of artificial general intelligence (AGI). Companies that have based their business models and valuations on continued exponential improvements may need to substantially revise their expectations and adapt their strategies as the limitations of current approaches become clearer.
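It is worth being clear about what a “scaling law” actually is: an empirical curve fit, not a law of nature. A widely cited form, from Kaplan and colleagues’ 2020 OpenAI study, models a network’s test loss as a power law in its parameter count, with constants fitted to past training runs and nothing guaranteeing that the trend continues. A minimal sketch of that relationship:

```latex
% Empirical scaling law (Kaplan et al., 2020): test loss L falls as a
% power law in parameter count N. N_c and \alpha_N are constants fitted
% to past training runs -- an extrapolation from experiments, not a guarantee.
L(N) \approx \left( \frac{N_c}{N} \right)^{\alpha_N}
```

Even taken at face value, a power law like this promises diminishing returns: each further drop in loss requires multiplying the model’s size. “Just make it bigger” was always a bet, not a principle.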
Large Language Models (LLMs) like GPT-4 are pattern-matching machines that predict what words should come next in a sequence, based on statistical correlations gleaned from massive amounts of human-written text. When they appear to engage in conversation or write essays, they’re executing this same basic operation over and over—analyzing patterns in their training data to generate statistically likely responses. All of this is very far from artificial general intelligence—roughly the difference between a “magic trick” and genuine “magic,” with one highly unlikely to produce the other.
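To make the pattern-matching description concrete, here is a minimal sketch of next-token prediction, the one operation an LLM repeats to produce text. The toy corpus and bigram counts below are invented for illustration; a real model replaces the count table with billions of learned parameters and works on sub-word tokens rather than whole words, but the loop is the same: look at the context, score candidate continuations, pick a likely one, repeat.

```python
import random
from collections import Counter, defaultdict

# A toy corpus and a bigram-count table stand in for a real model's
# billions of learned parameters; everything here is invented for illustration.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count how often each word follows each other word (wrapping around so
# every word has at least one possible continuation).
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:] + corpus[:1]):
    next_word_counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Sample a next word in proportion to how often it followed `word`
    in the training text -- statistical prediction, not understanding."""
    candidates = next_word_counts[word]
    words, weights = zip(*candidates.items())
    return random.choices(words, weights=weights)[0]

# "Generation" is just this one operation repeated over and over.
word = "the"
output = [word]
for _ in range(8):
    word = predict_next(word)
    output.append(word)

print(" ".join(output))
```

Nothing in that loop plans, reasons, or checks facts; it only reproduces the statistics of its training text. Scaling the table up to a trillion-parameter neural network makes those statistics vastly richer, but it does not change what the operation is.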
The industry’s response to hitting this technological wall has been telling. The evidence that scaling has reached its limits is mounting: bigger models no longer yield proportionally bigger advances. Training runs for these massive models cost tens of millions of dollars, require hundreds of chips running simultaneously, and often face hardware failures during months-long processes. More fundamentally, these models have begun to exhaust the world’s easily accessible training data.
The very need for new approaches reveals the extent of the problem: OpenAI’s most recent model, OpenAI o1 (with a Pro version released on December 5th), relies on “test-time compute”—a technique that improves performance not by making the model bigger, but by giving it multiple attempts to work through problems before delivering an answer. While the Pro version shows improved reliability on certain benchmarks, particularly in mathematics and coding, these gains come from optimization rather than fundamental advances in the model’s capabilities. Google’s eye-catching Gemini 2.0—billed as an introduction to the “agentic era” of AI—is best understood as another such clever tweak. Somewhat ironically, a faith in tweaking has emerged as a new kind of scaling—a belief that an endless series of clever optimizations can somehow deliver the exponential progress that bigger models failed to achieve.
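The basic idea behind test-time compute is easy to sketch. OpenAI has not published how o1 works internally, so the snippet below illustrates only the general family of techniques it belongs to: sample several attempts at a problem and keep the answer they converge on, an approach often called best-of-N or self-consistency sampling. The `ask_model` function is a hypothetical stand-in for any language-model call.

```python
from collections import Counter

def ask_model(question: str, attempt: int) -> str:
    """Hypothetical stand-in for one sampled response from a language model.
    In a real system this would be an API call with nonzero sampling
    temperature, so each attempt can come out differently."""
    canned_outputs = ["42", "42", "41", "42", "40"]  # invented for illustration
    return canned_outputs[attempt % len(canned_outputs)]

def answer_with_test_time_compute(question: str, n_attempts: int = 5) -> str:
    """Spend extra compute at inference time: sample several independent
    attempts at the same problem and return the most common answer.
    The model itself is unchanged -- only the work done per query grows."""
    attempts = [ask_model(question, i) for i in range(n_attempts)]
    most_common_answer, _count = Counter(attempts).most_common(1)[0]
    return most_common_answer

print(answer_with_test_time_compute("What is 6 * 7?"))  # prints "42"
```

The trade-off is the point: reliability improves because more work is spent on each query, not because the underlying model has become more capable.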
This sort of cognitive dissonance extends beyond individual companies to encompass entire market structures. Anthropic’s CEO Dario Amodei recently predicted AI model training costs could reach $100 billion next year. Such astronomical figures suggest not just technical ambition but a kind of institutional momentum—the belief that massive investment will overcome current limitations. And that says more about industry mindset than it does about AI technology.
This moment of technological plateau reveals something deeper about Silicon Valley’s influence on how we collectively think about progress. The industry has skillfully promoted a narrative in which every technological limitation is temporary, every problem solvable with sufficient computing power, and every critique dismissible as a failure of imagination. This narrative has proved remarkably effective at attracting investment—OpenAI’s $86+ billion valuation being just one example—but leaves something to be desired in producing real advances and profitable commercial products.
The broader lesson here isn’t that artificial intelligence is worthless—the technology clearly has valuable applications in drug discovery, weather forecasting, and scientific research, even in its current form. Rather, it’s that we need to fundamentally reassess how we evaluate technological promises. The AI industry has benefited from a perfect storm of factors that enabled overpromising: technical complexity that discouraged detailed scrutiny, financial incentives that rewarded hype, and a media environment that initially amplified rather than investigated extraordinary claims.
This misalignment between public perception and technical reality has real consequences. Many people interact with AI tools believing they’re engaging with something approaching human-level intelligence, when in fact they’re using a sophisticated pattern-matching system. That makes it easy to place more trust in these systems than their actual capabilities warrant. In a 2023 DeepMind paper, researchers proposed six levels of AGI, ranked by the proportion of skilled adults that a model can outperform. Current AI technology has reached only the lowest of those levels. The gap between current capabilities and the higher levels suggests that fundamental advances, not just technical optimizations, may be needed.
Barring dramatic advances, the AI industry faces a moment of truth. As the gap between promises and reality becomes harder to ignore, companies will need to choose between maintaining increasingly implausible narratives about exponential progress and acknowledging the more modest but still valuable role their technology might actually play in society. That choice will reveal whether Silicon Valley has learned anything from its history of boom-and-bust cycles, or whether we’re doomed to repeat this pattern of technological overreach and disillusionment.
The irony is that by continuing to recklessly promise artificial general intelligence, the AI industry risks overshadowing the remarkable tools it has actually created. A more honest assessment of both capabilities and limitations might have led to more sustainable development and more useful applications. Instead, we’re left with impressive statistical models wrapped in promises of artificial general intelligence—but the nature of the current architecture makes fulfilling such promises unlikely.
Nick Potkalitsky writes about artificial intelligence and education on his Substack Educating AI. An AI researcher and educator with a Ph.D. in new media, narrative, and rhetoric, he is the co-author, with Mike Kentz, of AI in Education: A Roadmap to a Teacher-Led Transformation.