The honest answer: we don't know. One camp says scaling adds patterns, never understanding. Another says emergence shows we don't grasp what scaling produces. A third is exploring entirely different architectures. What is clear is that current LLMs are not on an obvious path to general intelligence, and that for business the question may not matter (see our opinion piece, "If it quacks, it's a duck").
School 1 — "It can't" (LeCun, Marcus, Chollet)
LLMs are autocomplete at scale. They have no model of the world — no physical causality, no goals, no genuine grasp of consequences. Scaling adds more patterns; it doesn't add understanding. You can build a larger dictionary, but a dictionary still isn't a mind. The architectural objection: transformers predict the next token, hold no persistent state between conversations, can't learn from experience after training, and aren't grounded in the physical world. These may be structural limits, not engineering to-dos.
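To make the "autocomplete" framing concrete, here is a deliberately tiny sketch of next-word prediction from counted patterns. It is an illustration only: real LLMs use neural networks over sub-word tokens and long contexts, not word-pair counts, and the function names `train_bigram` and `predict_next` are hypothetical, invented for this example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy training text (an assumption for illustration, not real training data).
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)

print(predict_next(model, "the"))  # a word that followed "the" most often in the corpus
print(predict_next(model, "dog"))  # "dog" never appeared, so there is no pattern to continue
```

The toy model continues patterns it has counted and has nothing to say about a word it never saw. The "it can't" camp argues that scaling this objective up changes the quantity of patterns, not the kind of thing being learned; the other camps dispute how far that analogy holds.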
School 2 — "Maybe it can"
Emergence, meaning capabilities that appear suddenly and unpredictably at scale, suggests we may not understand what scaling really produces. Chain-of-thought reasoning wasn't explicitly designed in; it showed up in sufficiently large models once they were prompted to reason step by step. Perhaps something resembling understanding could emerge too. This isn't a claim that it will. It's an argument for humility about confident "impossible" verdicts.
School 3 — "It needs a different architecture"
Real research directions, worth naming:
| Direction | The idea |
|---|---|
| World models | AI that builds internal models of how the world works (LeCun's own bet), not just what words follow words |
| Neuromorphic computing | Chips mimicking biological neurons — spiking, timing, energy efficiency |
| Quantum computing | Fundamentally different computation — though nowhere near useful for AI yet |
| Fractal / recursive | Self-similar structures of the kind found in nature; whether AI should mirror them is an open question |
| Bio-digital hybrids | Lab-grown neurons wired to computational systems — early research |
The structural objection, plainly
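Today's models have no persistent memory between sessions, no post-training learning, no embodied experience, no grounding in reality. The "it can't" camp says these aren't bugs to patch with compute; they're properties of the architecture. Their most-cited evidence is ARC-AGI, a benchmark of novel abstract-reasoning puzzles on which plain LLMs long scored near zero. Newer systems that spend extra compute reasoning at test time have scored far higher on its first version, so the evidence is contested rather than settled.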
Why the question matters practically
Whether AI "really understands" decides exactly one thing: the boundary of what you can trust it with. Inside its training distribution, understanding is irrelevant, because the output works. Outside it, on genuinely novel, high-stakes or unfamiliar problems, the lack of real understanding is precisely where it fails. That's your trust line, and the subject of our opinion piece, "If it quacks, it's a duck."
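The inside-versus-outside distinction can be shown with a toy that has nothing to do with language: a straight line fitted to a curve over a narrow range. This is a sketch under stated assumptions, not a model of how LLMs fail; the curve, the training range, and the helper names `fit_line` and `error_at` are all chosen here purely for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training distribution": the curve sin(x), sampled only between 0 and 1.
train_xs = [i / 10 for i in range(11)]
train_ys = [math.sin(x) for x in train_xs]
slope, intercept = fit_line(train_xs, train_ys)


def error_at(x):
    """Absolute gap between the fitted line and the true curve at x."""
    return abs((slope * x + intercept) - math.sin(x))


inside_error = error_at(0.5)   # a point inside the range the fit has seen
outside_error = error_at(5.0)  # a point far outside that range

print(f"error inside the training range:  {inside_error:.3f}")
print(f"error outside the training range: {outside_error:.3f}")
```

Inside the range, the line is a good enough stand-in for the curve that the question of whether it "understands" sine never comes up. Far outside it, the same fit is badly wrong, with no signal that anything changed. That is the shape of the trust line described above.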
Keep reading: the honest AGI debate and the quick definitional guide, AI vs AGI.