Beyond the Brain: Why the Future of AI Lies in Its Scaffolding
For years, the race for artificial intelligence has been a battle of “bigger is better.” Tech giants have competed to build models with more parameters and larger “brains,” assuming that raw processing power would eventually lead to truly autonomous agents. But a landmark new review paper from researchers at institutions including Shanghai Jiao Tong University and Carnegie Mellon University suggests we have been looking in the wrong place.
The paper, titled “Externalization in LLM Agents,” argues that the next leap in AI won’t come from changing the model’s internal “weights,” but from building better infrastructure around them. They call this shift externalization: the process of moving cognitive burdens out of the AI’s internal circuitry and into external tools, memory stores, and protocols.
The Shopping List Theory of Intelligence
To understand this, the authors point to human history and the concept of “cognitive artifacts.” As psychologist Donald Norman famously noted, a shopping list doesn’t make your biological memory larger; it changes the task from a difficult act of recall to a simple act of recognition.
“Cognitive artifacts do not change human capabilities,” the authors note. “They change the task.”
The researchers argue that Large Language Model (LLM) agents are undergoing a similar evolution. Instead of forcing an AI to remember everything perfectly or figure out complex workflows from scratch every time, engineers are building a “Harness”—a sophisticated environment that supports the AI.
Three Pillars of External Intelligence
The paper identifies three specific dimensions where AI is moving its “thinking” outside the box:
- Memory (Externalized State): Rather than relying solely on the “context window” (the AI’s short-term working memory), agents store and retrieve state from external databases.
  - Intuition: Imagine a legal assistant AI. Without externalization, it must try to “remember” 50 prior cases within its immediate view. With externalized memory, it simply “looks up” the relevant case law in a library, turning a test of memory into a test of research.
- Skills (Externalized Expertise): Instead of an AI “improvising” a solution to a problem, it uses a library of pre-validated “skills.”
  - Intuition: A software engineering agent asked to “fix a bug” might have a skill file called DEBUG_WORKFLOW.md. This file contains the exact steps: run the tests, check the logs, and propose a patch. The AI isn’t guessing how to be an engineer; it is following an expert playbook.
- Protocols (Externalized Interaction): This moves the “grammar” of interaction into standardized contracts.
  - Intuition: If an AI wants to use a calculator, it shouldn’t have to guess how to talk to it. Protocols, like the Model Context Protocol (MCP), act like a universal USB port, allowing the AI to “plug in” to any tool instantly and reliably.
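The memory pillar can be made concrete with a toy sketch. Everything below is illustrative (the class, the keyword scoring, the case texts are all invented for this example, not taken from the paper): instead of stuffing all fifty prior cases into the prompt, the agent queries an external store and retrieves only the relevant ones.

```python
class ExternalMemory:
    """A toy keyword-indexed store, standing in for a real vector database."""

    def __init__(self):
        self.documents = []

    def add(self, text):
        self.documents.append(text)

    def lookup(self, query, top_k=2):
        # Score each document by how many query words it shares.
        words = set(query.lower().split())
        scored = sorted(
            self.documents,
            key=lambda doc: len(words & set(doc.lower().replace(",", "").split())),
            reverse=True,
        )
        return scored[:top_k]


memory = ExternalMemory()
memory.add("Case 12: breach of contract, damages awarded")
memory.add("Case 31: patent dispute, injunction denied")
memory.add("Case 47: breach of warranty, claim dismissed")

# The prompt carries only the retrieved precedents, not the whole archive:
# recall becomes retrieval, exactly as the shopping-list analogy suggests.
relevant = memory.lookup("breach of contract damages")
prompt = "Relevant precedents:\n" + "\n".join(relevant)
```

A production agent would swap the keyword overlap for embedding similarity, but the task transformation is the same: the model no longer has to remember, only to recognize what the store returns.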
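The protocol pillar can be sketched the same way. The contract below is a simplified illustration in the spirit of MCP, not its actual wire format (the field names and the `validate_call` helper are assumptions for this example): the tool publishes a machine-readable description, and the agent's side validates every call against it instead of guessing.

```python
# A published tool contract: name, purpose, and expected inputs.
CALCULATOR_CONTRACT = {
    "name": "calculator",
    "description": "Evaluate basic arithmetic",
    "input_schema": {"a": "number", "b": "number", "op": "string"},
}


def validate_call(contract, args):
    """Accept a call only if it supplies exactly the declared inputs."""
    return set(args) == set(contract["input_schema"])


ok = validate_call(CALCULATOR_CONTRACT, {"a": 2, "b": 3, "op": "add"})
bad = validate_call(CALCULATOR_CONTRACT, {"a": 2})  # missing "b" and "op"
```

Because the contract is standardized and discoverable, any agent can plug into the tool without bespoke glue code, which is the "universal USB port" idea in miniature.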
The Rise of “Harness Engineering”
The hero of this new era is the Harness. This is the orchestration layer that coordinates memory, skills, and protocols. The authors argue that a “smart” agent is actually a modest model inside a brilliant harness. This harness manages permissions, observes the AI for errors, and provides a “sandbox” where the AI can work safely.
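The harness's role as gatekeeper and observer can be sketched in a few lines. This is a minimal sketch under assumed names (the tool list, the playbook steps, and the stand-in `call_model` are invented for illustration, not the paper's code): the harness, not the model, owns the permission list and the transcript, and it gates every action before execution.

```python
# Permissions live in the harness, not in the model's weights.
ALLOWED_TOOLS = {"run_tests", "read_logs", "propose_patch"}

# The skill file (think DEBUG_WORKFLOW.md) rendered as an executable playbook.
DEBUG_WORKFLOW = ["run_tests", "read_logs", "propose_patch"]


def call_model(step, history):
    """Stand-in for an LLM call: decide the next tool invocation.

    Here the model simply echoes the playbook step; a real model
    would also fill in arguments and interpret results.
    """
    return step


def harness(playbook):
    transcript = []
    for step in playbook:
        tool = call_model(step, transcript)
        if tool not in ALLOWED_TOOLS:          # gate: block unauthorized actions
            transcript.append(("blocked", tool))
            continue
        transcript.append(("executed", tool))  # observe: log every action taken
    return transcript


# A rogue extra step is blocked; the sanctioned playbook runs in order.
log = harness(DEBUG_WORKFLOW + ["open_network_socket"])
```

The auditable transcript and the permission gate are the point: the model supplies judgment at each step, while reliability and safety come from the scaffolding around it.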
This shift has profound implications. It means we can build highly reliable, specialized agents without needing the world’s largest supercomputers. By externalizing the “hard parts” of cognition, we make AI more auditable, more steerable, and—most importantly—more capable of solving the long-horizon tasks that currently baffle standalone models.
As the authors conclude, the goal is no longer just to build a better reasoner, but to build a better “organized cognitive system.” The future of AI isn’t just a bigger brain; it’s a better-equipped workstation.