AI Papers Reader

Personalized digests of latest AI research

Inside the Mind of an AI: Researchers Map the Neural Circuits Behind Logical Reasoning

Large Language Models (LLMs) can write sonnets, debug code, and pass bar exams, but how they actually “think” remains a profound mystery. When an AI solves a complex riddle, is it genuinely reasoning, or just mimicking statistical patterns?

Now, researchers at the Japan Advanced Institute of Science and Technology (JAIST) have peeked under the hood of these digital minds. In a new paper, they map the specialized “circuits” that act as the engine for logical deduction, circuits that account for just 3% of a model’s attention heads.

To understand what they found, imagine solving a simple puzzle:

  • Fact 1: You have a key.
  • Rule 1: If you have a key, you can open the box.
  • Rule 2: If you open the box, you will find a gold coin.
  • Question: Can you find a gold coin?

To answer “yes,” you must perform “graph traversal”: tracing a path from the starting fact (the key) through a chain of rules to the final target (the coin). An LLM typically tackles this kind of problem with “Chain-of-Thought” prompting, writing out its reasoning step by step.
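The puzzle can be written down as a tiny graph search. The sketch below is only an illustration of what “graph traversal” means here, not the researchers’ code: the function name `find_path` and the way facts and rules are encoded as strings are assumptions made for this example.

```python
from collections import deque

def find_path(facts, rules, goal):
    """Breadth-first search from the known facts to the goal.

    `rules` maps a premise to the conclusions it licenses ("if X, then Y").
    Returns the chain of statements leading to the goal, or None.
    """
    queue = deque([fact] for fact in facts)  # each queue entry is a path so far
    seen = set(facts)
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for conclusion in rules.get(path[-1], []):
            if conclusion not in seen:
                seen.add(conclusion)
                queue.append(path + [conclusion])
    return None  # no chain of rules reaches the goal

facts = ["have key"]
rules = {
    "have key": ["open box"],        # Rule 1
    "open box": ["find gold coin"],  # Rule 2
}
path = find_path(facts, rules, "find gold coin")
print(" -> ".join(path))  # have key -> open box -> find gold coin
```

Each hop in the returned path corresponds to one reasoning step the model would write out in its chain of thought.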

The researchers discovered that while the AI writes transition words like “and” or “then” with near-total confidence, it hesitates at critical decision points. These “uncertain tokens” occur precisely when the model has to make logical choices: selecting the correct premise (e.g., choosing “key” over irrelevant facts), deciding which rule to apply, and knowing when to stop searching.
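One common way to put a number on this kind of hesitation is the entropy of the model’s next-token distribution. The sketch below assumes that measure purely for illustration; this digest does not say which uncertainty measure the paper uses, and the probabilities and the 0.5-bit threshold are made up.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def uncertain_tokens(steps, threshold=0.5):
    """Return the emitted tokens whose next-token distribution was high-entropy."""
    return [token for token, dist in steps if entropy_bits(dist.values()) > threshold]

# Hypothetical (emitted token, next-token probabilities) pairs.
steps = [
    ("then", {"then": 0.99, "so": 0.01}),                # transition word: confident
    ("key",  {"key": 0.55, "stone": 0.30, "map": 0.15}), # choosing a premise: unsure
    ("and",  {"and": 0.98, "so": 0.02}),                 # transition word: confident
    ("open", {"open": 0.60, "stop": 0.40}),              # apply a rule or stop: unsure
]
flagged = uncertain_tokens(steps)
print(flagged)  # ['key', 'open']
```

In this toy data, the flagged positions are the ones where a logical choice is being made, which is the pattern the researchers report.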

Using a technique called Causal Mediation Analysis, the team traced the flow of information during these moments of hesitation. They systematically corrupted parts of the prompts, for example by changing “key” to “stone,” and measured which of the model’s “attention heads” (the computational units that weigh how relevant each word is to the others) were responsible for the resulting change in behavior.
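The core move can be sketched on a toy model. This is a minimal illustration of the patching idea behind causal mediation analysis, not the paper’s setup: the three “heads,” their weights, and the function names are all hypothetical, and a real analysis would patch activations inside a transformer rather than a three-entry dictionary.

```python
# Hypothetical weights: how strongly each toy head feeds the "yes" score.
WEIGHTS = {"fact_reader": 2.0, "rule_reader": 1.0, "filler": 0.5}

def head_activations(tokens):
    """Toy 'heads': each one looks for a specific word in the prompt."""
    return {
        "fact_reader": 1.0 if "key" in tokens else 0.0,
        "rule_reader": 1.0 if "box" in tokens else 0.0,
        "filler": 0.5,  # ignores the prompt entirely
    }

def model_output(acts):
    """Score for answering 'yes': a weighted sum of head activations."""
    return sum(WEIGHTS[name] * value for name, value in acts.items())

def patching_effects(clean_acts, corrupt_acts):
    """For each head, copy its clean activation into the corrupted run
    and record how much of the output that single patch restores."""
    baseline = model_output(corrupt_acts)
    effects = {}
    for name in clean_acts:
        patched = dict(corrupt_acts)
        patched[name] = clean_acts[name]
        effects[name] = model_output(patched) - baseline
    return effects

clean = head_activations(["key", "box", "coin"])
corrupt = head_activations(["stone", "box", "coin"])  # "key" swapped for "stone"
effects = patching_effects(clean, corrupt)
print(effects)  # {'fact_reader': 2.0, 'rule_reader': 0.0, 'filler': 0.0}
```

Only the head that reads the corrupted fact shows a nonzero effect, which is how this style of analysis singles out the components that carry a particular piece of information.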

The results revealed a highly organized, assembly-line structure within the neural network:

  1. The Information Gatherers: In the early-to-middle layers of the model, specialized “reading heads” extract facts and rule definitions from the prompt.
  2. The Middle Managers: In the middle layers, heads validate whether rule conditions are met.
  3. The Executives: In the highest layers, decision-making heads integrate this information, applying abstract traversal strategies (like searching step-by-step) to select the correct next step.

To test whether these circuits are truly responsible for logic, the researchers conducted a “knockout” experiment. They turned off just 3% of the model’s attention heads, specifically the ones identified as the deductive circuit.

The results were dramatic. On logic-heavy benchmarks like ProntoQA and ProofWriter, the models’ reasoning abilities collapsed, leaving them no better than random guessing. Crucially, their general knowledge performance (tested on the MMLU benchmark) remained virtually unchanged.
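Mechanically, a knockout of this kind amounts to zeroing the output of the chosen heads and re-measuring each task. The toy below illustrates only that procedure. It is constructed by assumption so that three hypothetical heads out of 100 feed the “logic” readout while every head contributes equally to the “knowledge” readout, so its numbers illustrate the measurement, not the paper’s results.

```python
def readout(head_outputs, weights, knocked_out=frozenset()):
    """Weighted sum of head outputs, with knocked-out heads replaced by zero."""
    return sum(
        w * (0.0 if i in knocked_out else out)
        for i, (out, w) in enumerate(zip(head_outputs, weights))
    )

N_HEADS = 100
CIRCUIT = {12, 47, 83}  # hypothetical head indices: 3 of 100 heads = 3%

head_outputs = [1.0] * N_HEADS
# Toy assumption: only the circuit heads feed the logic readout,
# while all heads contribute equally to the knowledge readout.
logic_w = [1.0 if i in CIRCUIT else 0.0 for i in range(N_HEADS)]
knowledge_w = [1.0] * N_HEADS

logic_before = readout(head_outputs, logic_w)
logic_after = readout(head_outputs, logic_w, knocked_out=CIRCUIT)
knowledge_before = readout(head_outputs, knowledge_w)
knowledge_after = readout(head_outputs, knowledge_w, knocked_out=CIRCUIT)

print(logic_before, logic_after)          # 3.0 0.0
print(knowledge_before, knowledge_after)  # 100.0 97.0
```

Removing the three circuit heads wipes out the logic signal while the knowledge signal drops by only 3%, the same qualitative contrast the knockout experiment reports.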

This modularity suggests that logical reasoning is not a fuzzy, model-wide phenomenon, but the job of a specialized, sparse set of tools inside the AI. Identifying these circuits is a major milestone toward building “explainable AI,” paving the way for models that are not only smarter but whose logic we can inspect, repair, and trust.