AI Papers Reader

Personalized digests of latest AI research


New AI Paradigm Allows LLM Agents to Continuously Evolve by Learning from Experience

Palo Alto, November 9, 2025 — Autonomous agents powered by Large Language Models (LLMs) have demonstrated incredible reasoning abilities, yet they suffer from a critical flaw: they are static. Once trained, they cannot continuously learn and grow from new experiences encountered during deployment, hindering their path toward true artificial intelligence.

Researchers from Tsinghua University and ByteDance Seed have introduced a novel learning paradigm called FLEX (Forward Learning with Experience) that overcomes this limitation. Instead of relying on computationally expensive, gradient-based parameter updates (backpropagation), FLEX enables LLM agents to continuously evolve by building and refining an “Experience Library” in a gradient-free, forward-learning manner.

Decoupling Knowledge from Parameters

FLEX fundamentally shifts the learning process from tweaking the LLM’s internal weights to distilling real-world interactions into structured, reusable knowledge. When an agent interacts with an environment, it generates problem-solving trajectories (both successes and failures). An auxiliary updater agent then distills these into textual semantics—such as strategic principles, procedural patterns, and common failure warnings—which are stored in a continuously evolving library.
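This distillation loop can be sketched in a few lines. The sketch below is purely illustrative: the paper's updater is itself an LLM agent, so the rule-based `distill` function, the `Experience` fields, and the trajectory dictionary format are all assumptions made for the sake of a runnable example.

```python
from dataclasses import dataclass, field

@dataclass
class Experience:
    """A distilled, reusable piece of textual knowledge."""
    kind: str   # e.g. "principle", "procedure", or "warning"
    text: str

@dataclass
class ExperienceLibrary:
    """Continuously growing store of textual experiences."""
    entries: list = field(default_factory=list)

    def add(self, exp: Experience) -> None:
        self.entries.append(exp)

def distill(trajectory: dict) -> Experience:
    """Toy stand-in for the auxiliary updater agent: in FLEX this
    step is performed by an LLM that summarizes the trajectory into
    textual semantics. Successes become strategic principles;
    failures become warnings."""
    if trajectory["success"]:
        return Experience("principle",
                          f"Strategy that worked: {trajectory['summary']}")
    return Experience("warning",
                      f"Failure mode to avoid: {trajectory['summary']}")
```

Note that nothing here touches model weights: learning is just appending text to the library, which is what makes the process gradient-free.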

During a new task, the LLM agent retrieves the most pertinent textual experience from this library to guide its reasoning. This replaces the "black box" of traditional parameter-based LLM learning with an explicit, transparent knowledge system.
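Retrieval of "the most pertinent" experience can be sketched as a ranking over the library. The paper does not specify its retrieval mechanism, so the word-overlap scoring below is a deliberately naive stand-in (a real system would likely use embedding similarity).

```python
def retrieve(library_entries: list[str], task_description: str, k: int = 2) -> list[str]:
    """Return the k stored experiences most relevant to the task,
    ranked by naive word overlap. Purely illustrative: FLEX's actual
    retrieval method is not detailed in this summary."""
    task_words = set(task_description.lower().split())
    scored = sorted(
        library_entries,
        key=lambda e: len(task_words & set(e.lower().split())),
        reverse=True,
    )
    return scored[:k]
```

The retrieved text is then prepended to the agent's context, so guidance arrives through the prompt rather than through updated weights.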

The empirical results across diverse, challenging scientific domains demonstrate the transformative power of this approach:

Mathematics: On the Olympiad-level AIME25 mathematical reasoning benchmark, FLEX provided dramatic improvements. For instance, the accuracy of the Claude-Sonnet-4 model jumped from a baseline of 40.0% to 63.3%, a 23.3-point absolute gain achieved after learning from only 49 examples.

Chemistry and Biology: FLEX also proved highly effective in specialized fields where generalist LLMs often struggle. On the USPTO50k benchmark for chemical retrosynthesis, FLEX boosted the performance of Claude-Sonnet-4.5 from 20.0% to 30.0% accuracy. In biology, on the ProteinGym benchmark for fitness prediction, FLEX improved the Spearman correlation score for Claude-Sonnet-4 by nearly 14 absolute points, pushing its performance closer to specialized protein language models.

Inheriting Wisdom

A core finding of the research is the exceptional inheritance property of the FLEX experience library. Since knowledge is stored externally, decoupled from the LLM’s parameters, the library functions as a lightweight, “plug-and-play” module that can be instantly transferred between agents.

This allows for the distillation of expertise from a strong, expensive model to a weaker, cheaper one. For example, the experience library generated by the high-performing Claude-Sonnet-4.5 model on the USPTO50k chemistry task was able to boost the performance of the weaker Gemini-2.5-Pro model by an impressive 11 absolute points. This suggests a cost-effective path to enhance an entire fleet of less capable agents by sharing a single, collective knowledge module.
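Because the library is just text, "transferring expertise" reduces to priming a different model with the same experience entries. A minimal sketch of this plug-and-play idea, with the prompt format being an assumption of this example rather than the paper's:

```python
def build_prompt(task: str, shared_library: list[str]) -> str:
    """Assemble a prompt for any agent from a shared experience
    library. Since the knowledge lives in plain text rather than in
    model weights, the same library can prime a strong model or a
    weaker, cheaper one without modification."""
    experience_block = "\n".join(f"- {e}" for e in shared_library)
    return (
        "Relevant experience from prior tasks:\n"
        f"{experience_block}\n\n"
        f"Task: {task}"
    )
```

In the paper's chemistry experiment, a library built by Claude-Sonnet-4.5 was reused this way to lift Gemini-2.5-Pro; the mechanism is simply that both models read the same distilled text.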

Furthermore, the researchers identified a clear scaling law demonstrating that agent performance scales predictably with the size and quality of the accumulated experience library.

By shifting the paradigm from parameter-based optimization to experience-driven evolution, FLEX offers a principled, low-cost framework for enabling AI agents to achieve continuous, scalable, and highly interpretable lifelong learning. This work represents a significant step toward developing AI systems capable of open-ended evolution and collective intelligence.