The AI Architect: How 'AgentConductor' Builds the Perfect Coding Team on the Fly
When a professional software team tackles a project, it doesn’t use the same workflow for every task. A simple bug fix might need only a single developer and a quick review, while a complex high-frequency trading algorithm demands a coordinated dance between architects, security experts, and testers.
Until now, artificial intelligence has struggled with this kind of organizational flexibility. Most multi-agent systems—where several Large Language Models (LLMs) collaborate—rely on “fixed topologies.” This means they use the same rigid communication map regardless of whether they are solving a middle-school math problem or a Google-level coding challenge. This leads to a “too many cooks in the kitchen” problem for easy tasks (wasting money and time) and a “not enough brainpower” problem for hard ones.
A new paper from researchers at Shanghai Jiao Tong University and Meituan introduces AgentConductor, a system designed to solve this by acting as a dynamic AI architect. Instead of following a pre-set script, AgentConductor builds and evolves a custom “team structure” for every specific problem it encounters.
Scaling the Team to the Task
The core of the system is an “orchestrator” agent. When a user submits a coding prompt, the orchestrator first assesses its difficulty. It then writes a communication plan in YAML, a human-readable configuration format, which dictates which specialized agents—such as a “Planner,” “Coder,” or “Debugger”—should talk to each other and in what order.
To understand the intuition, imagine two scenarios:
- The Easy Task: For a simple request to “reverse a string,” AgentConductor might generate a “lean” team. It activates a single Coder and a Tester. The communication is a straight line.
- The Competition-Level Task: For a complex algorithmic challenge, the orchestrator builds a “dense” team. It might launch a Searcher to find relevant libraries, a Planner to outline the logic, and multiple Coders working in parallel, all feeding their work into a Debugger and a final Tester.
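The two plans above might look something like the following. This is a hypothetical schema for illustration; the paper's actual YAML format, agent names, and edge notation may differ.

```yaml
# Lean plan for the easy task ("reverse a string")
agents: [Coder, Tester]
edges:
  - Coder -> Tester          # a single straight-line handoff
---
# Dense plan for the competition-level task
agents: [Searcher, Planner, Coder_1, Coder_2, Debugger, Tester]
edges:
  - Searcher -> Planner      # relevant libraries feed the outline
  - Planner -> Coder_1       # two coders work in parallel
  - Planner -> Coder_2
  - Coder_1 -> Debugger      # both drafts converge on the debugger
  - Coder_2 -> Debugger
  - Debugger -> Tester       # final verification
```

The key point is that the plan is just data: the orchestrator can emit a graph of any shape, and downstream machinery wires the agents together accordingly.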
Learning from Failure
What makes AgentConductor truly unique is its ability to “evolve” the team mid-task. If the first attempt at code fails a test, the orchestrator doesn’t just try again; it looks at the specific error message and rewrites the team’s communication graph. If the code failed because of a “Time Limit Exceeded” error, it might bring in an “Algorithmer” to optimize the logic for the next round.
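The evolution step can be sketched as a function from (current graph, error message) to a revised graph. In the real system the orchestrator is an LLM that rewrites the whole YAML plan after reading the failure trace; the hard-coded rules and agent names below are purely illustrative assumptions.

```python
# Hypothetical sketch of error-driven topology evolution.
# Each edge is a (sender, receiver) pair of agent names.

def evolve_team(edges: list[tuple[str, str]], error: str) -> list[tuple[str, str]]:
    """Return a revised communication graph based on the failure mode."""
    edges = list(edges)
    if "Time Limit Exceeded" in error:
        # Too slow: route the Coder's draft through an Algorithmer first.
        edges = [(a, "Algorithmer") if (a, b) == ("Coder", "Tester") else (a, b)
                 for a, b in edges]
        edges.append(("Algorithmer", "Tester"))
    elif "Wrong Answer" in error:
        # Incorrect output: insert a Debugger before the Tester.
        edges = [(a, "Debugger") if b == "Tester" else (a, b) for a, b in edges]
        edges.append(("Debugger", "Tester"))
    return edges

plan = [("Coder", "Tester")]
print(evolve_team(plan, "Time Limit Exceeded on test 7"))
# -> [('Coder', 'Algorithmer'), ('Algorithmer', 'Tester')]
```

Because the graph is rebuilt per round, the team only grows when a failure demonstrates that more specialists are needed.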
This behavior was refined using Reinforcement Learning (RL). The researchers trained the orchestrator using a method called Group Relative Policy Optimization (GRPO). Essentially, the AI was rewarded not just for getting the code right, but for doing so with the simplest possible team. This created a “density-aware” system that avoids unnecessary “chatter” between AI agents, which is often a major source of errors and high costs.
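The "density-aware" idea can be made concrete with a small sketch: reward correctness, subtract a penalty proportional to how densely connected the team is, then compute GRPO's group-relative advantage by standardizing rewards within a group of rollouts for the same prompt. The penalty weight `lam` and the exact reward form are assumptions for illustration, not the paper's formula.

```python
# Hypothetical density-aware reward in the spirit of GRPO training.
# correctness: 1.0 if the generated code passes all tests, else 0.0.
# density: fraction of possible agent-to-agent edges actually used.

def reward(correctness: float, density: float, lam: float = 0.2) -> float:
    # lam is an assumed penalty weight; leaner teams score higher when both pass.
    return correctness - lam * density

def grpo_advantages(rewards: list[float]) -> list[float]:
    """Group-relative advantage: standardize rewards within one prompt's group."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    return [(r - mean) / std for r in rewards]

# Two rollouts that both solve the task: the leaner team earns the
# higher advantage, so the orchestrator learns to prune idle agents.
a_dense, a_lean = grpo_advantages([reward(1.0, 0.8), reward(1.0, 0.2)])
```

Under a reward like this, extra "chatter" between agents is only worth its cost when it actually flips a failure into a pass.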
Breakthrough Results
The results are striking. Across three major competition-level coding datasets, AgentConductor outperformed the previous best systems by up to 14.6% in accuracy. Even more impressively, it did so while being incredibly efficient. Because it prunes away unnecessary agents for easier tasks, it reduced token costs by a staggering 68% compared to traditional fixed-graph systems.
As AI continues to move toward more complex, multi-step reasoning, AgentConductor suggests that the secret to better performance isn’t just bigger models—it’s smarter management. By learning how to organize itself, AI is finally beginning to work like a real engineering team.