SEEKER: Enhancing Exception Handling in Code with an LLM-Based Multi-Agent Approach

The ability of large language models (LLMs) to generate code has rapidly advanced, but a major challenge remains: ensuring the generated code is robust and reliable. Exception handling is crucial for this robustness, but many developers struggle to implement it correctly, leading to fragile code.

A new paper published at ICLR 2025 tackles this challenge by proposing a multi-agent framework called SEEKER, which leverages LLMs to enhance exception handling practices. The authors identify three key issues with existing approaches:

  1. Insensitive Detection of Fragile Code: LLMs often fail to detect code that is prone to exceptions.
  2. Inaccurate Capture of Exception Types: LLMs may misinterpret the specific exception types that should be handled.
  3. Distorted Handling Solutions: Even when exceptions are correctly identified, LLMs may generate code that handles them in a way that is not consistent with best practices.
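
To make the three issues concrete, here is a small Python sketch. It is not taken from the paper: the function names, the JSON config file, and the default port are all hypothetical, chosen only to show what undetected fragile code, an overly broad exception type, and a distorted handler might look like next to a more careful version.

```python
import json


def load_port_fragile(path):
    # Issue 1: nothing here signals that open(), json.load() and the
    # key lookup can all raise, so the fragility goes undetected.
    with open(path) as f:
        return json.load(f)["port"]


def load_port_distorted(path):
    # Issues 2 and 3: the overly broad type hides what actually went wrong,
    # and swallowing the error leaves the caller with a silent None.
    try:
        with open(path) as f:
            return json.load(f)["port"]
    except Exception:
        return None


def load_port_robust(path, default=8080):
    # Specific exception types, each with a deliberate response.
    # The default value 8080 is an arbitrary choice for this illustration.
    try:
        with open(path) as f:
            return int(json.load(f)["port"])
    except FileNotFoundError:
        # A missing file is treated as expected: fall back to the default.
        return default
    except (json.JSONDecodeError, KeyError, ValueError, TypeError) as exc:
        # A malformed file is a real error: surface it with context.
        raise ValueError(f"invalid config file {path!r}") from exc
```

Whether falling back to a default or re-raising is the right response depends on the application; the point is only that each exception type gets an explicit, reviewable decision.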

To address these issues, SEEKER uses a multi-agent system inspired by expert developer strategies. The system consists of five agents:

  1. Planner: Segments the code into manageable units.
  2. Detector: Identifies potentially fragile code segments using static analysis and scenario/property matching.
  3. Predator: Retrieves relevant exception handling information from a comprehensive external knowledge base called Common Exception Enumeration (CEE).
  4. Ranker: Assigns grades to exceptions based on their likelihood and the suitability of the handling strategies.
  5. Handler: Generates optimized code that incorporates robust exception handling strategies.
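
The flow of data between the five agents can be sketched as a simple pipeline. This is a toy illustration under heavy assumptions, not the authors' implementation: in SEEKER the agents are LLM-driven, whereas here each stage is replaced by a few lines of rule-based Python. The `TOY_CEE` dictionary, the `Candidate` class, the scores, and the threshold are all invented for the example.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the CEE knowledge base: a trigger substring
# mapped to (exception type, handling advice, illustrative score).
TOY_CEE = {
    "open(": ("FileNotFoundError", "fall back to a default or report the path", 0.9),
    "int(": ("ValueError", "validate the input and re-raise with context", 0.7),
    "[": ("KeyError", "check for the key or use .get()", 0.4),
}


@dataclass
class Candidate:
    unit: str
    exception: str
    advice: str
    score: float


def planner(source):
    # Segment the code into units; here, simply the non-empty lines.
    return [line.strip() for line in source.splitlines() if line.strip()]


def detector(units):
    # Flag units containing a known trigger (crude substring matching).
    return [u for u in units if any(t in u for t in TOY_CEE)]


def predator(fragile_units):
    # Retrieve the matching knowledge-base entries for each flagged unit.
    return [
        Candidate(u, *TOY_CEE[t])
        for u in fragile_units
        for t in TOY_CEE
        if t in u
    ]


def ranker(candidates, threshold=0.5):
    # Keep candidates at or above the threshold, highest score first.
    kept = [c for c in candidates if c.score >= threshold]
    return sorted(kept, key=lambda c: c.score, reverse=True)


def handler(ranked):
    # Emit a handling plan rather than rewritten code, to keep the sketch short.
    return [f"{c.unit!r}: catch {c.exception}; {c.advice}" for c in ranked]


def run_pipeline(source):
    return handler(ranker(predator(detector(planner(source)))))
```

Calling `run_pipeline("f = open(path)\nport = int(f.read())")` yields one plan entry per flagged line, ordered by score. Each stage only consumes the previous stage's output, which is the property that lets the real system swap in an LLM-backed agent at any step.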

To evaluate SEEKER, the paper reports extensive experiments comparing it against baselines such as general prompting, Retrieval-Augmented Generation (RAG), and a state-of-the-art method called KPC (Knowledge-driven Prompt Chaining). Across a variety of metrics, SEEKER consistently outperforms these baselines, which the authors attribute to its multi-agent design and its integration of expert knowledge.

For example, SEEKER achieves significantly higher coverage of sensitive code, greater accuracy in identifying the correct exception types, and a much higher Code Review Score, indicating that the generated code is of higher quality and adheres more closely to best practices.

SEEKER’s ability to enhance exception handling practices is particularly important as LLMs become more widely used in software development. By making it easier for developers to generate robust and reliable code, SEEKER can help improve the overall quality of software and reduce the risk of errors.

The authors also outline directions for future research, including applying SEEKER to more complex tasks such as app requirements engineering and adapting it to different programming languages. By incorporating external knowledge and employing a multi-agent approach, SEEKER represents a significant advance in automated exception handling and points toward a future where LLMs can be used more reliably for code generation.