Closing the Last Mile of Code Generation with MGDebugger

🔊

💬 Ask

Large language models (LLMs) have made significant strides in code generation, but the generated code often contains subtle errors that require human intervention to fix. Existing LLM-based debugging systems treat programs as monolithic units, failing to address bugs at various levels of granularity.

A research paper published in the arXiv preprint repository introduces MGDebugger, a hierarchical code debugger that decomposes problematic code into a tree structure of subfunctions. Each level represents a specific granularity of error, ranging from low-level syntax errors to high-level algorithmic flaws. During debugging, MGDebugger analyzes each subfunction and iteratively resolves bugs in a bottom-up manner.

For example, imagine you have a code snippet that aims to find the focus of a parabola. The code may contain a simple error in the formula for calculating the y-coordinate of the focus. MGDebugger would first decompose the code into subfunctions, one for calculating the x-coordinate and one for calculating the y-coordinate. Then, it would analyze each subfunction separately, identifying the error in the y-coordinate calculation and providing a fix.

The paper demonstrates that MGDebugger significantly outperforms existing debugging methods, achieving an 18.9% improvement in accuracy over seed generations in HumanEval and a 97.6% repair success rate in HumanEval-Fix. This improvement is attributed to the effectiveness of MGDebugger’s hierarchical debugging strategy, which allows it to address bugs across different categories and difficulty levels.

This research provides a promising solution to the “last mile” challenge in code generation, paving the way for more reliable and robust AI-assisted coding tools. By tackling the issue of subtle errors in a systematic and hierarchical manner, MGDebugger helps to bridge the gap between LLM-generated code and human-level code quality.

AI Papers Reader

Personalized digests of latest AI research

Closing the Last Mile of Code Generation with MGDebugger

Chat about this paper