IKEA: A New AI Agent Learns When to Ask for Help, Reducing AI Hallucinations
Large Language Models (LLMs) are powerful but prone to “hallucinations,” or generating incorrect information. Retrieval-Augmented Generation (RAG) attempts to fix this by letting LLMs search external knowledge sources. However, current RAG systems often overuse external search, querying even when the LLM already knows the answer; this wastes computation and risks letting erroneous retrieved passages override correct internal knowledge.
Now, researchers from the Chinese Academy of Sciences and the Beijing Academy of Artificial Intelligence have introduced a new approach called Reinforced Internal-External Knowledge Synergistic REasoning Agent (IKEA). IKEA tackles this problem by teaching an LLM to discern its own knowledge boundaries, prioritizing its internal knowledge and only resorting to external search when necessary.
Imagine you are answering a trivia question. You might immediately know the capital of France (Paris) – that’s internal knowledge. But if asked about a niche historical event, you’d likely need to consult a search engine – external knowledge. IKEA aims to mimic this intelligent decision-making process in LLMs.
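To make the idea concrete, here is a minimal sketch of such a decide-then-act loop. The `<answer>`/`<search>` tag format, the `llm` and `search` callables, and the loop structure are illustrative assumptions, not the paper’s implementation:

```python
from typing import Callable

def answer_adaptively(
    question: str,
    llm: Callable[[str], str],     # hypothetical: prompt -> model output
    search: Callable[[str], str],  # hypothetical: query -> retrieved passages
    max_searches: int = 3,
) -> str:
    """Answer from internal knowledge when possible, searching only when
    the model signals that its own knowledge is insufficient."""
    context = ""
    for _ in range(max_searches):
        out = llm(f"Question: {question}\nRetrieved so far: {context}")
        if out.startswith("<answer>"):
            # The model trusts its internal (or accumulated) knowledge.
            return out.removeprefix("<answer>").removesuffix("</answer>")
        # Otherwise it asked for help: run the search, extend the context.
        query = out.removeprefix("<search>").removesuffix("</search>")
        context += search(query) + "\n"
    # Retrieval budget exhausted: force a final answer.
    return llm(f"Question: {question}\nRetrieved: {context}\nAnswer now.")
```

The point of the loop is that retrieval is an action the model chooses, not a step that always runs; the RL training described next shapes when that choice is made.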
The IKEA system trains the LLM with reinforcement learning (RL). Its key component is a “knowledge-boundary aware reward function,” which rewards correct answers while penalizing unnecessary searches: the model is encouraged to rely on internal knowledge when it suffices and to retrieve only when it falls short. For instance, if IKEA answers a well-known fact correctly without searching, it gets a high reward. If it answers incorrectly without searching, it is penalized heavily. If it answers incorrectly but at least attempted a search, the penalty is smaller, since the model correctly sensed a gap in its knowledge.
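A hedged sketch of what such a reward might look like, following the outcomes described above (the specific coefficients are illustrative guesses, not the paper’s values):

```python
def knowledge_boundary_reward(correct: bool, num_searches: int) -> float:
    """Reward shape implied by the examples above: correctness dominates,
    and searches are treated as a cost. Coefficients are illustrative."""
    if correct:
        # Full reward for a self-contained correct answer; each retrieval
        # shaves off a small efficiency penalty, floored above zero.
        return max(1.0 - 0.1 * num_searches, 0.1)
    if num_searches > 0:
        # Wrong despite searching: penalized, but less harshly, since the
        # model at least recognized a gap in its internal knowledge.
        return -0.5
    # Wrong and never searched: the worst case, penalized most heavily.
    return -1.0
```

During RL training, a scalar like this would score each rollout, pushing the policy toward confident direct answers on familiar facts and toward retrieval on unfamiliar ones.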
The research team also created a “knowledge-boundary aware training dataset.” This dataset contains a balanced mix of questions: some that the LLM is likely to know internally, and others that require external search. This balanced training helps the LLM learn to adaptively leverage both internal and external knowledge.
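One plausible way to construct such a dataset is to probe the base model’s internal knowledge before training. The sketch below samples the model several times per question and splits on accuracy; the majority-vote threshold and the `probe_model` callable are assumptions, and the paper’s exact probing protocol may differ:

```python
import random
from typing import Callable

def build_balanced_dataset(
    qa_pairs: list[dict],               # each: {"question": ..., "answer": ...}
    probe_model: Callable[[str], str],  # hypothetical: question -> model answer
    k: int = 4,                         # probes per question
    n_per_group: int = 1000,
) -> list[dict]:
    """Split questions into likely-known vs. likely-unknown by probing the
    model without retrieval, then mix the two groups evenly."""
    known, unknown = [], []
    for qa in qa_pairs:
        # If the model answers correctly in a majority of k tries with no
        # retrieval, the fact likely lies inside its knowledge boundary.
        hits = sum(probe_model(qa["question"]) == qa["answer"] for _ in range(k))
        (known if hits > k // 2 else unknown).append(qa)
    random.shuffle(known)
    random.shuffle(unknown)
    # An even mix teaches the policy both to answer directly and to search.
    return known[:n_per_group] + unknown[:n_per_group]
```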
The researchers tested IKEA on several knowledge-intensive reasoning tasks and found that it significantly outperforms existing methods: it delivered more accurate answers, cut the number of unnecessary retrievals, and generalized robustly to previously unseen data. For example, compared with “Search-R1,” a naive reinforcement-learning baseline, IKEA substantially reduced the number of searches while improving performance.
A critical aspect of IKEA’s success lies in its ability to avoid knowledge conflicts, a common problem where erroneous retrieved knowledge overrides accurate internal knowledge. By better understanding its own knowledge limits, IKEA can judiciously use external information, minimizing the risk of introducing noise.
The research team notes several limitations. The model-probing technique used to map knowledge boundaries and build the training data may not apply to every scenario; the reward function’s effectiveness hinges on hyperparameters that must be tuned through an often computationally intensive process; and RL training itself remains resource-intensive, leaving room for further efficiency improvements.
Despite these limitations, IKEA represents a significant step toward more efficient and reliable retrieval-augmented LLMs, with the potential to reduce hallucinations and improve overall performance.