New Framework Aims to Improve How Language Models Follow Instructions
A team of researchers from Fudan University, Lenovo Research, and Tencent has developed a new framework for evaluating and improving how well large language models (LLMs) follow instructions. It addresses a limitation of existing benchmarks, which often rely on simple, templated instructions and so fail to capture the diversity and complexity of real-world user requests.
The core of the new framework is a multi-dimensional scheme for defining constraints: the specific requirements that an LLM’s output must satisfy. It categorizes constraints along three key dimensions (a minimal data model follows the list):
- Pattern: How the constraint is presented (e.g., through examples, lists, or integrated directly into the instruction).
- Category: The type of constraint, such as content (keywords, punctuation), format (XML, JSON), language, or length.
- Difficulty: The number and complexity of constraints in a single instruction, ranging from a single constraint (Level I) to multiple interacting constraints (Level IV).
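To make the taxonomy concrete, here is a minimal Python sketch of one way these dimensions might be represented as a data model. The class and enum names are illustrative assumptions, not identifiers from the paper, and difficulty is approximated here by constraint count alone, whereas the paper also weighs constraint complexity.

```python
from dataclasses import dataclass
from enum import Enum

class Pattern(Enum):
    EXAMPLE = "example"              # constraint conveyed through sample outputs
    LISTING = "listing"              # constraints enumerated as a list
    INCORPORATION = "incorporation"  # constraints woven into the instruction text

class Category(Enum):
    CONTENT = "content"    # e.g., required keywords, punctuation
    FORMAT = "format"      # e.g., XML, JSON, tables
    LANGUAGE = "language"  # e.g., answer in a specific language
    LENGTH = "length"      # e.g., word or sentence count bounds

@dataclass
class Constraint:
    category: Category
    text: str  # natural-language statement of the requirement

@dataclass
class Instruction:
    base: str                      # the raw user request
    pattern: Pattern               # how the constraints are presented
    constraints: list[Constraint]  # attached requirements

    @property
    def difficulty(self) -> int:
        """Difficulty level (I-IV), approximated here by constraint count."""
        return min(len(self.constraints), 4)
```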
To illustrate, consider asking an LLM: “What is the largest animal?” This seemingly simple instruction can be augmented with constraints of increasing difficulty; a sketch of programmatic checkers for these constraints follows the list:
- Level I: “What is the largest animal? The answer must be in capitalized form, with the first letter of each word capitalized.”
- Level II: (Building on Level I) “The answer must include the keyword ‘largest’. The answer must end with a period.”
- Level III: (Building on Level II) “If the answer includes a table, it must be presented with a maximum of 3 rows.”
- Level IV: (Building on Level III) “The answer must be between 10 and 20 words.”
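Constraints like these are attractive precisely because they can be verified by simple rules. The sketch below implements rule-based checkers for the four levels above; the function names and the markdown-table heuristic are assumptions made for illustration, not the paper’s verifiers.

```python
# Rule-based checkers for the four example constraints above. Function names
# and the markdown-table heuristic are illustrative, not the paper's code.

def check_capitalized(answer: str) -> bool:
    """Level I: the first letter of each word is capitalized."""
    return all(w[0].isupper() for w in answer.split() if w[0].isalpha())

def check_keyword_and_period(answer: str) -> bool:
    """Level II additions: contains the keyword 'largest' and ends with a period."""
    return "largest" in answer.lower() and answer.rstrip().endswith(".")

def check_table_rows(answer: str, max_rows: int = 3) -> bool:
    """Level III addition: any markdown table has at most max_rows data rows."""
    lines = [ln for ln in answer.splitlines() if ln.strip().startswith("|")]
    if not lines:
        return True  # the constraint is conditional; no table satisfies it vacuously
    return len(lines) - 2 <= max_rows  # discount the header and separator rows

def check_word_count(answer: str, lo: int = 10, hi: int = 20) -> bool:
    """Level IV addition: total word count lies between lo and hi."""
    return lo <= len(answer.split()) <= hi

answer = "The Largest Animal Is The Blue Whale, Which Can Exceed Thirty Meters In Length."
checks = [check_capitalized, check_keyword_and_period, check_table_rows, check_word_count]
print(all(fn(answer) for fn in checks))  # -> True: all four levels satisfied
```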
The framework also includes an automated instruction generation pipeline that transforms raw instructions into constraint-rich prompts through three stages: constraint expansion (adding new constraints), conflict detection (ensuring the constraints do not contradict one another), and instruction rewriting (adapting the instruction to different presentation patterns).
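A toy skeleton of these three stages appears below to convey the flow. The constraint pool, conflict rules, and rewriting templates are invented for illustration and stand in for the paper’s richer versions.

```python
import random

# Illustrative skeleton of the pipeline's three stages: expansion,
# conflict detection, and rewriting. All data here is invented.

CONSTRAINT_POOL = [
    "The answer must end with a period.",
    "The answer must be between 10 and 20 words.",
    "The answer must be formatted as JSON.",
    "The answer must be a single plain-text sentence.",
]

# Known contradictory pairs (order-insensitive).
CONFLICTS = {
    frozenset({"The answer must be formatted as JSON.",
               "The answer must be a single plain-text sentence."}),
}

def expand(constraints: list[str], k: int = 1) -> list[str]:
    """Constraint expansion: attach k new constraints not already present."""
    candidates = [c for c in CONSTRAINT_POOL if c not in constraints]
    return constraints + random.sample(candidates, min(k, len(candidates)))

def has_conflict(constraints: list[str]) -> bool:
    """Conflict detection: reject sets containing a contradictory pair."""
    return any(frozenset({a, b}) in CONFLICTS
               for a in constraints for b in constraints if a != b)

def rewrite(instruction: str, constraints: list[str], pattern: str = "listing") -> str:
    """Instruction rewriting: render the constraints in a presentation pattern."""
    if pattern == "listing":
        bullets = "\n".join(f"- {c}" for c in constraints)
        return f"{instruction}\nConstraints:\n{bullets}"
    return instruction + " " + " ".join(constraints)  # "incorporation" pattern

constraints = expand([], k=2)
while has_conflict(constraints):   # resample until the constraint set is consistent
    constraints = expand([], k=2)
print(rewrite("What is the largest animal?", constraints))
```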
Using this pipeline, the researchers generated 1,200 diverse instruction-following test samples and evaluated 19 LLMs from 7 model families. Performance varied significantly across constraint forms: models struggled most with the “listing” and “incorporation” presentation patterns, and accuracy degraded sharply with difficulty, with average performance dropping from 77.67% on Level I instructions to only 32.96% on Level IV.
The researchers then used their framework to create training data for reinforcement learning, significantly improving instruction-following ability without sacrificing general performance. Analysis linked these gains to shifts in the models’ attention patterns that enable better recognition of, and adherence to, constraints. The work demonstrates a promising path toward more reliable and controllable LLMs.
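The article does not spell out the reward design, but constraint satisfaction lends itself naturally to a verifiable reward: score an output by the fraction of its attached constraints that pass. The sketch below assumes that formulation; it is a plausible reading, not the paper’s confirmed reward function.

```python
from typing import Callable

# Assumed formulation: reward = fraction of constraints the output satisfies.
Checker = Callable[[str], bool]

def constraint_reward(output: str, checkers: list[Checker]) -> float:
    """Return the satisfied-constraint fraction in [0, 1]."""
    if not checkers:
        return 0.0
    return sum(fn(output) for fn in checkers) / len(checkers)

# Example with two simple rule checkers (hypothetical, for illustration):
checkers = [
    lambda s: s.rstrip().endswith("."),    # ends with a period
    lambda s: 10 <= len(s.split()) <= 20,  # between 10 and 20 words
]
print(constraint_reward(
    "The blue whale is the largest animal ever known to have lived.", checkers))
# -> 1.0
```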