AI Papers Reader

Personalized digests of latest AI research


Beyond "It’s a Bird": Teaching AI to Stop Playing It Safe

If you show a photo of a rare Golden-winged Warbler to a state-of-the-art AI, there is a good chance the model will simply tell you it is a “bird.” While technically correct, this generic answer is often useless for specialists or anyone seeking detailed information. This tendency to “play it safe” has long plagued Large Multimodal Models (LMMs) in open-world settings, where they aren’t limited to a pre-defined list of choices.

New research from the University of Trento and Fondazione Bruno Kessler introduces a solution called SpeciaRL, a reinforcement learning framework designed to nudge AI models toward more specific, fine-grained classifications without increasing their error rates.

The Problem: Knowledge Without Ambition

The researchers discovered a curious paradox: current AI models already possess the “domain knowledge” required to be specific. When the researchers forced a model like Qwen2.5-VL to try 64 times to identify a single image, it would eventually produce the correct fine-grained name. However, in its default single-answer setting, the model typically falls back to a generic category on its very first attempt.
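The best-of-N probe described above can be sketched in a few lines. This is purely illustrative: `query_model` is a hypothetical stand-in for sampling an answer from an actual LMM, not an API from the paper.

```python
import random

def query_model(image, temperature=1.0):
    # Hypothetical placeholder: a real call would sample an answer
    # from an LMM such as Qwen2.5-VL with nonzero temperature.
    return random.choice(["bird", "warbler", "Golden-winged Warbler"])

def knows_specific_name(image, label, n=64):
    """Return True if any of n sampled answers matches the fine-grained label.

    This probes whether the specific name exists somewhere in the model's
    output distribution, even if a single greedy answer stays generic.
    """
    return any(query_model(image) == label for _ in range(n))
```

The point of the probe is that a model can "know" a name in distribution while almost never producing it on the first try.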

The paper identifies six levels of “specificity” that an AI can provide:

  1. Wrong: Calling a Samoyed a “cat.”
  2. Abstain: Saying “I don’t know.”
  3. Generic: Calling a Samoyed a “dog.”
  4. Less Specific: Calling a Samoyed a “working dog.”
  5. Specific: Correctly identifying it as a “Samoyed.”
  6. More Specific: Identifying a specific subtype (rare in general datasets).
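The six levels can be made concrete with a small scoring function. The sketch below is not the authors' implementation: the tiny taxonomy, the abstention phrases, and the ancestor ordering are all illustrative assumptions, and the rare "more specific" level (subtypes below the ground-truth label) is omitted for brevity.

```python
# Illustrative taxonomy: ancestors of each fine-grained label, ordered from
# most specific ("less specific" answers) down to the generic root category.
TAXONOMY = {
    "Samoyed": ["working dog", "dog"],
}

def specificity_level(prediction, ground_truth):
    """Classify a free-form answer into one of the paper's specificity levels."""
    pred = prediction.strip().lower()
    if pred in {"i don't know", "unsure"}:
        return "abstain"
    if pred == ground_truth.lower():
        return "specific"
    ancestors = [a.lower() for a in TAXONOMY.get(ground_truth, [])]
    if pred in ancestors:
        # The root ancestor counts as merely "generic"; anything between the
        # root and the true label is "less specific".
        return "generic" if pred == ancestors[-1] else "less_specific"
    return "wrong"

print(specificity_level("Samoyed", "Samoyed"))      # specific
print(specificity_level("working dog", "Samoyed"))  # less_specific
print(specificity_level("dog", "Samoyed"))          # generic
print(specificity_level("cat", "Samoyed"))          # wrong
```

In the actual framework, this judgment is made by an LLM judge rather than string matching, since open-world answers rarely line up with a fixed taxonomy.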

Existing training methods often backfire. If you simply “prompt” an AI to be specific, it starts guessing wildly, leading to more incorrect answers. It’s like a student who, when told “be specific or else,” begins making up facts to avoid being vague.

SpeciaRL: A Dynamic Reward System

To solve this, the team developed SpeciaRL. This framework uses a “judge” model—a powerful Large Language Model—to evaluate the AI’s guesses during training.

The breakthrough lies in what the researchers call a “dynamic, sample-wise reward.” Instead of having a fixed target for every image, SpeciaRL looks at the model’s best possible performance for that specific image across multiple attempts (rollouts). If the model is capable of identifying a “Cessna 172RG,” SpeciaRL will only reward it for being that specific. However, if the image is so blurry that the model can only ever reliably identify “Aircraft,” it won’t be punished for sticking to that broader category.

This “anchored” reward system encourages the model to reach the ceiling of its own capabilities without pushing it over the edge into hallucination or guessing.
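One reading of this anchored reward can be sketched as follows, assuming the six levels are mapped to integers 0 (wrong) through 5 (more specific) and a judge has already assigned a level to each rollout. The exact reward shaping here (a ratio to the per-sample ceiling) is an illustrative choice, not the paper's formula.

```python
def anchored_rewards(rollout_levels):
    """Reward each rollout relative to this sample's own ceiling.

    The target is not a fixed global level but the best level the model
    achieved for this image across its rollouts, so a genuinely ambiguous
    image never punishes the model for staying generic.
    """
    ceiling = max(rollout_levels)  # best level reached on this sample
    rewards = []
    for level in rollout_levels:
        if level == ceiling:
            rewards.append(1.0)           # reached its own ceiling
        elif level >= 2:                  # correct but more generic than needed
            rewards.append(level / ceiling)
        else:                             # wrong (0) or abstained (1)
            rewards.append(0.0)
    return rewards

# Eight rollouts for one image: 0 = wrong, 1 = abstain, 2 = generic, 4 = specific.
print(anchored_rewards([2, 2, 4, 2, 0, 4, 2, 1]))
# [0.5, 0.5, 1.0, 0.5, 0.0, 1.0, 0.5, 0.0]
```

Note how the same "generic" answer earns full reward when the ceiling is generic, but only partial reward when the model has shown it can do better on that image.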

Concrete Results: From Birds to Boeings

The researchers tested SpeciaRL on diverse datasets, including flowers, food, and high-precision categories like StanfordCars and FGVCAircraft.

In one example, the base model looked at a photo of a specific cat and identified it as “Cat.” After training with SpeciaRL, the model correctly identified it as a “Birman.” Similarly, where the base model saw an “Aircraft,” SpeciaRL correctly deduced it was a “Boeing 737” by looking at the livery and engine configuration during its reasoning process.

Crucially, the model was trained primarily on a dataset of birds but showed improved specificity across entirely different domains, such as cars and planes. This suggests that SpeciaRL isn’t just teaching the model new names; it is teaching the model a better behavior—how to use the evidence it sees to reach the most precise conclusion possible.

By striking this delicate balance between being right and being specific, SpeciaRL moves us closer to AI assistants that can distinguish between a generic “salad” and a “Greek Salad” with the confidence of an expert.