AI Papers Reader

Personalized digests of latest AI research

View on GitHub

LEGION: AI Model Learns to Spot and Explain Synthetic Image Flaws, Guides Image Refinement

Generative AI has made it easier than ever to create realistic images, but these advancements come with the risk of misinformation and misuse. Now, researchers are developing tools to combat this threat. A new study introduces LEGION, a multimodal large language model (MLLM) designed to detect and explain artifacts in fully synthetic images. The team also introduced SynthScars, a dataset of 12,236 synthetic images annotated by experts that LEGION leverages for training.

Unlike existing methods that often focus on detecting manipulated images or rely on outdated generative techniques, LEGION aims for a deeper understanding of synthetic image generation. LEGION integrates artifact detection, segmentation, and explanation into a single framework, providing a powerful tool for identifying subtle inconsistencies that betray a synthetic origin.

“Current methods often lack artifact-level textual interpretability,” explains Weijia Li, one of the study’s authors. “They’re overly focused on image manipulation detection, whereas we wanted to create a system that could not only identify the flaws, but also explain why the image is synthetic.”

Here’s how it works. When fed an image, LEGION can highlight specific regions containing artifacts and provide detailed textual explanations. For example, LEGION might identify a distorted shadow in a digitally generated living room, explaining that the shadow’s angle is inconsistent with the apparent light source in the scene, indicating a violation of physical laws. The explanation also includes segmentation masks that highlight the location of the shadow in question.

The researchers go a step further by exploring LEGION’s potential as a controller for refining image generation. Instead of just acting as a “defender” identifying synthetic images, LEGION can guide image generators to produce more realistic outputs. They introduce two iterative optimization pipelines:

  1. Image Regeneration: LEGION’s artifact explanations iteratively refine the prompt used to generate the image. For example, if LEGION detects a cartoonish style in an initial rendering, the prompt could be revised to include constraints like “natural lighting” and “realistic style,” leading to a more photorealistic image.

  2. Image Inpainting: This process selectively refines areas containing artifacts. If LEGION detects a distorted finger in a generated image, it can guide an inpainting model to correct the specific region, progressively reducing the artifact’s visibility and enhancing the image’s overall authenticity.

Experiments showed LEGION outperforming existing methods across multiple benchmarks, including a significant improvement over the second-best traditional expert on the SynthScars dataset. Moreover, images refined under LEGION’s guidance exhibited stronger alignment with human preferences. “By integrating artifact detection with image generation, we can foster controlled advancement in this field,” says Conghui He, another author on the study.

The researchers will release the code, model, and dataset, offering the broader AI community a new tool to understand, detect, and improve synthetic image generation. This work represents a step towards a more transparent and responsible future for generative AI technologies.