AI Papers Reader

Personalized digests of the latest AI research

OutfitAnyone: A new AI model lets anyone try on any outfit

Virtual try-on (VTON) technology is changing how we shop for clothes. It allows users to see how a garment would look on them without ever having to physically try it on. But current VTON methods often struggle with generating realistic and detailed results, especially when dealing with different poses, body shapes, or clothing styles.

A new AI model called OutfitAnyone, developed by researchers at Alibaba and the University of Science and Technology of China (USTC), aims to overcome these limitations. OutfitAnyone uses a two-stream conditional diffusion model to generate high-quality virtual try-ons that are markedly more realistic than those of earlier methods.

Here’s how OutfitAnyone works:

  1. Two-Stream Conditional Diffusion Model: The model takes two inputs, an image of the person and an image of the garment, and processes each through its own network stream. The streams merge in the model’s later layers, where the garment features are fused into the person’s feature representation. This lets the model capture the clothing’s textures and patterns and apply them realistically to the person (see the first sketch after this list).

  2. Classifier-Free Guidance: OutfitAnyone uses classifier-free guidance to steer the generation process. At each denoising step, the model makes both a garment-conditioned and an unconditional prediction and extrapolates between them, so the output follows the input clothing image without relying on an external classifier. This helps the generated images accurately reflect the clothing’s style and details (second sketch below).

  3. Background and Lighting Retention: OutfitAnyone preserves the original image’s background and lighting so the generated try-on stays consistent with the source photo. The clothing area is erased from the original image, and only that region is filled in with generated pixels, leaving everything outside it untouched (third sketch below).

  4. Pose and Shape Guider: OutfitAnyone incorporates a pose and shape guider so the generated try-on reflects the person’s actual pose and body shape. The model is given explicit body information, such as DensePose maps or SMPL parameters, as additional conditioning (fourth sketch below).

  5. Detail Refiner: OutfitAnyone uses a detail refiner to enhance the realism of the generated try-ons. A second-stage network takes the coarse output of the main model and sharpens clothing and skin textures, a classic coarse-to-fine design (final sketch below).
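
To make step 1 concrete, here is a minimal PyTorch sketch of a two-stream design: each input gets its own encoder, and garment features are fused into the person stream before decoding. The paper does not publish its code, so the module names, layer choices, and dimensions here are illustrative stand-ins (the real model fuses features inside a diffusion UNet, typically via attention):

```python
import torch
import torch.nn as nn

class TwoStreamFusion(nn.Module):
    """Toy two-stream block: person and garment images are encoded
    separately, then garment features are fused into the person stream."""

    def __init__(self, channels: int = 64):
        super().__init__()
        # One lightweight conv encoder per input stream.
        self.person_encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.SiLU())
        self.garment_encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.SiLU())
        # Fusion: concatenate the two feature maps and mix them,
        # standing in for the attention-based merging in the real model.
        self.fuse = nn.Conv2d(2 * channels, channels, 1)
        self.decoder = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, person: torch.Tensor, garment: torch.Tensor) -> torch.Tensor:
        p = self.person_encoder(person)    # person-stream features
        g = self.garment_encoder(garment)  # garment-stream features
        fused = self.fuse(torch.cat([p, g], dim=1))
        return self.decoder(fused)

model = TwoStreamFusion()
person = torch.randn(1, 3, 64, 64)   # dummy person image
garment = torch.randn(1, 3, 64, 64)  # dummy garment image
print(model(person, garment).shape)  # torch.Size([1, 3, 64, 64])
```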
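
Step 2 uses the standard classifier-free guidance rule: the denoiser predicts noise twice, with and without the garment condition, and extrapolates from the unconditional toward the conditional prediction. The sketch below shows only this blending rule; `toy_denoiser` and `guidance_scale` are hypothetical stand-ins for the real network and its tuning:

```python
import torch

def cfg_noise(denoiser, x_t, t, cond, guidance_scale: float = 3.0):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the garment-conditioned one."""
    eps_uncond = denoiser(x_t, t, cond=None)   # condition dropped
    eps_cond = denoiser(x_t, t, cond=cond)     # garment-conditioned
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy denoiser so the snippet runs end to end: conditioning just
# shifts the prediction, mimicking a condition-aware network.
def toy_denoiser(x_t, t, cond=None):
    return x_t * 0.1 + (0.0 if cond is None else cond.mean())

x_t = torch.randn(1, 3, 8, 8)
cond = torch.randn(1, 3, 8, 8)
print(cfg_noise(toy_denoiser, x_t, t=10, cond=cond).shape)
```

In practice, the guidance scale trades fidelity against diversity: higher values make the output follow the clothing image more literally.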
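
Step 3 is essentially inpainting-style compositing: generated pixels replace only the erased clothing region, so the background and lighting outside the mask come straight from the original photo. A NumPy sketch, with an assumed mask convention (values in [0, 1], 1 inside the garment area):

```python
import numpy as np

def composite_tryon(original: np.ndarray,
                    generated: np.ndarray,
                    clothing_mask: np.ndarray) -> np.ndarray:
    """Keep the original pixels everywhere except the clothing region,
    which is filled from the generated try-on image."""
    mask = clothing_mask[..., None]  # broadcast over RGB channels
    return original * (1.0 - mask) + generated * mask

h, w = 64, 48
original = np.random.rand(h, w, 3)
generated = np.random.rand(h, w, 3)
mask = np.zeros((h, w))
mask[20:50, 10:40] = 1.0  # toy garment region
out = composite_tryon(original, generated, mask)
assert np.allclose(out[0, 0], original[0, 0])  # background untouched
```

Because this is a pixel-wise blend, everything outside the mask is guaranteed to match the source photo exactly.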
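
One common way to realize step 4, and plausibly close to what the pose and shape guider does, is to stack the body maps as extra input channels so every layer can condition on them. The channel counts and the use of a DensePose-style map below are assumptions for illustration:

```python
import torch
import torch.nn as nn

class PoseShapeConditionedNet(nn.Module):
    """Toy generator that sees the person image plus pose/shape maps
    (e.g., a DensePose rendering) stacked as extra input channels."""

    def __init__(self, image_channels: int = 3, pose_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(image_channels + pose_channels, 32, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(32, image_channels, 3, padding=1),
        )

    def forward(self, image: torch.Tensor, pose_map: torch.Tensor) -> torch.Tensor:
        # The pose map rides along as additional channels, so every
        # layer can condition its output on body pose and shape.
        return self.net(torch.cat([image, pose_map], dim=1))

net = PoseShapeConditionedNet()
image = torch.randn(1, 3, 64, 64)
pose_map = torch.randn(1, 3, 64, 64)  # stand-in for a DensePose map
print(net(image, pose_map).shape)
```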
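
Finally, step 5 follows the classic coarse-to-fine pattern: the main model produces a globally correct but soft image, and a lightweight second stage polishes it. The residual design below is a common choice, not necessarily the paper's; both stages are stand-ins:

```python
import torch
import torch.nn as nn

class Refiner(nn.Module):
    """Toy refiner: predicts a residual correction to the coarse image,
    a common way to sharpen textures without redoing global structure."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.SiLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, coarse: torch.Tensor) -> torch.Tensor:
        return coarse + self.net(coarse)  # refined = coarse + residual

coarse_tryon = torch.randn(1, 3, 64, 64)  # output of the main model
refined = Refiner()(coarse_tryon)
print(refined.shape)  # same resolution, enhanced details
```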

OutfitAnyone achieves impressive results. It can accurately generate virtual try-ons for a wide range of clothing styles, body shapes, and poses. The model can handle different backgrounds and lighting conditions, and it can even create try-ons for animated characters.

The research paper shows examples of OutfitAnyone’s capabilities across clothing styles such as dresses, jackets, and shirts, and demonstrates that the model accurately reflects different body shapes, from children to adults.

OutfitAnyone is a significant advancement in VTON technology. The model’s ability to handle complex scenarios and generate highly realistic results makes it a valuable tool for online retailers, fashion designers, and anyone looking for a more personalized shopping experience.