AI Papers Reader

Personalized digests of latest AI research


Expressive Whole-Body 3D Gaussian Avatars from Short Videos

Recent advances in computer vision have made it possible to create 3D human avatars from casually captured videos. However, most such avatars support only body movements and lack facial expressions and hand motions. In this paper, the researchers introduce ExAvatar, an expressive 3D human avatar that can be animated with a variety of facial expressions, hand gestures, and body poses, learned from just a short monocular video.

The key innovation of ExAvatar is a hybrid representation that combines a standard 3D mesh model (SMPL-X) with a 3D Gaussian splatting (3DGS) model. SMPL-X provides a detailed model of the human body’s structure and pose, while 3DGS represents the avatar’s appearance using a collection of 3D Gaussian distributions. These Gaussians are placed on the surface of the mesh, with each one corresponding to a vertex. This hybrid representation is key to ExAvatar’s ability to produce highly expressive avatars.
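To make the hybrid representation concrete, here is a minimal sketch of how per-vertex Gaussian centers can be driven by the mesh's pose. It assumes SMPL-X-style linear blend skinning (per-joint rigid transforms blended by skinning weights); the function name and shapes are illustrative, not the paper's actual code.

```python
import numpy as np

def skin_gaussian_means(rest_means, joint_transforms, skin_weights):
    """Deform per-vertex Gaussian centers with linear blend skinning (LBS).

    rest_means:       (V, 3) Gaussian centers at the template mesh vertices
    joint_transforms: (J, 4, 4) per-joint rigid transforms for the target pose
    skin_weights:     (V, J) skinning weights shared with the SMPL-X mesh
    """
    # Homogeneous coordinates so the 4x4 transforms apply directly.
    homo = np.concatenate([rest_means, np.ones((len(rest_means), 1))], axis=1)
    # Blend the joint transforms per vertex, then move each Gaussian center.
    blended = np.einsum("vj,jab->vab", skin_weights, joint_transforms)
    posed = np.einsum("vab,vb->va", blended, homo)
    return posed[:, :3]

# Toy example: two Gaussians, two joints; joint 1 translates by +1 on x.
rest = np.zeros((2, 3))
T = np.stack([np.eye(4), np.eye(4)])
T[1, 0, 3] = 1.0
W = np.array([[1.0, 0.0], [0.0, 1.0]])
print(skin_gaussian_means(rest, T, W))  # first Gaussian stays, second shifts
```

Because each Gaussian inherits a vertex's skinning weights, posing the SMPL-X body automatically carries the appearance model along with it.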

To compensate for the limited range of poses and facial expressions in a short, casually captured video, ExAvatar applies connectivity-based regularizers that exploit the mesh topology shared between the Gaussians and the SMPL-X surface, reducing artifacts when the avatar is driven by novel poses and expressions.
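A regularizer of this connectivity-based kind can be sketched as a Laplacian-style smoothness term: each per-vertex Gaussian offset is penalized for deviating from the average offset of its mesh neighbors. This is an illustrative sketch under that assumption, not the paper's exact loss.

```python
import numpy as np

def laplacian_offset_loss(offsets, edges):
    """Laplacian-style smoothness penalty over per-vertex Gaussian offsets.

    offsets: (V, 3) learned offsets of the Gaussians from their vertices
    edges:   (E, 2) vertex index pairs taken from the mesh topology
    Returns the mean squared deviation of each offset from the mean
    offset of its neighbors; smooth offset fields score near zero.
    """
    neighbor_sum = np.zeros_like(offsets)
    degree = np.zeros(len(offsets))
    for a, b in edges:
        neighbor_sum[a] += offsets[b]
        neighbor_sum[b] += offsets[a]
        degree[a] += 1
        degree[b] += 1
    neighbor_mean = neighbor_sum / np.maximum(degree, 1)[:, None]
    return float(np.mean(np.sum((offsets - neighbor_mean) ** 2, axis=1)))

# A uniform offset field is perfectly smooth, so the penalty is zero.
tri_edges = [(0, 1), (1, 2), (0, 2)]
print(laplacian_offset_loss(np.ones((3, 3)), tri_edges))  # → 0.0
```

Penalties like this let the sparse training video constrain even Gaussians that are rarely observed, since each one is tied to its neighbors through the mesh connectivity.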

The researchers evaluate ExAvatar on several challenging datasets, demonstrating superior results compared to previous methods. For instance, ExAvatar produces much more realistic and detailed representations of faces and hands than prior techniques, especially in novel poses.


ExAvatar is a promising step towards the creation of more expressive and realistic 3D human avatars. It has the potential to be used in a variety of applications, such as virtual reality, video games, and animation.