Outpainting Videos to Higher Resolution with Extensive Content Generation
📄 Full Paper
💬 Ask
Researchers at Tencent AI have developed a novel video outpainting method called Follow-Your-Canvas that enables users to generate outpainting results in higher resolution with extensive content, while maintaining consistency of spatial layout, temporal changes, and overall aesthetics. Their work was published in the arXiv pre-print repository.
Prior methods for video outpainting often struggled with generating low-quality content when attempting to significantly expand the size of a video. They also faced limitations imposed by GPU memory. To overcome these challenges, the researchers at Tencent AI leveraged diffusion models to create Follow-Your-Canvas. This new method incorporates two core designs:
-
Spatial Windows: The video is divided into spatial windows, allowing the outpainting task to be distributed across multiple GPUs. This approach enables the method to handle videos of any size and resolution without being constrained by GPU memory.
-
Layout Encoder and Relative Region Embedding: The source video and its relative positional relation are injected into the generation process of each window. This ensures that the generated spatial layout harmonizes with the source video, thus maintaining spatial and temporal consistency.
Follow-Your-Canvas excels in large-scale video outpainting. For example, the authors show that it can outpaint videos from 512x512 to 1152x2048, a 9x increase in resolution, while producing high-quality and aesthetically pleasing results.
The research team conducted experiments to demonstrate the effectiveness of Follow-Your-Canvas. They compared their method with other existing video outpainting methods, including M3DDM and MOTIA, and found that their method consistently achieved the best performance across all metrics and outpainting settings.
The Follow-Your-Canvas method represents a significant advancement in the field of video outpainting. It addresses critical limitations of prior approaches and opens up new possibilities for high-resolution video generation. The research team has made their code available on GitHub, allowing other researchers to build upon their work and explore its potential applications.
Chat about this paper
To chat about this paper, you'll need a free Gemini API key from Google AI Studio.
Your API key will be stored securely in your browser's local storage.