Episode 529

Enhance-A-Video: Better Generated Video for Free

February 13, 202520m 31s

Audio is streamed directly from the publisher (media.transistor.fm) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.

Original episode page

Show Notes

🤗 Upvotes: 14 | cs.CV

Authors:
Yang Luo, Xuanlei Zhao, Mengzhao Chen, Kaipeng Zhang, Wenqi Shao, Kai Wang, Zhangyang Wang, Yang You

Title:
Enhance-A-Video: Better Generated Video for Free

Arxiv:
http://arxiv.org/abs/2502.07508v1

Abstract:
DiT-based video generation has achieved remarkable results, but research into enhancing existing models remains relatively unexplored. In this work, we introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos, named Enhance-A-Video. The core idea is enhancing the cross-frame correlations based on non-diagonal temporal attention distributions. Thanks to its simple design, our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning. Across various DiT-based video generation models, our approach demonstrates promising improvements in both temporal consistency and visual quality. We hope this research can inspire future explorations in video generation enhancement.

← All episodes of Daily Paper Cast