Continuous Speculative Decoding for Autoregressive Image Generation
Episode 108

Daily Paper Cast

November 21, 2024 | 22m 37s

Show Notes

🤗 Paper Upvotes: 13 | cs.CV

Authors:
Zili Wang, Robert Zhang, Kun Ding, Qi Yang, Fei Li, Shiming Xiang

Title:
Continuous Speculative Decoding for Autoregressive Image Generation

Arxiv:
http://arxiv.org/abs/2411.11925v1

Abstract:
Continuous-valued Autoregressive (AR) image generation models have demonstrated notable superiority over their discrete-token counterparts, showcasing considerable reconstruction quality and higher generation fidelity. However, the computational demands of the autoregressive framework result in significant inference overhead. While speculative decoding has proven effective in accelerating Large Language Models (LLMs), its adaptation to continuous-valued visual autoregressive models remains unexplored. This work generalizes the speculative decoding algorithm from discrete tokens to continuous space. By analyzing the intrinsic properties of the output distribution, we establish a tailored acceptance criterion for the diffusion distributions prevalent in such models. To overcome the inconsistency that occurs in speculative decoding output distributions, we introduce denoising trajectory alignment and token pre-filling methods. Additionally, we identify a hard-to-sample distribution in the rejection phase. To mitigate this issue, we propose a meticulous acceptance-rejection sampling method with a proper upper bound, thereby circumventing complex integration. Experimental results show that our continuous speculative decoding achieves a 2.33× speed-up on off-the-shelf models while maintaining the output distribution. Code will be available at https://github.com/MarkXCloud/CSpD
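
Illustration:
The accept-or-resample idea described in the abstract can be shown with a toy sketch. This is not the authors' implementation. It assumes one-dimensional Gaussians with known densities standing in for the diffusion-based token distributions, and the names gaussian_pdf and speculative_step are hypothetical. A draft sample x from q is accepted with probability min(1, p(x)/q(x)). On rejection, a replacement is drawn from the residual distribution proportional to max(0, p(x) - q(x)) by acceptance-rejection sampling, using the target density p as the upper bound so that no normalizing integral is needed.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian (toy stand-in for a model's output density)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def speculative_step(rng, draft, target, max_tries=10_000):
    """One draft-then-verify step.

    draft and target are (mean, std) tuples (an assumption of this sketch).
    Returns (sample, accepted_flag); the sample follows the target distribution.
    """
    # Draft proposes a sample x ~ q.
    x = rng.gauss(*draft)
    q_x = gaussian_pdf(x, *draft)
    p_x = gaussian_pdf(x, *target)

    # Accept the draft sample with probability min(1, p(x) / q(x)).
    if rng.random() < min(1.0, p_x / q_x):
        return x, True

    # Rejected: sample from the residual, proportional to max(0, p - q).
    # Since max(0, p - q) <= p, the target p is a valid envelope, so we
    # propose y ~ p and keep it with probability max(0, p(y) - q(y)) / p(y).
    for _ in range(max_tries):
        y = rng.gauss(*target)
        p_y = gaussian_pdf(y, *target)
        q_y = gaussian_pdf(y, *draft)
        if rng.random() * p_y < max(0.0, p_y - q_y):
            return y, False
    raise RuntimeError("residual sampling did not terminate within max_tries")


if __name__ == "__main__":
    rng = random.Random(0)
    draft, target = (0.0, 1.0), (1.0, 0.5)
    results = [speculative_step(rng, draft, target) for _ in range(20_000)]
    samples = [s for s, _ in results]
    accept_rate = sum(ok for _, ok in results) / len(results)
    mean = sum(samples) / len(samples)
    print(f"acceptance rate: {accept_rate:.2f}, sample mean: {mean:.2f}")
```

In this toy setting the combined samples follow the target distribution regardless of how good the draft is; a closer draft only raises the acceptance rate, which is where the speed-up would come from.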