Alias-Free Latent Diffusion Models

1 S-Lab, Nanyang Technological University
2 Wangxuan Institute of Computer Technology, Peking University
CVPR 2025

Motivation

We find that neither the VAE nor the denoising network in latent diffusion models (LDMs) is equivariant to fractional shifts. We propose an alias-free framework that improves the fractional-shift equivariance of LDMs, and we demonstrate its effectiveness in several applications, including video editing, frame interpolation, super-resolution, and normal estimation.
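To make "fractional shift equivariance" concrete, here is a minimal NumPy sketch of how the property could be measured on a toy operator. It is an illustration only, not the authors' code: the function names (`fractional_shift`, `equivariance_error`, `naive_downsample`, `lowpass_downsample`) are hypothetical, the shift is assumed to be circular and implemented with a Fourier phase ramp, and a stride-2 downsampler stands in for a network layer.

```python
import numpy as np

def fractional_shift(x, dy, dx):
    """Circularly shift a 2-D array by (dy, dx) pixels, possibly non-integer,
    by multiplying its spectrum with a linear phase ramp."""
    h, w = x.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, cycles per pixel
    phase = np.exp(-2j * np.pi * (fy * dy + fx * dx))
    return np.fft.ifft2(np.fft.fft2(x) * phase).real

def equivariance_error(f, x, dy, dx, scale=1.0):
    """Mean-squared gap between f(shift(x)) and shift(f(x)).
    `scale` is the output/input resolution ratio of f (0.5 for stride 2)."""
    shifted_then_mapped = f(fractional_shift(x, dy, dx))
    mapped_then_shifted = fractional_shift(f(x), dy * scale, dx * scale)
    return float(np.mean((shifted_then_mapped - mapped_then_shifted) ** 2))

def naive_downsample(x):
    """Stride-2 subsampling with no anti-aliasing filter (toy stand-in)."""
    return x[::2, ::2]

def lowpass_downsample(x):
    """Ideal low-pass below half the Nyquist rate, then stride-2 subsampling."""
    h, w = x.shape
    fy = np.abs(np.fft.fftfreq(h))[:, None]
    fx = np.abs(np.fft.fftfreq(w))[None, :]
    mask = (fy < 0.25) & (fx < 0.25)
    x_lp = np.fft.ifft2(np.fft.fft2(x) * mask).real
    return x_lp[::2, ::2]

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))

# A 1-pixel input shift corresponds to a 0.5-pixel (fractional) output shift.
err_naive = equivariance_error(naive_downsample, image, 0.0, 1.0, scale=0.5)
err_alias_free = equivariance_error(lowpass_downsample, image, 0.0, 1.0, scale=0.5)
print(err_naive, err_alias_free)
```

In this toy setting the unfiltered downsampler produces a large error, while the band-limited version is equivariant up to floating-point precision. The same measurement could in principle be applied to a VAE encoder or a denoising network by passing it as `f` with the appropriate `scale`.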

Results

Shifting I2SB Super-resolution Input


Shifting YOSO Normal Estimation Input

Video Editing



Warping-equivariant Editing

Our alias-free model produces consistent editing results for videos with smooth motion.



As discussed in the limitations section, our method suffers from issues similar to those of flow-based editing. For example, in the following video, the background flow is hard to obtain implicitly, which causes flickering in the outputs.

Splatting and Interpolation

The flickering is caused by overlapping contributions in summation splatting; it can be addressed with depth estimation and softmax splatting.
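The difference between the two splatting schemes can be sketched on a 1-D signal. This is a simplified illustration rather than the implementation used for the results on this page: the helper names are hypothetical, source pixels are assumed to land on the nearest target pixel (bilinear kernels are the more common choice), and the importance score is assumed to come from something like negative estimated depth.

```python
import numpy as np

def splat(values, flow, weights=None):
    """Forward-warp a 1-D signal: source pixel i is accumulated at
    round(i + flow[i]). Nearest-neighbour targets are a simplification."""
    n = values.shape[0]
    if weights is None:
        weights = np.ones(n)
    target = np.rint(np.arange(n) + flow).astype(int)
    valid = (target >= 0) & (target < n)
    out = np.zeros(n)
    # np.add.at accumulates correctly when several sources share a target.
    np.add.at(out, target[valid], (weights * values)[valid])
    return out

def summation_splat(values, flow):
    """Overlapping sources are simply added, which brightens overlaps."""
    return splat(values, flow)

def softmax_splat(values, flow, importance):
    """Overlapping sources are blended with softmax weights exp(importance),
    so the most important (e.g. closest) source dominates."""
    w = np.exp(importance - importance.max())  # subtract max for stability
    numerator = splat(values, flow, w)
    denominator = splat(np.ones_like(values), flow, w)
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator > 0)

values = np.array([1.0, 2.0, 3.0, 4.0])
flow = np.array([1.0, 0.0, 0.0, 0.0])        # pixel 0 moves onto pixel 1
importance = np.array([5.0, 0.0, 0.0, 0.0])  # assumed: pixel 0 is closest

summed = summation_splat(values, flow)
blended = softmax_splat(values, flow, importance)
print(summed, blended)
```

At the overlapping target, summation splatting adds both source values, whereas the softmax-weighted version stays close to the value of the higher-importance source. Pixels that receive no contribution are left at zero in both cases and would still need to be filled in.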

Ablation Study

Shifting VAE Latent


Shifting LDM Noisy Latent

BibTeX


    @inproceedings{zhou2025afldm,
      title = {Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space},
      author = {Zhou, Yifan and Xiao, Zeqi and Yang, Shuai and Pan, Xingang},
      booktitle = {CVPR},
      year = {2025}
    }