Overview #

Image models are computationally expensive.
Training them often takes hundreds of GPU days.
Evaluating a trained model is also costly in both time and memory.
The key contribution of this paper is to reduce these computational demands while maintaining the performance of diffusion models.
When training on higher-resolution images, it is critical to adjust the noise schedule so that more noise is added.
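
To make the noise-schedule point concrete, here is a minimal sketch of one way to add more noise at higher resolution. It is not taken from the paper discussed in these notes: it assumes a cosine schedule and shifts its log signal-to-noise ratio by `2 * log(base_resolution / resolution)`, a rule proposed in other work on high-resolution diffusion. The function names and the base resolution of 64 are illustrative assumptions.

```python
import math


def cosine_logsnr(t: float) -> float:
    """Log-SNR of a cosine schedule at time t in (0, 1).

    With alpha = cos(pi*t/2) and sigma = sin(pi*t/2), the SNR is
    alpha^2 / sigma^2, so log-SNR = -2 * log(tan(pi*t/2)).
    """
    return -2.0 * math.log(math.tan(math.pi * t / 2.0))


def shifted_logsnr(t: float, resolution: int, base_resolution: int = 64) -> float:
    """Cosine log-SNR shifted for a target resolution (assumed shift rule).

    For resolution > base_resolution the shift is negative, which lowers
    the SNR at every t, i.e. more noise is added at the same timestep.
    """
    shift = 2.0 * math.log(base_resolution / resolution)
    return cosine_logsnr(t) + shift


def alpha_sigma(logsnr: float) -> tuple[float, float]:
    """Convert a log-SNR to variance-preserving (alpha, sigma)."""
    alpha_sq = 1.0 / (1.0 + math.exp(-logsnr))  # sigmoid(logsnr)
    return math.sqrt(alpha_sq), math.sqrt(1.0 - alpha_sq)


if __name__ == "__main__":
    # Compare the noise level at the schedule midpoint for several resolutions.
    for res in (64, 128, 256, 512):
        alpha, sigma = alpha_sigma(shifted_logsnr(0.5, res))
        print(f"resolution={res:4d}  alpha={alpha:.3f}  sigma={sigma:.3f}")
```

At the same timestep, `sigma` grows with resolution, which is the sense in which the schedule "adds more noise" for larger images.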

Training #

Training can be roughly divided into two stages:
- Perceptual Compression - removes high-frequency details but still learns little semantic variation
- Semantic Compression - the model learns the semantic composition of the data
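
A rough way to see why the perceptual compression stage saves compute is to count values before and after encoding. The sketch below assumes the first-stage autoencoder downsamples each spatial dimension by a factor `f` and outputs `latent_channels` channels. The specific values `f=8` and `latent_channels=4` are illustrative assumptions, not figures from these notes, and the helper names are hypothetical.

```python
def latent_shape(height: int, width: int, f: int, latent_channels: int) -> tuple[int, int, int]:
    """Shape (H, W, C) of the latent produced by an autoencoder that
    downsamples each spatial dimension by a factor f (assumed setup)."""
    if height % f or width % f:
        raise ValueError("height and width must be divisible by f")
    return height // f, width // f, latent_channels


def compression_ratio(height: int, width: int, f: int,
                      latent_channels: int, image_channels: int = 3) -> float:
    """How many times fewer values the diffusion model has to process
    when it works on the latent instead of the raw pixels."""
    h, w, c = latent_shape(height, width, f, latent_channels)
    return (height * width * image_channels) / (h * w * c)


if __name__ == "__main__":
    print(latent_shape(512, 512, f=8, latent_channels=4))       # (64, 64, 4)
    print(compression_ratio(512, 512, f=8, latent_channels=4))  # 48.0
```

Under these assumptions the diffusion model in the second stage operates on 48 times fewer values per image than a pixel-space model would.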