Adaptive Training Meets Progressive Scaling: Elevating Efficiency in Diffusion Models, by Wenhao Li and 5 other authors
Abstract: Diffusion models have demonstrated remarkable efficacy in various generative tasks owing to the predictive prowess of the denoising model. Current diffusion models employ a uniform denoising model across all timesteps. However, the inherent variations in data distributions at different timesteps lead to conflicts during training, constraining the potential of diffusion models. To address this challenge, we propose a novel two-stage divide-and-conquer training strategy termed TDC Training. It groups timesteps based on task similarity and difficulty and assigns a highly customized denoising model to each group, thereby enhancing the performance of diffusion models. Because the two-stage strategy avoids training each model separately, the total training cost is even lower than that of training a single unified denoising model. Additionally, we introduce Proxy-based Pruning to further customize the denoising models. This method transforms the pruning of diffusion models into a multi-round decision-making problem, enabling precise pruning. Our experiments validate the effectiveness of TDC Training, demonstrating an FID improvement of 1.5 on ImageNet64 over the original IDDPM while saving about 20% of computational resources.
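The abstract describes routing timesteps to per-group denoising models but gives no code. Below is a minimal PyTorch sketch of that divide-and-conquer routing idea only; the group boundaries, the placeholder expert networks, and the `GroupedDenoiser` name are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GroupedDenoiser(nn.Module):
    """Divide-and-conquer denoiser: timesteps are split into groups,
    each served by its own customized denoising network.
    Boundaries and sub-networks here are illustrative placeholders."""

    def __init__(self, num_timesteps: int = 1000,
                 boundaries=(250, 600)):  # hypothetical timestep cut points
        super().__init__()
        self.register_buffer("boundaries", torch.tensor(boundaries))
        # One placeholder denoising network per timestep group; the paper
        # would customize each group's architecture instead.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.SiLU(),
                          nn.Conv2d(64, 3, 3, padding=1))
            for _ in range(len(boundaries) + 1)
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Map each sample's timestep to its group index via the cut points.
        group = torch.bucketize(t, self.boundaries)
        out = torch.empty_like(x_t)
        for g, expert in enumerate(self.experts):
            mask = group == g
            if mask.any():
                out[mask] = expert(x_t[mask])  # per-group noise prediction
        return out

# Usage: a batch whose timesteps span several groups.
model = GroupedDenoiser()
x = torch.randn(8, 3, 64, 64)      # noisy images (e.g. ImageNet64)
t = torch.randint(0, 1000, (8,))   # sampled diffusion timesteps
eps_pred = model(x, t)             # each sample routed to its group's model
```

The routing itself is just a bucketed dispatch, so the per-group customization (and any pruning of each expert) can happen independently of the sampling loop.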
Submission history
From: Wenhao Li
[v1] Wed, 20 Dec 2023 03:32:58 UTC (838 KB)
[v2] Tue, 2 Jan 2024 02:41:04 UTC (838 KB)
[v3] Wed, 25 Dec 2024 02:55:14 UTC (1,296 KB)