[Submitted on 20 Nov 2024 (v1), last revised 21 Nov 2024 (this version, v2)]

Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
Andy Li and 3 other authors

Abstract: Pruning of deep neural networks has been an effective technique for reducing model size while preserving most of the performance of dense networks, which is crucial for deploying models on memory- and power-constrained devices. While recent sparse learning methods have shown promising performance up to moderate sparsity levels such as 95% and 98%, accuracy quickly deteriorates when pushing…
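For context on the sparsity figures the abstract cites, sparsity is conventionally the fraction of weights set to zero. Below is a minimal sketch of global magnitude pruning, a standard baseline in this literature, not the authors' method; the function name magnitude_prune and the toy model are illustrative assumptions.

```python
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float) -> None:
    """Zero out the `sparsity` fraction of smallest-magnitude weights, globally.

    Illustrative baseline only; the paper's "bag of tricks" is not reproduced here.
    """
    # Prune weight matrices only; biases (1-D parameters) are left dense.
    weights = [p for p in model.parameters() if p.dim() > 1]
    scores = torch.cat([p.detach().abs().flatten() for p in weights])
    k = int(sparsity * scores.numel())
    threshold = torch.kthvalue(scores, k).values  # k-th smallest magnitude overall
    with torch.no_grad():
        for p in weights:
            p.mul_((p.abs() > threshold).float())  # keep only weights above threshold

# Toy model pruned to the 98% regime mentioned in the abstract.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
magnitude_prune(model, sparsity=0.98)

total = sum(p.numel() for p in model.parameters() if p.dim() > 1)
zeros = sum((p == 0).sum().item() for p in model.parameters() if p.dim() > 1)
print(f"achieved sparsity: {zeros / total:.2%}")
```

At the extreme levels the paper targets (beyond 98%), such one-shot magnitude pruning typically degrades accuracy sharply, which is the failure mode the abstract describes.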