ZNorm: Z-Score Gradient Normalization Accelerating Skip-Connected Network Training without Architectural Modification



Authors: Juyoung Yun


Abstract: The rapid advancements in deep learning necessitate better training methods for deep neural networks (DNNs). As models grow in complexity, vanishing and exploding gradients impede performance, particularly in skip-connected architectures like Deep Residual Networks. We propose Z-Score Normalization for Gradient Descent (ZNorm), an innovative technique that adjusts only the gradients without modifying the network architecture to accelerate training and improve model performance. ZNorm normalizes the overall gradients, providing consistent gradient scaling across layers, effectively reducing the risks of vanishing and exploding gradients and achieving superior performance. Extensive experiments on CIFAR-10 and medical datasets confirm that ZNorm consistently outperforms existing methods under the same experimental settings. In medical imaging applications, ZNorm significantly enhances tumor prediction and segmentation accuracy, underscoring its practical utility. These findings highlight ZNorm’s potential as a robust and versatile tool for enhancing the training and effectiveness of deep neural networks, especially in skip-connected architectures, across various applications.
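
The abstract describes the idea only in words: gradients are z-score normalized before the update, and the architecture is left untouched. As a rough illustration, the sketch below applies the standard z-score, (g - mean) / (std + eps), to each layer's gradient separately and then takes a plain SGD step. The per-layer granularity, the `eps` term, the SGD update, and the names `znorm`, `sgd_step`, `layer1`, and `layer2` are assumptions made for this example; they are not taken from the paper, which may define the normalization differently.

```python
import math

def znorm(grad, eps=1e-12):
    """Z-score normalize a flat list of gradient values.

    Assumed formula: (g - mean) / (std + eps), using the population std.
    """
    n = len(grad)
    mean = sum(grad) / n
    var = sum((g - mean) ** 2 for g in grad) / n
    std = math.sqrt(var)
    return [(g - mean) / (std + eps) for g in grad]

def sgd_step(params, grads, lr=0.1):
    """One plain SGD update that uses normalized gradients for each layer."""
    new_params = {}
    for name, values in params.items():
        g_hat = znorm(grads[name])
        new_params[name] = [p - lr * g for p, g in zip(values, g_hat)]
    return new_params

# Two hypothetical layers whose raw gradients differ by many orders of magnitude.
params = {"layer1": [0.5, -0.2, 0.1, 0.3], "layer2": [1.0, 2.0, -1.0, 0.0]}
grads = {
    "layer1": [1e-6, -2e-6, 3e-6, 0.0],       # tiny ("vanishing") gradients
    "layer2": [400.0, -800.0, 1200.0, 0.0],   # huge ("exploding") gradients
}
updated = sgd_step(params, grads)
```

After normalization both layers receive updates of comparable size, since each normalized gradient has roughly zero mean and unit standard deviation regardless of its raw scale. That is one way to read the abstract's claim of "consistent gradient scaling across layers".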

Submission history

From: Juyoung Yun
[v1]
Fri, 2 Aug 2024 12:04:19 UTC (1,066 KB)
[v2]
Tue, 10 Sep 2024 01:06:31 UTC (946 KB)
[v3]
Wed, 11 Sep 2024 05:44:54 UTC (938 KB)
[v4]
Thu, 19 Sep 2024 00:09:40 UTC (950 KB)
[v5]
Wed, 20 Nov 2024 08:54:05 UTC (2,707 KB)


