An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture
C G Krishnanunni and Tan Bui-Thanh
Abstract: This work presents a two-stage adaptive framework for progressively developing deep neural network (DNN) architectures that generalize well for a given training data set. In the first stage, a layerwise training approach is adopted in which a new layer is added each time and trained independently by freezing the parameters in the previous layers. We impose desirable structures on the DNN by employing manifold regularization, sparsity regularization, and physics-informed terms. We introduce an epsilon-delta stability-promoting concept as a desirable property for a learning algorithm and show that employing manifold regularization yields an epsilon-delta stability-promoting algorithm. Further, we derive the necessary conditions for the trainability of a newly added layer and investigate the training saturation problem. In the second stage of the algorithm (post-processing), a sequence of shallow networks is employed to extract information from the residual produced in the first stage, thereby improving the prediction accuracy. Numerical investigations on prototype regression and classification problems demonstrate that the proposed approach can outperform fully connected DNNs of the same size. Moreover, by equipping the physics-informed neural network (PINN) with the proposed adaptive architecture strategy to solve partial differential equations, we numerically show that adaptive PINNs are not only superior to standard PINNs but also produce interpretable hidden layers with provable stability. We also apply our architecture design strategy to solve inverse problems governed by elliptic partial differential equations.
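The two-stage idea can be illustrated with a minimal sketch. The code below is an assumption-laden toy version, not the authors' exact algorithm: it grows a network one hidden layer at a time, training only the newly added layer and a fresh output head while all earlier layers are frozen (stage 1), and then fits a single shallow network to the residual of the stage-1 model (stage 2). The manifold, sparsity, and physics-informed regularization terms, the stability analysis, and the trainability conditions from the paper are all omitted; widths, epochs, and the toy data are arbitrary choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def fit(params, forward, x, y, epochs=200, lr=1e-2):
    """Minimize MSE over the given (trainable) parameters only."""
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(forward(x), y)
        loss.backward()
        opt.step()


def grow_layerwise(x, y, width=32, max_layers=4):
    """Stage 1 (sketch): add hidden layers one at a time, freezing earlier layers."""
    hidden = [nn.Sequential(nn.Linear(x.shape[1], width), nn.Tanh())]
    head = nn.Linear(width, y.shape[1])

    def forward(inp):
        h = inp
        for blk in hidden:
            h = blk(h)
        return head(h)

    # Train the first hidden layer together with the output head.
    fit(list(hidden[0].parameters()) + list(head.parameters()), forward, x, y)

    for _ in range(max_layers - 1):
        # Freeze every layer trained so far.
        for blk in hidden:
            for p in blk.parameters():
                p.requires_grad_(False)
        # Append a new hidden layer; only it and a fresh head are trained.
        new_blk = nn.Sequential(nn.Linear(width, width), nn.Tanh())
        hidden.append(new_blk)
        head = nn.Linear(width, y.shape[1])
        fit(list(new_blk.parameters()) + list(head.parameters()), forward, x, y)

    return forward


def residual_postprocess(forward, x, y, width=16):
    """Stage 2 (sketch): fit a shallow network to the residual of the stage-1 model."""
    with torch.no_grad():
        residual = y - forward(x)
    shallow = nn.Sequential(nn.Linear(x.shape[1], width), nn.Tanh(),
                            nn.Linear(width, y.shape[1]))
    fit(list(shallow.parameters()), shallow, x, residual)
    return lambda z: forward(z) + shallow(z)


# Toy regression example.
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = torch.sin(3 * x)
stage1 = grow_layerwise(x, y)
predict = residual_postprocess(stage1, x, y)
print("final MSE:", nn.MSELoss()(predict(x), y).item())
```

In the paper, each added layer is trained with additional regularization terms and the post-processing stage uses a sequence of shallow networks on successive residuals; the sketch above keeps only the freeze-then-extend and residual-correction structure.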
Submission history
From: Chandradath Girija Krishnanunni
[v1] Sun, 13 Nov 2022 09:51:16 UTC (3,871 KB)
[v2] Sun, 22 Sep 2024 17:13:37 UTC (17,644 KB)