Generative Kaleidoscopic Networks

by Harsh Shrivastava

Abstract: We discovered that neural networks, especially deep ReLU networks, demonstrate an 'over-generalization' phenomenon. That is, the output values for inputs that were not seen during training are mapped close to the output range that was observed during the learning process. In other words, the neural network learns a many-to-one mapping, and this effect becomes more prominent as we increase the number of layers, or the depth, of the network. We utilize this property of neural networks to design a dataset kaleidoscope, termed 'Generative Kaleidoscopic Networks'. Succinctly, if we learn a model that maps an input $x \in \mathbb{R}^D$ to itself, $f_\mathcal{N}(x) \rightarrow x$, the proposed 'Kaleidoscopic sampling' procedure starts with random input noise $z \in \mathbb{R}^D$ and recursively applies $f_\mathcal{N}(\cdots f_\mathcal{N}(z) \cdots)$. After a burn-in period, we start observing samples from the input distribution, and the quality of the recovered samples improves as we increase the depth of the model. Scope: We observed this phenomenon to varying degrees for other deep learning architectures such as CNNs, Transformers, and U-Nets, and we are currently investigating them further.
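To make the procedure concrete, here is a minimal sketch of the two steps the abstract describes: fit a deep ReLU network to map inputs to themselves, then recursively apply it to random noise. It is written in PyTorch; the network width, depth, and all hyperparameters below are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of Kaleidoscopic sampling (illustrative, assumes PyTorch).
import torch
import torch.nn as nn

D = 64       # input dimensionality (assumed for this sketch)
DEPTH = 10   # deeper networks reportedly improve sample quality

# f_N: R^D -> R^D, a deep ReLU network trained to map inputs to themselves.
layers = []
for _ in range(DEPTH):
    layers += [nn.Linear(D, D), nn.ReLU()]
layers.append(nn.Linear(D, D))
f_N = nn.Sequential(*layers)

def train_identity(f_N, data, epochs=200, lr=1e-3):
    """Fit f_N(x) ~ x on the training data via an MSE reconstruction loss."""
    opt = torch.optim.Adam(f_N.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(f_N(data), data)
        loss.backward()
        opt.step()

@torch.no_grad()
def kaleidoscopic_sampling(f_N, num_steps=500, burn_in=100):
    """Start from noise z and recursively apply f_N(f_N(... f_N(z) ...)).
    Iterates after the burn-in period are treated as samples from the
    input distribution."""
    z = torch.randn(1, D)
    samples = []
    for step in range(num_steps):
        z = f_N(z)
        if step >= burn_in:
            samples.append(z.clone())
    return samples

# Usage: train on a dataset X of shape [N, D], then sample.
# X = ...  # your training data
# train_identity(f_N, X)
# samples = kaleidoscopic_sampling(f_N)
```

The sketch follows only what the abstract states: because a deep network over-generalizes, repeatedly feeding its own output back in contracts an arbitrary starting point toward the output range seen in training, which is why post-burn-in iterates resemble the data.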

Submission history

From: Harsh Shrivastava
[v1] Mon, 19 Feb 2024 02:48:40 UTC (5,914 KB)
[v2] Fri, 23 Feb 2024 22:13:53 UTC (7,448 KB)
[v3] Tue, 27 Feb 2024 03:18:55 UTC (7,448 KB)
[v4] Tue, 22 Oct 2024 04:15:49 UTC (7,451 KB)


