Improved Operator Learning by Orthogonal Attention



By Zipeng Xiao and 4 other authors

Abstract: Neural operators, as efficient surrogate models for learning the solutions of PDEs, have received extensive attention in the field of scientific machine learning. Among them, attention-based neural operators have become one of the mainstream approaches in related research. However, existing approaches overfit the limited training data due to the considerable number of parameters in the attention mechanism. To address this, we develop an orthogonal attention based on the eigendecomposition of the kernel integral operator and the neural approximation of eigenfunctions. The orthogonalization naturally imposes a proper regularization effect on the resulting neural operator, which aids in resisting overfitting and boosting generalization. Experiments on six standard neural operator benchmark datasets, comprising both regular and irregular geometries, show that our method outperforms competing baselines by decent margins.
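To make the core idea concrete, below is a minimal NumPy sketch of an orthogonal-attention-style layer. It is an illustration under stated assumptions, not the paper's implementation: learned features stand in for neural eigenfunction evaluations, and a QR factorization supplies the orthonormalization, so the layer reduces to a kernel integral expanded in orthonormal modes.

```python
import numpy as np

rng = np.random.default_rng(0)

def orthogonal_attention(feats, values, n_modes=8):
    """Illustrative sketch (hypothetical names, not the paper's code).

    feats:  (N, d) learned features at N discretization points,
            standing in for eigenfunction evaluations psi_k(x_i).
    values: (N, c) input function values f(x_i).
    """
    # Orthonormalize the feature columns via QR so they behave like
    # discretized orthonormal eigenfunctions: Q^T Q = I.
    Q, _ = np.linalg.qr(feats[:, :n_modes])   # (N, n_modes)
    # Kernel integral: project the input onto the modes, then expand back.
    coeffs = Q.T @ values                      # (n_modes, c) spectral coefficients
    return Q @ coeffs                          # (N, c) reconstructed output

N, d, c = 64, 16, 1
feats = rng.standard_normal((N, d))
vals = rng.standard_normal((N, c))
out = orthogonal_attention(feats, vals)
```

Because the layer is an orthogonal projection onto the span of the modes, applying it twice with the same features returns the same output, which hints at the regularization effect: the layer cannot inflate components outside its learned mode subspace.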

Submission history

From: Zipeng Xiao [view email]
[v1]
Thu, 19 Oct 2023 05:47:28 UTC (977 KB)
[v2]
Mon, 23 Oct 2023 07:41:22 UTC (596 KB)
[v3]
Thu, 4 Jul 2024 07:20:40 UTC (588 KB)
[v4]
Thu, 26 Dec 2024 07:56:34 UTC (659 KB)


