Adversarial Detection by Approximation of Ensemble Boundary, by T. Windeatt
Abstract: Despite being effective in many application areas, Deep Neural Networks (DNNs) are vulnerable to attack. In object recognition, the attack takes the form of a small perturbation added to an image that causes the DNN to misclassify, yet appears no different to a human. Adversarial attacks lead to defences that are themselves subject to attack, and these attack/defence strategies provide important information about the properties of DNNs. In this paper, a novel method of detecting adversarial attacks is proposed for an ensemble of DNNs solving two-class pattern recognition problems. The ensemble is combined using Walsh coefficients, which are capable of approximating Boolean functions and thereby controlling the complexity of the decision boundary. The hypothesis in this paper is that decision boundaries with high curvature allow adversarial perturbations to be found, but that the perturbations also change the curvature of the decision boundary, so that the boundary is approximated differently by the Walsh coefficients for attacked images than for clean ones. Besides controlling boundary complexity, the coefficients also measure the correlation with class labels, which may aid in understanding the learning and transferability properties of DNNs. While the experiments here use images, the proposed approach of modelling two-class ensemble decision boundaries could in principle be applied to any application area.
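To make the combining idea concrete, the following is a minimal illustrative sketch in Python, not the paper's exact procedure. It estimates the Walsh (Walsh-Hadamard) coefficients of the Boolean function mapping the binary decisions of the ensemble members to the two-class label, by correlating the label with parity functions over subsets of members. The function name `walsh_coefficients`, the correlation-based estimator, and the synthetic data in the usage example are all assumptions made for illustration.

```python
import numpy as np

def walsh_coefficients(binary_outputs, labels):
    """Estimate Walsh coefficients of the Boolean function that maps
    ensemble member decisions to the class label.

    binary_outputs : (n_samples, n_members) array of 0/1 member decisions
    labels         : (n_samples,) array of 0/1 class labels
    """
    n_samples, n_members = binary_outputs.shape
    # Map {0,1} -> {+1,-1} so each Walsh basis function is a parity product.
    x = 1 - 2 * binary_outputs.astype(int)
    y = 1 - 2 * labels.astype(int)
    coeffs = np.empty(2 ** n_members)
    for s in range(2 ** n_members):
        # Bits of s select the subset of ensemble members for this basis function.
        mask = np.array([(s >> i) & 1 for i in range(n_members)], dtype=bool)
        if mask.any():
            basis = np.prod(x[:, mask], axis=1)  # parity of the selected members
        else:
            basis = np.ones(n_samples)           # constant (zeroth-order) term
        # Each coefficient is the correlation of the label with that parity;
        # first-order coefficients measure per-member correlation with the label.
        coeffs[s] = np.mean(y * basis)
    return coeffs

# Usage on synthetic data (5 ensemble members, 200 samples):
rng = np.random.default_rng(0)
outputs = rng.integers(0, 2, size=(200, 5))
labels = rng.integers(0, 2, size=200)
w = walsh_coefficients(outputs, labels)
print(w[:8])  # low-order coefficients
```

Under the detection hypothesis above, one would compare the coefficient pattern obtained on clean inputs with that obtained on suspected adversarial inputs, since a perturbed decision boundary is approximated by a different set of coefficients.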
Submission history
From: Terry Windeatt
[v1] Fri, 18 Nov 2022 13:26:57 UTC (920 KB)
[v2] Mon, 28 Nov 2022 16:39:49 UTC (667 KB)
[v3] Tue, 13 Dec 2022 16:03:26 UTC (617 KB)
[v4] Wed, 24 Jan 2024 11:38:25 UTC (414 KB)
[v5] Fri, 10 Jan 2025 16:08:47 UTC (674 KB)