Assessing Adversarial Robustness of Large Language Models: An Empirical Study


Abstract: Large Language Models (LLMs) have revolutionized natural language processing, but their robustness against adversarial attacks remains a critical concern. We present a novel white-box-style attack approach that exposes vulnerabilities in leading open-source LLMs, including Llama, OPT, and T5. We assess the impact of model size, structure, and fine-tuning strategies on their resistance to adversarial perturbations. Our comprehensive evaluation across five diverse text classification tasks establishes a new benchmark for LLM robustness. The findings of this study have far-reaching implications for the reliable deployment of LLMs in real-world applications and contribute to the advancement of trustworthy AI systems.
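
To make the idea of a white-box attack on an LLM classifier concrete, here is a minimal illustrative sketch of a generic gradient-guided token-substitution attack (HotFlip-style). It is not the paper's exact method; the model name, the untrained classification head, and the greedy single-token substitution are all assumptions for demonstration only.

```python
# Illustrative sketch of a white-box, gradient-guided token substitution.
# NOT the attack from the paper; model choice and details are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "facebook/opt-125m"  # assumed stand-in for one of the evaluated LLM families
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Classification head here is randomly initialized; in practice it would be fine-tuned.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
model.eval()

def gradient_guided_substitution(text: str, label: int) -> str:
    """Swap the single most influential token for the replacement that a
    first-order estimate says most increases the classification loss."""
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"]

    # Run the embedding layer separately so gradients reach the token embeddings.
    embed_layer = model.get_input_embeddings()
    embeds = embed_layer(input_ids).detach().requires_grad_(True)
    loss = model(inputs_embeds=embeds,
                 attention_mask=enc["attention_mask"],
                 labels=torch.tensor([label])).loss
    loss.backward()

    grad = embeds.grad[0]        # (seq_len, hidden): white-box access to gradients
    vocab = embed_layer.weight   # (vocab_size, hidden)

    # First-order change in loss for replacing token i with any vocabulary token:
    # grad_i . (e_new - e_old)
    scores = grad @ vocab.T - (grad * embeds[0]).sum(-1, keepdim=True)
    best_per_pos, best_tok = scores.max(dim=-1)
    pos = best_per_pos.argmax().item()  # position with the largest estimated gain

    adv_ids = input_ids.clone()
    adv_ids[0, pos] = best_tok[pos]
    return tokenizer.decode(adv_ids[0], skip_special_tokens=True)

print(gradient_guided_substitution("The movie was surprisingly good.", label=1))
```

The key point the sketch captures is the white-box assumption: the attacker can backpropagate through the model to score candidate perturbations, rather than probing the model as a black box.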

Submission history

From: Xiaochen Zheng
[v1] Sat, 4 May 2024 22:00:28 UTC (522 KB)
[v2] Thu, 12 Sep 2024 22:18:03 UTC (459 KB)
