RESTOR: Knowledge Recovery through Machine Unlearning

stp2y, January 6, 2025

[Submitted on 31 Oct 2024 (v1), last revised 2 Jan 2025 (this version, v2)]

Authors: Keivan Rezaei and 5 other authors

Abstract: Large language models trained on web-scale corpora can memorize undesirable datapoints such as incorrect facts, copyrighted content, or sensitive data. Recently, many machine unlearning algorithms have been proposed that aim to ‘erase’ these datapoints from trained models, that is, revert model behavior to be similar to a model that had never been trained on these datapoints. However, evaluating the success of unlearning algorithms remains an open challenge. In this work, we propose the RESTOR framework for machine unlearning, which evaluates the ability of unlearning algorithms to perform targeted data erasure from models, by evaluating the ability of models to forget the knowledge introduced in these datapoints, while simultaneously recovering the model’s knowledge state had it not encountered these datapoints. RESTOR helps uncover several novel insights about popular unlearning algorithms and the mechanisms through which they operate, for instance, identifying that some algorithms merely emphasize forgetting, and that localizing unlearning targets can enhance unlearning performance.
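
The abstract separates forgetting the introduced knowledge from recovering the knowledge state the model would have had without it. As a rough illustration only, the sketch below scores three hypothetical models (clean, corrupted, unlearned) on the same factual questions and reports how much of the accuracy lost to corruption the unlearned model wins back. The names `accuracy` and `recovery_ratio`, the exact-match scoring, and the toy lookup-table "models" are all assumptions made for this example; they are not taken from the paper and may differ from the metrics RESTOR actually uses.

```python
from typing import Callable, Optional, Sequence


def accuracy(answer_fn: Callable[[str], Optional[str]],
             questions: Sequence[str],
             gold: Sequence[str]) -> float:
    """Fraction of questions answered with an exact (case-insensitive) match."""
    if not questions or len(questions) != len(gold):
        raise ValueError("questions and gold must be non-empty and equal length")
    correct = 0
    for question, expected in zip(questions, gold):
        answer = answer_fn(question) or ""
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    return correct / len(questions)


def recovery_ratio(clean_acc: float,
                   corrupted_acc: float,
                   unlearned_acc: float) -> float:
    """Hypothetical score: share of the corruption-induced accuracy drop
    that unlearning recovers (0 = nothing recovered, 1 = back to clean)."""
    gap = clean_acc - corrupted_acc
    if gap <= 0:
        raise ValueError("corruption did not reduce accuracy; ratio undefined")
    return (unlearned_acc - corrupted_acc) / gap


# Toy stand-ins for models: plain lookup tables, purely illustrative.
questions = ["capital of France?", "capital of Japan?"]
gold = ["Paris", "Tokyo"]

clean_model = {"capital of France?": "Paris", "capital of Japan?": "Tokyo"}.get
corrupted_model = {"capital of France?": "Lyon", "capital of Japan?": "Osaka"}.get
unlearned_model = {"capital of France?": "Paris", "capital of Japan?": "Osaka"}.get

clean_acc = accuracy(clean_model, questions, gold)
corrupted_acc = accuracy(corrupted_model, questions, gold)
unlearned_acc = accuracy(unlearned_model, questions, gold)

print(recovery_ratio(clean_acc, corrupted_acc, unlearned_acc))
```

In this toy setup, an algorithm that only suppresses the corrupted answers without restoring the correct ones would leave the ratio at zero, which is one way to picture the abstract's point that some algorithms merely emphasize forgetting.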

Submission history

From: Keivan Rezaei
[v1] Thu, 31 Oct 2024 20:54:35 UTC (1,858 KB)
[v2] Thu, 2 Jan 2025 20:36:44 UTC (3,589 KB)
