Correcting misinformation on social media with a large language model

By Xinyi Zhou and 3 other authors


Abstract: Real-world misinformation, which is often multimodal, can be partially or even fully factual yet misleading, using diverse tactics such as conflating correlation with causation. Such misinformation is severely understudied, challenging to address, and harmful across social domains, particularly on social media, where it can spread rapidly. High-quality and timely correction that identifies and explains a post's (in)accuracies effectively reduces false beliefs. Although manual correction is widely accepted, it is difficult to make timely and scalable. Large language models (LLMs) have versatile capabilities that could accelerate misinformation correction, but they struggle with a lack of recent information, a tendency to produce false content, and limitations in handling multimodal information. We propose MUSE, an LLM augmented with access to, and credibility evaluation of, up-to-date information. By retrieving evidence as refutations or supporting context, MUSE identifies and explains content (in)accuracies with references. It conducts multimodal retrieval and interprets visual content to verify and correct multimodal posts. Given the absence of a comprehensive evaluation approach, we propose 13 dimensions of misinformation-correction quality. Fact-checking experts then evaluate responses to social media content that is not presupposed to be misinformation but broadly includes (partially) incorrect and correct posts that may or may not be misleading. Results demonstrate MUSE's ability to write high-quality responses to potential misinformation, across modalities, tactics, domains, and political leanings, and for information that has not previously been fact-checked online, within minutes of its appearance on social media. Overall, MUSE outperforms GPT-4 by 37% and even high-quality responses from laypeople by 29%. Our work provides a general methodological and evaluative framework for correcting misinformation at scale.
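
The abstract describes MUSE's pipeline only at a high level: retrieve up-to-date evidence, weigh source credibility, and have the LLM explain a post's (in)accuracies with references. The following is a minimal, hypothetical Python sketch of such a retrieval-augmented correction loop; the function names (`search_evidence`, `credibility_score`, `generate_correction`-style stubs), the data structures, and the prompt wording are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of a retrieval-augmented misinformation-correction loop,
# loosely following the abstract's description of MUSE. All names and prompts
# below are illustrative assumptions, not the paper's implementation.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    source: str         # e.g., a news-outlet domain
    text: str           # retrieved passage
    credibility: float  # 0.0 (untrusted) to 1.0 (highly trusted)


def rank_evidence(evidence: List[Evidence], k: int = 3) -> List[Evidence]:
    """Keep only the k most credible pieces of retrieved evidence."""
    return sorted(evidence, key=lambda e: e.credibility, reverse=True)[:k]


def build_correction_prompt(post: str, evidence: List[Evidence]) -> str:
    """Assemble an LLM prompt asking for an explained, referenced correction."""
    refs = "\n".join(f"[{i + 1}] ({e.source}) {e.text}" for i, e in enumerate(evidence))
    return (
        "You are a fact-checking assistant.\n"
        f"Social media post:\n{post}\n\n"
        f"Retrieved evidence:\n{refs}\n\n"
        "Identify which claims are accurate or inaccurate, explain why, "
        "and cite the evidence by number. If the post is accurate, say so."
    )


def correct_post(post: str,
                 retrieve: Callable[[str], List[Evidence]],
                 llm: Callable[[str], str]) -> str:
    """End-to-end sketch: retrieve, filter by credibility, then generate."""
    evidence = rank_evidence(retrieve(post))
    return llm(build_correction_prompt(post, evidence))


if __name__ == "__main__":
    # Stub retriever and LLM so the sketch runs without external services.
    def fake_retrieve(query: str) -> List[Evidence]:
        return [Evidence("example.org",
                         "Correlation between X and Y does not imply causation.",
                         0.9)]

    def fake_llm(prompt: str) -> str:
        return "The post conflates correlation with causation; see [1]."

    print(correct_post("X causes Y because they rose together.",
                       fake_retrieve, fake_llm))
```

In an actual system, the stub retriever would be replaced by multimodal web search over recent sources and the stub LLM by a model call; the key design point the abstract emphasizes is filtering retrieved evidence by source credibility before grounding the correction in it.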

Submission history

From: Xinyi Zhou
[v1] Sun, 17 Mar 2024 10:59:09 UTC (28,311 KB)
[v2] Sat, 6 Apr 2024 08:49:31 UTC (32,292 KB)
[v3] Tue, 30 Apr 2024 20:03:13 UTC (30,772 KB)
[v4] Tue, 3 Sep 2024 05:51:40 UTC (36,927 KB)


