Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration

stp2y, November 1, 2024

[Submitted on 30 Sep 2024 (v1), last revised 31 Oct 2024 (this version, v3)]

By Kaihang Pan and 8 other authors

Abstract: The swift advancement of Multimodal LLMs (MLLMs) presents significant challenges for effective knowledge editing. Current methods, including intrinsic knowledge editing and external knowledge resorting, each have strengths and weaknesses and struggle to balance the desired properties of reliability, generality, and locality when applied to MLLMs. In this paper, we propose UniKE, a novel multimodal editing method that establishes a unified perspective and paradigm for intrinsic knowledge editing and external knowledge resorting. Both types of knowledge are conceptualized as vectorized key-value memories, with the corresponding editing processes resembling the assimilation and accommodation phases of human cognition, conducted at the same semantic levels. Within this unified framework, we further promote knowledge collaboration by disentangling the knowledge representations into the semantic and truthfulness spaces. Extensive experiments validate the effectiveness of our method, which ensures that the post-edit MLLM simultaneously maintains excellent reliability, generality, and locality. The code for UniKE is available at this https URL.
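
To make the phrase "vectorized key-value memories" concrete, here is a minimal toy sketch in plain Python. It is not UniKE's implementation: the class `KeyValueMemory`, the softmax-weighted read, and all vectors are assumptions chosen purely for illustration. It only shows the general idea that storing one new key-value pair can change the answer for a matching query (loosely, reliability) while leaving an unrelated query almost untouched (loosely, locality).

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class KeyValueMemory:
    """Toy key-value memory (hypothetical; illustrative only)."""

    def __init__(self):
        self.keys = []
        self.values = []

    def write(self, key, value):
        # Store one (key, value) pair; this stands in for an "edit".
        self.keys.append(list(key))
        self.values.append(list(value))

    def read(self, query):
        # Weight every stored value by how well its key matches the query.
        weights = softmax([dot(query, k) for k in self.keys])
        dim = len(self.values[0])
        return [
            sum(w * v[i] for w, v in zip(weights, self.values))
            for i in range(dim)
        ]


memory = KeyValueMemory()
memory.write([1.0, 0.0], [1.0, 0.0])  # pre-existing knowledge A
memory.write([0.0, 1.0], [0.0, 1.0])  # pre-existing knowledge B

edited_query = [4.0, 0.0]     # query the edit is meant to affect
unrelated_query = [0.0, 4.0]  # query the edit should leave alone

before = memory.read(edited_query)
unrelated_before = memory.read(unrelated_query)

# Edit: add a pair whose key matches edited_query more strongly than A does.
memory.write([2.0, 0.0], [0.0, -1.0])

after = memory.read(edited_query)
unrelated_after = memory.read(unrelated_query)

print("edited query:   ", before, "->", after)
print("unrelated query:", unrelated_before, "->", unrelated_after)
```

In this toy setup the edited query's output moves from roughly the first stored value to roughly the newly written one, while the unrelated query's output shifts only slightly. How UniKE actually represents, writes, and combines intrinsic and external knowledge is described in the paper, not here.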

Submission history

From: Kaihang Pan
[v1] Mon, 30 Sep 2024 02:13:53 UTC (1,961 KB)
[v2] Tue, 1 Oct 2024 07:34:25 UTC (1,961 KB)
[v3] Thu, 31 Oct 2024 02:29:45 UTC (1,969 KB)
