MC-LLaVA: Multi-Concept Personalized Vision-Language Model

By Ruichuan An and 10 other authors

Abstract: Current vision-language models (VLMs) show exceptional abilities across diverse tasks including visual question answering. To enhance user experience in practical applications, recent studies investigate VLM personalization to understand user-provided concepts. However, existing studies mainly focus on single-concept personalization, neglecting the existence and interplay of multiple concepts, which limits the real-world applicability of personalized VLMs. In this paper, we propose the first multi-concept personalization method named MC-LLaVA along with a high-quality multi-concept personalization dataset. Specifically, MC-LLaVA uses a joint training strategy incorporating multiple concepts in a single training step, allowing VLMs to perform accurately in multi-concept personalization. To reduce the cost of joint training, MC-LLaVA leverages visual token information for concept token initialization, yielding improved concept representation and accelerating joint training. To advance multi-concept personalization research, we further contribute a high-quality dataset. We carefully collect images from various movies that contain multiple characters and manually generate the multi-concept question-answer samples. Our dataset features diverse movie types and question-answer types. We conduct comprehensive qualitative and quantitative experiments to demonstrate that MC-LLaVA can achieve impressive multi-concept personalized responses, paving the way for VLMs to become better user-specific assistants. The code and dataset will be publicly available at this https URL.

Submission history

From: Ruichuan An
[v1] Mon, 18 Nov 2024 16:33:52 UTC (26,341 KB)
[v2] Thu, 5 Dec 2024 13:27:22 UTC (41,935 KB)


