MotionBridge: Dynamic Video Inbetweening with Flexible Controls

By Maham Tanveer and 6 other authors

Abstract: By generating plausible and smooth transitions between two image frames, video inbetweening is an essential tool for video editing and long video synthesis. Traditional methods cannot generate complex, large motions. While recent video generation techniques are powerful in creating high-quality results, they often lack fine control over the details of intermediate frames, which can lead to results that do not match the creator's intent. We introduce MotionBridge, a unified video inbetweening framework that allows flexible controls, including trajectory strokes, keyframes, masks, guide pixels, and text. However, learning such multi-modal controls in a unified framework is a challenging task. We thus design two generators to extract the control signal faithfully and encode features through dual-branch embedders to resolve ambiguities. We further introduce a curriculum training strategy to smoothly learn the various controls. Extensive qualitative and quantitative experiments demonstrate that such multi-modal controls enable a more dynamic, customizable, and contextually accurate visual narrative.

Submission history

From: Maham Tanveer
[v1] Tue, 17 Dec 2024 18:59:33 UTC (13,270 KB)
[v2] Mon, 23 Dec 2024 07:19:04 UTC (13,270 KB)
[v3] Tue, 7 Jan 2025 22:06:07 UTC (13,271 KB)


