PreciseCam: Precise Camera Control for Text-to-Image Generation


[Submitted on 22 Jan 2025]

Authors: Edurne Bernal-Berdun and 6 other authors


Abstract: Images as an artistic medium often rely on specific camera angles and lens distortions to convey ideas or emotions; however, such precise control is missing in current text-to-image models. We propose an efficient and general solution that allows precise control over the camera when generating both photographic and artistic images. Unlike prior methods that rely on predefined shots, we rely solely on four simple extrinsic and intrinsic camera parameters, removing the need for pre-existing geometry, reference 3D objects, and multi-view data. We also present a novel dataset with more than 57,000 images, along with their text prompts and ground-truth camera parameters. Our evaluation shows precise camera control in text-to-image generation, surpassing traditional prompt engineering approaches. Our data, model, and code are publicly available at this https URL.

Submission history

From: Edurne Bernal-Berdun
[v1]
Wed, 22 Jan 2025 14:37:01 UTC (48,935 KB)


