Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image

AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning


View a PDF of the paper titled Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image, by Jianshun Zeng and 4 other authors

View PDF

Abstract:Street-view image has been widely applied as a crucial mobile mapping data source. The inpainting of street-view images is a critical step for street-view image processing, not only for the privacy protection, but also for the urban environment mapping applications. This paper presents a novel Deep Neural Network (DNN), multi-scale semantic prior Feature guided image inpainting Network (MFN) for inpainting street-view images, which generate static street-view images without moving objects (e.g., pedestrians, vehicles). To enhance global context understanding, a semantic prior prompter is introduced to learn rich semantic priors from large pre-trained model. We design the prompter by stacking multiple Semantic Pyramid Aggregation (SPA) modules, capturing a broad range of visual feature patterns. A semantic-enhanced image generator with a decoder is proposed that incorporates a novel cascaded Learnable Prior Transferring (LPT) module at each scale level. For each decoder block, an attention transfer mechanism is applied to capture long-term dependencies, and the semantic prior features are fused with the image features to restore plausible structure in an adaptive manner. Additionally, a background-aware data processing scheme is adopted to prevent the generation of hallucinated objects within holes. Experiments on Apolloscapes and Cityscapes datasets demonstrate better performance than state-of-the-art methods, with MAE, and LPIPS showing improvements of about 9.5% and 41.07% respectively. Visual comparison survey among multi-group person is also conducted to provide performance evaluation, and the results suggest that the proposed MFN offers a promising solution for privacy protection and generate more reliable scene for urban applications with street-view images.

Submission history

From: Jianshun Zeng [view email]
[v1]
Fri, 17 May 2024 03:02:18 UTC (13,198 KB)
[v2]
Wed, 18 Sep 2024 23:11:59 UTC (1,620 KB)



Source link
lol

By stp2y

Leave a Reply

Your email address will not be published. Required fields are marked *

No widgets found. Go to Widget page and add the widget in Offcanvas Sidebar Widget Area.