Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders

stp2y, December 25, 2024

[Submitted on 23 Dec 2024 (v1), last revised 24 Dec 2024 (this version, v2)]

By Rui Chen and 9 other authors

Abstract: Recent 3D content generation pipelines commonly employ Variational Autoencoders (VAEs) to encode shapes into compact latent representations for diffusion-based generation. However, the widely adopted uniform point sampling strategy in Shape VAE training often leads to a significant loss of geometric details, limiting the quality of shape reconstruction and downstream generation tasks. We present Dora-VAE, a novel approach that enhances VAE reconstruction through our proposed sharp edge sampling strategy and a dual cross-attention mechanism. By identifying and prioritizing regions with high geometric complexity during training, our method significantly improves the preservation of fine-grained shape features. This sampling strategy and the dual attention mechanism enable the VAE to focus on crucial geometric details that are typically missed by uniform sampling approaches. To systematically evaluate VAE reconstruction quality, we additionally propose Dora-bench, a benchmark that quantifies shape complexity through the density of sharp edges, introducing a new metric focused on reconstruction accuracy at these salient geometric features. Extensive experiments on Dora-bench demonstrate that Dora-VAE achieves comparable reconstruction quality to the state-of-the-art dense XCube-VAE while requiring a latent space at least 8$\times$ smaller (1,280 vs. >10,000 codes). We will release our code and benchmark dataset to facilitate future research in 3D shape modeling.
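
The abstract does not spell out how sharp edges are found or sampled, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes a triangle mesh with consistently oriented faces, flags an edge as sharp when the normals of its two adjacent faces differ by more than a threshold angle, and then samples points along those edges in proportion to edge length. The function names `sharp_edges` and `sample_on_edges` and the 30-degree default threshold are hypothetical choices made for illustration.

```python
import numpy as np

def sharp_edges(vertices, faces, angle_deg=30.0):
    """Return edges (vertex-index pairs) shared by two faces whose normals
    differ by more than angle_deg. The threshold is an illustrative assumption."""
    v = np.asarray(vertices, dtype=float)
    f = np.asarray(faces, dtype=int)
    # Unit normal per triangle (assumes no degenerate, zero-area faces).
    n = np.cross(v[f[:, 1]] - v[f[:, 0]], v[f[:, 2]] - v[f[:, 0]])
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    # Map each undirected edge to the faces that share it.
    edge_faces = {}
    for fi, (a, b, c) in enumerate(f):
        for e in ((a, b), (b, c), (c, a)):
            edge_faces.setdefault(tuple(sorted(e)), []).append(fi)
    cos_thresh = np.cos(np.radians(angle_deg))
    sharp = []
    for e, fs in edge_faces.items():
        # Boundary and non-manifold edges are skipped in this sketch.
        if len(fs) == 2 and np.dot(n[fs[0]], n[fs[1]]) < cos_thresh:
            sharp.append(e)
    return np.array(sorted(sharp), dtype=int).reshape(-1, 2)

def sample_on_edges(vertices, edges, num_points, rng=None):
    """Sample points on the given edges, picking edges in proportion to length."""
    rng = np.random.default_rng(rng)
    v = np.asarray(vertices, dtype=float)
    p0, p1 = v[edges[:, 0]], v[edges[:, 1]]
    lengths = np.linalg.norm(p1 - p0, axis=1)
    idx = rng.choice(len(edges), size=num_points, p=lengths / lengths.sum())
    t = rng.random((num_points, 1))  # position along each chosen edge
    return p0[idx] + t * (p1[idx] - p0[idx])
```

In a training pipeline, points sampled this way would typically be mixed with uniformly sampled surface points so that flat regions are still covered. The mixing ratio is another design choice that the abstract leaves open.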

Submission history

From: Rui Chen
[v1] Mon, 23 Dec 2024 18:59:06 UTC (25,026 KB)
[v2] Tue, 24 Dec 2024 11:02:29 UTC (25,026 KB)
