Viral News

Sparse Input View Synthesis: 3D Representations and Reliable Priors

[Submitted on 20 Nov 2024] By Nagabhushan Somraj. Abstract: Novel view synthesis is the problem of synthesizing novel viewpoints of a scene given images from a few viewpoints. It is a fundamental problem in computer vision and graphics, enabling a wide variety of applications such as the metaverse, free-viewpoint watching of events, video gaming, video stabilization and video compression. Recent 3D representations such as radiance fields and multi-plane images significantly improve the quality of images rendered from novel viewpoints. However, these…
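The multi-plane image representation mentioned above renders a novel view by alpha-compositing a stack of fronto-parallel RGBA planes from back to front. A minimal single-pixel sketch of that compositing step (the plane colors and alphas are illustrative assumptions, not values from the paper):

```python
# Back-to-front alpha compositing of a multi-plane image (MPI) at one pixel.
# Each plane contributes color `color` with opacity `alpha`; nearer planes
# occlude farther ones. Planes are ordered from farthest to nearest.

def composite_mpi(planes):
    """planes: list of (color, alpha) from farthest to nearest; returns final color."""
    out = 0.0
    for color, alpha in planes:
        # The current plane replaces a fraction `alpha` of what lies behind it.
        out = alpha * color + (1.0 - alpha) * out
    return out

# Example: an opaque far plane partially covered by a half-transparent near plane.
pixel = composite_mpi([(0.2, 1.0), (0.9, 0.5)])
# 0.5 * 0.9 + 0.5 * 0.2 = 0.55
```

The same over-operator, applied per channel and per pixel, is what makes MPI rendering both fast and differentiable.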
Read More
Learning from “Silly” Questions Improves Large Language Models, But Only Slightly

Read More
Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe

[Submitted on 6 Jun 2024 (v1), last revised 21 Nov 2024 (this version, v2)] By Alicja Ziarko and 5 other authors. Abstract: Text embeddings are essential for many tasks, such as document retrieval, clustering, and semantic similarity assessment. In this paper, we study how to contrastively train text embedding models in a compute-optimal fashion, given a suite of pre-trained decoder-only language models. Our innovation is an algorithm that produces optimal configurations of model sizes, data quantities, and fine-tuning methods for…
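Contrastive training of text embedding models typically minimizes an InfoNCE-style loss with in-batch negatives: each query should score its own document above every other document in the batch. A pure-Python sketch of that objective (the toy embeddings and temperature are illustrative; the paper's compute-optimal recipe search is not shown):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(queries, docs, temperature=0.1):
    """In-batch contrastive loss: queries[i] should match docs[i];
    every other doc in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        # Similarity of query i to every document, scaled by temperature.
        logits = [dot(q, d) / temperature for d in docs]
        # Cross-entropy with the matching document as the positive class.
        log_z = math.log(sum(math.exp(l) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)

# Toy unit-norm embeddings: each query is closest to its own document.
qs = [[1.0, 0.0], [0.0, 1.0]]
ds = [[0.9, 0.1], [0.1, 0.9]]
loss = info_nce(qs, ds)
```

Swapping the document order makes every positive a mismatch, so the loss rises sharply; that gap is the training signal.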
Read More
Effective Message Hiding with Order-Preserving Mechanisms

[Submitted on 29 Feb 2024 (v1), last revised 21 Nov 2024 (this version, v4)] By Gao Yu and 2 other authors. Abstract: Message hiding, a technique that conceals secret message bits within a cover image, aims to achieve an optimal balance among message capacity, recovery accuracy, and imperceptibility. While convolutional neural networks have notably improved message capacity and imperceptibility, achieving high recovery accuracy remains challenging. This challenge arises because convolutional operations struggle to preserve the sequential order of message bits and to effectively address the…
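For contrast with the learned approach above, classical message hiding embeds bits in the least-significant bits of pixel values, which trivially preserves bit order at the cost of capacity and robustness. A textbook LSB baseline, not the paper's order-preserving mechanism:

```python
def embed_lsb(pixels, bits):
    """Write each message bit into the least-significant bit of one pixel."""
    assert len(bits) <= len(pixels)
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b  # clear the LSB, then set it to the bit
    return out

def extract_lsb(pixels, n):
    """Read the first n least-significant bits back, in their original order."""
    return [p & 1 for p in pixels[:n]]

cover = [200, 13, 77, 128, 54]   # illustrative 8-bit pixel values
message = [1, 0, 1, 1]
stego = embed_lsb(cover, message)
```

Each pixel changes by at most 1, which keeps the embedding imperceptible but limits capacity to one bit per pixel.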
Read More
SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Code Generation

[Submitted on 17 Nov 2024 (v1), last revised 21 Nov 2024 (this version, v3)] By Bin Xu, Yiguan Lin, Yinghao Li and Yang Gao. Abstract: Large language models demonstrate exceptional performance on simple code generation tasks but still face challenges in tackling complex problems. These challenges may stem from insufficient reasoning and problem-decomposition capabilities. To address this issue, we propose a reasoning-augmented data generation process, SRA-MCTS, which guides the model to autonomously generate high-quality intermediate…
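Monte Carlo Tree Search, the search procedure at the core of SRA-MCTS, repeatedly selects the most promising node by an upper-confidence bound, runs a rollout, and backs the reward up to the root. A generic UCT selection/backup sketch (the node structure and reward values are illustrative, not the paper's reasoning-tree implementation):

```python
import math

class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0  # sum of backed-up rewards

def ucb1(node, c=1.4):
    """Upper confidence bound: mean reward plus an exploration bonus."""
    if node.visits == 0:
        return float("inf")  # always try unvisited children first
    return (node.value / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def select(root):
    """Walk down the tree, always taking the child with the highest UCB."""
    node = root
    while node.children:
        node = max(node.children, key=ucb1)
    return node

def backup(node, reward):
    """Propagate a rollout reward from a leaf back to the root."""
    while node is not None:
        node.visits += 1
        node.value += reward
        node = node.parent

# Tiny demo: two children; the first happens to yield reward each time it is tried.
root = Node()
root.children = [Node(root), Node(root)]
for reward in (1.0, 0.0, 1.0, 1.0):
    leaf = select(root)
    backup(leaf, reward)
```

After four iterations the search has concentrated its visits on the rewarding child, which is exactly the exploitation/exploration trade-off UCB encodes.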
Read More
Robust Detection of Watermarks for Large Language Models Under Human Edits

arXiv:2411.13868v1 Announce Type: new Abstract: Watermarking offers an effective approach for distinguishing text generated by large language models (LLMs) from human-written text. However, the pervasive presence of human edits on LLM-generated text dilutes watermark signals, significantly degrading the detection performance of existing methods. In this paper, by modeling human edits through mixture model detection, we introduce a new method in the form of a truncated goodness-of-fit test for detecting watermarked text under human edits, which we refer to as Tr-GoF. We prove that the Tr-GoF test achieves optimality in robust detection of the Gumbel-max watermark in a certain asymptotic…
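The truncation idea behind a test like Tr-GoF can be illustrated simply: a sum-based detector aggregates a per-token pivotal statistic, and a truncated variant keeps only contributions above a threshold, so tokens diluted by human edits add nothing. A hedged sketch of that idea (the score function, threshold, and values below are illustrative stand-ins, not the paper's actual Tr-GoF statistic):

```python
def truncated_score(pivots, threshold=0.5):
    """Sum per-token watermark evidence, but only from tokens whose pivotal
    statistic exceeds `threshold`; tokens with weak evidence are ignored,
    which limits the damage a few human edits can do to the total."""
    return sum(p for p in pivots if p > threshold)

# Watermarked text skews pivotal statistics toward 1; human edits inject
# tokens whose statistics look uniform (illustrative values).
watermarked = [0.95, 0.9, 0.98, 0.92]
edited      = [0.95, 0.1, 0.98, 0.3]   # two tokens replaced by a human
unmarked    = [0.4, 0.2, 0.6, 0.1]
```

An untruncated sum would let the edited tokens drag the edited text toward the unmarked score; truncation preserves the separation.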
Read More
MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection

arXiv:2411.13628v1 Announce Type: new Abstract: Utilizing temporal information to improve the performance of 3D detection has recently made great progress in the field of autonomous driving. Traditional transformer-based temporal fusion methods suffer from quadratic computational cost and information decay as the length of the frame sequence increases. In this paper, we propose a novel method called MambaDETR, whose main idea is to implement temporal fusion in an efficient state space. Moreover, we design a Motion Elimination module to remove relatively static objects from temporal fusion. On the standard nuScenes benchmark, our proposed MambaDETR achieves remarkable results in the 3D…
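State space models like Mamba replace quadratic attention with a linear recurrence over a hidden state, so cost grows linearly with sequence length. A scalar sketch of that recurrence (the A, B, C coefficients and the input sequence are illustrative, not MambaDETR's learned parameters):

```python
def ssm_scan(inputs, A=0.9, B=0.5, C=1.0):
    """Discrete linear state space model:
       h_t = A * h_{t-1} + B * x_t,   y_t = C * h_t.
    Each step touches only the previous state, so a length-T sequence costs
    O(T), versus the O(T^2) pairwise interactions of attention."""
    h = 0.0
    ys = []
    for x in inputs:
        h = A * h + B * x
        ys.append(C * h)
    return ys

out = ssm_scan([1.0, 0.0, 0.0])
# The impulse response decays geometrically with A: 0.5, 0.45, 0.405
```

The geometric decay also illustrates the "information decay" concern: older frames fade unless the model learns coefficients that retain them.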
Read More
Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models

arXiv:2411.14103v1 Announce Type: new Abstract: In the recent past, a popular way of evaluating natural language understanding (NLU) was to consider a model's ability to perform natural language inference (NLI) tasks. In this paper, we investigate whether NLI tasks, which are rarely used for LLM evaluation, can still be informative for evaluating LLMs. Focusing on five different NLI benchmarks across six models of different scales, we investigate whether they are able to discriminate models of different sizes and quality and how their accuracies develop during training. Furthermore, we investigate the extent to which the softmax distributions of models align with…
Read More
Closing the Gap Between SGP4 and High-Precision Propagation via Differentiable Programming

[Submitted on 7 Feb 2024 (v1), last revised 20 Nov 2024 (this version, v5)] By Giacomo Acciarini and 2 other authors. Abstract: The Simplified General Perturbations 4 (SGP4) orbital propagation method is widely used to predict the positions and velocities of Earth-orbiting objects rapidly and reliably. Despite continuous refinement, SGP models still lack the precision of numerical propagators, which offer significantly smaller errors. This study presents dSGP4, a novel differentiable version of SGP4 implemented using PyTorch. By making SGP4…
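Making a propagator differentiable means derivatives of its outputs with respect to its inputs flow through every operation, which is what PyTorch's autograd provides for dSGP4. A toy forward-mode sketch using dual numbers on a stand-in "propagator" (the dynamics function is an illustrative placeholder, not SGP4):

```python
class Dual:
    """Forward-mode autodiff value: carries f(x) and f'(x) together."""
    def __init__(self, val, grad=0.0):
        self.val, self.grad = val, grad
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.grad * other.val + self.val * other.grad)  # product rule
    __rmul__ = __mul__
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.grad + other.grad)
    __radd__ = __add__

def toy_propagate(x0, steps=3, decay=0.99):
    """Stand-in for an orbital propagator: repeated smooth updates of a state."""
    x = x0
    for _ in range(steps):
        x = decay * x + 0.1 * x * x  # placeholder dynamics, NOT SGP4
    return x

# Seed the input derivative to 1 to get d(output)/d(input) alongside the value.
out = toy_propagate(Dual(0.5, 1.0))
```

Because the same code path runs on plain floats or `Dual` values, the gradient comes "for free", the property dSGP4 exploits (via reverse-mode autograd) for orbit determination and ML pipelines.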
Read More
Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models

[Submitted on 10 Oct 2023 (v1), last revised 21 Nov 2024 (this version, v4)] By Fei Shen and 5 other authors. Abstract: Recent work has showcased the significant potential of diffusion models for pose-guided person image synthesis. However, owing to the inconsistency in pose between the source and target images, synthesizing an image with a distinct pose, relying exclusively on the source image and target pose information, remains a formidable challenge. This paper presents Progressive Conditional Diffusion Models (PCDMs) that incrementally…
Read More