Viral News

A dataset of questions on decision-theoretic reasoning in Newcomb-like problems

[Submitted on 15 Nov 2024] By Caspar Oesterheld, Emery Cooper, Miles Kodama, Linh Chi Nguyen, and Ethan Perez. Abstract: We introduce a dataset of natural-language questions in the decision theory of so-called Newcomb-like problems. Newcomb-like problems include, for instance, decision problems in which an agent interacts with a similar other agent, and thus has to reason about the fact that the other agent will likely reason in similar ways. Evaluating LLM reasoning about Newcomb-like problems is important because interactions…
SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers

arXiv:2411.10510v1. Abstract: Diffusion Transformers (DiT) have emerged as powerful generative models for various tasks, including image, video, and speech synthesis. However, their inference process remains computationally expensive due to the repeated evaluation of resource-intensive attention and feed-forward modules. To address this, we introduce SmoothCache, a model-agnostic inference acceleration technique for DiT architectures. SmoothCache leverages the observed high similarity between layer outputs across adjacent diffusion timesteps. By analyzing layer-wise representation errors from a small calibration set, SmoothCache adaptively caches and reuses key features during inference. Our experiments demonstrate that SmoothCache achieves an 8% to 71% speedup while maintaining…
USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

arXiv:2411.10504v1. Abstract: Spike cameras, innovative neuromorphic cameras that capture scenes as a 0-1 bit stream at 40 kHz, are increasingly employed for 3D reconstruction via Neural Radiance Fields (NeRF) or 3D Gaussian Splatting (3DGS). Previous spike-based 3D reconstruction approaches often employ a cascaded pipeline: starting with high-quality image reconstruction from spike streams based on established spike-to-image reconstruction algorithms, then progressing to camera pose estimation and 3D reconstruction. However, this cascaded approach suffers from substantial cumulative errors, where quality limitations of initial image reconstructions negatively impact pose estimation, ultimately degrading the fidelity of the…
SAM Decoding: Speculative Decoding via Suffix Automaton

arXiv:2411.10666v1. Abstract: Large Language Models (LLMs) have revolutionized natural language processing by unifying tasks into text generation, yet their large parameter sizes and autoregressive nature limit inference speed. SAM-Decoding addresses this by introducing a novel retrieval-based speculative decoding method that uses a suffix automaton for efficient and accurate draft generation. Unlike the n-gram matching used by existing methods, SAM-Decoding finds the longest suffix match in both the generated text and the text corpus, achieving an average time complexity of $O(1)$ per generation step. SAM-Decoding constructs static and dynamic suffix automatons for the text corpus and input prompts, respectively, enabling fast…
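To make the longest-suffix-match idea concrete, here is a minimal sketch of a textbook suffix automaton in Python. It is not the paper's implementation: the class name `SuffixAutomaton`, the method `longest_suffix_match`, and the use of single characters as tokens are all assumptions made for illustration, and the draft-generation step that follows a match is omitted.

```python
class SuffixAutomaton:
    """Textbook suffix automaton over a sequence (characters stand in for tokens)."""

    def __init__(self, text):
        # Per state: outgoing transitions, suffix link, longest substring length.
        self.next = [{}]
        self.link = [-1]
        self.length = [0]
        self.last = 0
        for ch in text:
            self._extend(ch)

    def _extend(self, ch):
        # Standard online construction: add one symbol to the indexed text.
        cur = len(self.next)
        self.next.append({})
        self.length.append(self.length[self.last] + 1)
        self.link.append(-1)
        p = self.last
        while p != -1 and ch not in self.next[p]:
            self.next[p][ch] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][ch]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split state q by cloning it with a shorter length.
                clone = len(self.next)
                self.next.append(dict(self.next[q]))
                self.length.append(self.length[p] + 1)
                self.link.append(self.link[q])
                while p != -1 and self.next[p].get(ch) == q:
                    self.next[p][ch] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur

    def longest_suffix_match(self, query):
        """Length of the longest suffix of `query` that occurs in the indexed text."""
        state, matched = 0, 0
        for ch in query:
            # Fall back along suffix links until `ch` can be consumed.
            while state != 0 and ch not in self.next[state]:
                state = self.link[state]
                matched = self.length[state]
            if ch in self.next[state]:
                state = self.next[state][ch]
                matched += 1
            else:
                matched = 0
        return matched


corpus_index = SuffixAutomaton("abcbc")
print(corpus_index.longest_suffix_match("xxbcb"))
```

Because the matching state is carried forward from one symbol to the next, each newly generated symbol costs amortized constant work in this sketch, which is the property the abstract highlights.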
RedTest: Towards Measuring Redundancy in Deep Neural Networks Effectively

arXiv:2411.10507v1. Abstract: Deep learning has revolutionized computing in many real-world applications, arguably due to its remarkable performance and extreme convenience as an end-to-end solution. However, deep learning models can be costly to train and to use, especially large-scale models, making it necessary to optimize the original overly complicated models into smaller ones in scenarios with limited resources such as mobile applications or simply for resource saving. The key question in such model optimization is: how can we effectively identify and measure the redundancy in a deep learning model structure? While several common metrics exist in…
Everything is a Video: Unifying Modalities through Next-Frame Prediction

arXiv:2411.10503v1. Abstract: Multimodal learning, which involves integrating information from various modalities such as text, images, audio, and video, is pivotal for numerous complex tasks like visual question answering, cross-modal retrieval, and caption generation. Traditional approaches rely on modality-specific encoders and late fusion techniques, which can hinder scalability and flexibility when adapting to new tasks or modalities. To address these limitations, we introduce a novel framework that extends the concept of task reformulation beyond natural language processing (NLP) to multimodal learning. We propose to reformulate diverse multimodal tasks into a unified next-frame prediction problem, allowing a single model…
Gender Bias Mitigation for Bangla Classification Tasks

arXiv:2411.10636v1. Abstract: In this study, we investigate gender bias in Bangla pretrained language models, a largely underexplored area in low-resource languages. To assess this bias, we applied gender-name swapping techniques to existing datasets, creating four manually annotated, task-specific datasets for sentiment analysis, toxicity detection, hate speech detection, and sarcasm detection. By altering names and gender-specific terms, we ensured these datasets were suitable for detecting and mitigating gender bias. We then proposed a joint loss optimization technique to mitigate gender bias across task-specific pretrained models. Our approach was evaluated against existing bias mitigation methods, with results showing…
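As a rough illustration of what gender-name swapping can look like, the sketch below swaps entries from a small bidirectional lookup table. The table `SWAP_PAIRS`, the function `swap_gender_terms`, and the English placeholder words are all hypothetical; the study's actual Bangla lexicon and manual annotation process are not shown in the abstract, and real Bangla text would likely need a proper tokenizer rather than the simple `\w+` pattern used here.

```python
import re

# Hypothetical swap table with English placeholders (not the study's lexicon).
SWAP_PAIRS = [("he", "she"), ("brother", "sister"), ("rahim", "fatema")]

# Build a bidirectional lookup so each term maps to its counterpart.
SWAP = {}
for first, second in SWAP_PAIRS:
    SWAP[first] = second
    SWAP[second] = first


def swap_gender_terms(text):
    """Return a counterfactual copy of `text` with listed terms swapped."""

    def replace(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word  # Not in the table: leave the word untouched.
        # Preserve an initial capital letter from the original word.
        return swapped.capitalize() if word[0].isupper() else swapped

    return re.sub(r"\w+", replace, text)


original = "Rahim said he misses the brother."
counterfactual = swap_gender_terms(original)
print(counterfactual)
```

Comparing a model's prediction on `original` with its prediction on `counterfactual` is one simple way to probe for gender sensitivity; with this table, applying the swap twice returns the original sentence.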
Preparing Finance Data for AI: A 5-Step Data Cleansing Checklist

AI implementation is a common practice for financial organizations looking for predictive analytics to enhance their decision-making and minimize business risks. However, the integrity of the finance data used to train AI/ML models plays an important role in ensuring the reliability of their outcomes. This is because AI algorithms need an immense amount of data to learn, evolve, and perform the desired actions. Any discrepancies in the input data result in flawed insights, inaccurate financial forecasting, and misguided business decisions. In the worst case, the entire AI/ML model might go down in flames if the training data is of poor…
ARNN: Attentive Recurrent Neural Network for Multi-channel EEG Signals to Identify Epileptic Seizures

[Submitted on 5 Mar 2024 (v1), last revised 18 Nov 2024 (v2)] By Salim Rukhsar and Anil Kumar Tiwari. Abstract: Electroencephalography (EEG) is a widely used tool for diagnosing brain disorders due to its high temporal resolution, non-invasive nature, and affordability. Manual analysis of EEG is labor-intensive and requires expertise, making automatic EEG interpretation crucial for reducing workload and accurately assessing seizures. In epilepsy diagnosis, prolonged EEG monitoring generates extensive data, often spanning hours, days,…
Diffusion-Based Semantic Segmentation of Lumbar Spine MRI Scans of Lower Back Pain Patients

arXiv:2411.10755v1. Abstract: This study introduces a diffusion-based framework for robust and accurate segmentation of vertebrae, intervertebral discs (IVDs), and the spinal canal from Magnetic Resonance Imaging (MRI) scans of patients with low back pain (LBP), regardless of whether the scans are T1- or T2-weighted. The results showed that SpineSegDiff achieved comparable results to, or outperformed, non-diffusion state-of-the-art models in the identification of degenerated IVDs. Our findings highlight the potential of diffusion models to improve LBP diagnosis and management through precise spine MRI analysis.