Viral News

Faster Multi-GPU Training with PPLL: A Pipeline Parallelism Framework Leveraging Local Learning

A Novel Approach to Eliminating Hallucinations in Large Language Model-Assisted Causal Discovery

arXiv:2411.12759v1 (announce type: new). Abstract: The increasing use of large language models (LLMs) in causal discovery as a substitute for human domain experts highlights the need for optimal model selection. This paper presents the first hallucination survey of popular LLMs for causal discovery. We show that hallucinations exist when using LLMs in causal discovery, so the choice of LLM is important. We propose using Retrieval Augmented Generation (RAG) to reduce hallucinations when quality data is available. Additionally, we introduce a novel method employing multiple LLMs with an arbiter in a debate to audit edges in causal graphs, achieving a comparable…

Surface Flux Transport Modeling using Physics Informed Neural Networks

[Submitted on 3 Sep 2024 (v1), last revised 20 Nov 2024 (this version, v2)] By Jithu J Athalathil and 4 other authors. Abstract: Studying the magnetic field properties on the solar surface is crucial for understanding the solar and heliospheric activities, which in turn shape space weather in the solar system. Surface Flux Transport (SFT) modeling helps us to simulate and analyse the transport and evolution of magnetic flux on the solar surface, providing valuable insights into the mechanisms responsible for solar activity.…

Word-level Sign Language Recognition with Multi-stream Neural Networks Focusing on Local Regions and Skeletal Information

[Submitted on 30 Jun 2021 (v1), last revised 20 Nov 2024 (this version, v2)] By Mizuki Maruyama and 5 other authors. Abstract: Word-level sign language recognition (WSLR) has attracted attention because it is expected to overcome the communication barrier between people with speech impairment and those who can hear. In the WSLR problem, a method designed for action recognition has achieved state-of-the-art accuracy. Indeed, it sounds reasonable for an action recognition method to perform…

Leveraging Prior Experience: An Expandable Auxiliary Knowledge Base for Text-to-SQL

arXiv:2411.13244v1 (announce type: new). Abstract: Large Language Models (LLMs) exhibit impressive problem-solving skills across many tasks, but they still underperform compared to humans in various downstream applications, such as text-to-SQL. On the BIRD benchmark leaderboard, human performance achieves an accuracy of 92.96%, whereas the top-performing method reaches only 72.39%. Notably, these state-of-the-art (SoTA) methods predominantly rely on in-context learning to simulate human-like reasoning. However, they overlook a critical human skill: continual learning. Inspired by the educational practice of maintaining mistake notebooks during our formative years, we propose LPE-SQL (Leveraging Prior Experience: An Expandable Auxiliary Knowledge Base for Text-to-SQL), a novel…

Deriving Activation Functions via Integration

arXiv:2411.13010v1 (announce type: new). Abstract: Activation functions play a crucial role in introducing non-linearities to deep neural networks. We propose a novel approach to designing activation functions by focusing on their gradients and deriving the corresponding functions through integration. Our work introduces the Expanded Integral of the Exponential Linear Unit (xIELU), a trainable piecewise activation function derived by integrating trainable affine transformations applied on the ELU activation function. xIELU combines two key gradient properties: a trainable and linearly increasing gradient for positive inputs, similar to ReLU$^2$, and a trainable negative gradient flow for negative inputs, akin to xSiLU. Conceptually, xIELU…

An Integrated Approach to Robotic Object Grasping and Manipulation

arXiv:2411.13205v1 (announce type: cross). Abstract: In response to the growing challenges of manual labor and efficiency in warehouse operations, Amazon has embarked on a significant transformation by incorporating robotics to assist with various tasks. While a substantial number of robots have been successfully deployed for tasks such as item transportation within warehouses, the complex process of object picking from shelves remains a significant challenge. This project addresses the issue by developing an innovative robotic system capable of autonomously fulfilling a simulated order by efficiently selecting specific items from shelves. A distinguishing feature of the proposed robotic system is its capacity…

On the Implicit Relation Between Low-Rank Adaptation and Differential Privacy

[Submitted on 26 Sep 2024 (v1), last revised 19 Nov 2024 (this version, v3)] By Saber Malekmohammadi and 1 other author. Abstract: A significant approach in natural language processing involves large-scale pre-training of models on general-domain data, followed by their adaptation to specific tasks or domains. As models grow in size, fully fine-tuning all of their parameters becomes increasingly impractical. To address this, some methods for low-rank task adaptation of language models have been proposed, e.g., LoRA and FLoRA. These…

Generating Visual Stimuli from EEG Recordings using Transformer-encoder based EEG encoder and GAN

[Submitted on 15 Feb 2024 (v1), last revised 20 Nov 2024 (this version, v2)] By Rahul Mishra and 1 other author. Abstract: In this study, we tackle a modern research challenge within the field of perceptual brain decoding, which revolves around synthesizing images from EEG signals using an adversarial deep learning framework. The specific objective is to recreate images belonging to various object categories by leveraging EEG recordings obtained while subjects view those images. To achieve this,…

Exploring the Low-Pass Filtering Behavior in Image Super-Resolution

[Submitted on 13 May 2024 (v1), last revised 20 Nov 2024 (this version, v4)] By Haoyu Deng and 5 other authors. Abstract: Deep neural networks for image super-resolution (ISR) have shown significant advantages over traditional approaches such as interpolation. However, they are often criticized as 'black boxes' compared to traditional approaches with solid mathematical foundations. In this paper, we attempt to interpret the behavior of deep neural networks in ISR using theories from the field of signal processing. First, we report an…