Viral News

EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation

[Submitted on 28 Oct 2024 (v1), last revised 21 Nov 2024 (this version, v2)] Authors: Shih-Yang Liu, Huck Yang, Chien-Yi Wang, Nai Chit Fung, Hongxu Yin, Charbel Sakr, Saurav Muralidharan, Kwang-Ting Cheng, Jan Kautz, Yu-Chiang Frank Wang, Pavlo Molchanov, Min-Hung Chen. Abstract: In this work, we re-formulate the model compression problem into the customized compensation problem: Given a compressed model, we aim to introduce residual low-rank paths to compensate for compression errors under customized…
Exact and approximate error bounds for physics-informed neural networks

arXiv:2411.13848v1. Abstract: The use of neural networks to solve differential equations, as an alternative to traditional numerical solvers, has increased recently. However, error bounds for the obtained solutions have only been developed for certain equations. In this work, we report important progress in calculating error bounds of physics-informed neural networks (PINNs) solutions of nonlinear first-order ODEs. We give a general expression that describes the error of the solution that the PINN-based method provides for a nonlinear first-order ODE. In addition, we propose a technique to calculate an approximate bound for the general case and an exact bound…
Principles of Visual Tokens for Efficient Video Understanding

arXiv:2411.13626v1. Abstract: Video understanding has made huge strides in recent years, relying largely on the power of the transformer architecture. As this architecture is notoriously expensive and video is highly redundant, research into improving efficiency has become particularly relevant. This has led to many creative solutions, including token merging and token selection. While most methods succeed in reducing the cost of the model and maintaining accuracy, an interesting pattern arises: most methods do not outperform the random sampling baseline. In this paper we take a closer look at this phenomenon and make several observations. First, we develop…
Meaning at the Planck scale? Contextualized word embeddings for doing history, philosophy, and sociology of science

[Submitted on 21 Nov 2024] Author: Arno Simons. Abstract: This paper explores the potential of contextualized word embeddings (CWEs) as a new tool in the history, philosophy, and sociology of science (HPSS) for studying contextual and evolving meanings of scientific concepts. Using the term "Planck" as a test case, I evaluate five BERT-based models with varying degrees of domain-specific pretraining, including my custom model Astro-HEP-BERT, trained on the Astro-HEP Corpus, a dataset containing 21.84…
Meet 2024 BigDATAwire Person to Watch Chisoo Lyons

Women make up 15% to 22% of data science professionals, research from the Boston Consulting Group shows. One of the women who is hoping to increase that number is Chisoo Lyons, the executive director of Women in Data Science Worldwide (WiDS) and a BigDATAwire 2024 Person to Watch. WiDS started in 2015 with a one-day technical conference at Stanford University to give women a voice in data science. Since then, the group has expanded significantly, and it now touches more than 150,000 participants across 160 countries via conferences, datathons, podcasts, upskilling workshops, its Next Gen outreach program, the WiDS Academy,…
The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning

[Submitted on 13 Sep 2023 (v1), last revised 21 Nov 2024 (this version, v2)] Authors: Alexander Bastounis and 7 other authors. Abstract: In this work, we assess the theoretical limitations of determining guaranteed stability and accuracy of neural networks in classification tasks. We consider classical distribution-agnostic framework and algorithms minimising empirical risks and potentially subjected to some weights regularisation. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural…
Adversarial Poisoning Attack on Quantum Machine Learning Models

arXiv:2411.14412v1. Abstract: With the growing interest in Quantum Machine Learning (QML) and the increasing availability of quantum computers through cloud providers, addressing the potential security risks associated with QML has become an urgent priority. One key concern in the QML domain is the threat of data poisoning attacks in the current quantum cloud setting. Adversarial access to training data could severely compromise the integrity and availability of QML models. Classical data poisoning techniques require significant knowledge and training to generate poisoned data, and lack noise resilience, making them ineffective for QML models in the Noisy Intermediate Scale…
The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims

arXiv:2411.14072v1. Abstract: In order to solve the problem of insufficient generation quality caused by traditional patent text abstract generation models only originating from patent specifications, the problem of new terminology OOV caused by rapid patent updates, and the problem of information redundancy caused by insufficient consideration of the high professionalism, accuracy, and uniqueness of patent texts, we propose a patent text abstract generation model (MSEA) based on a master-slave encoder architecture. Firstly, the MSEA model designs a master-slave encoder, which combines the instructions in the patent text with the claims as input, and fully explores the characteristics…
Heterophilic Graph Neural Networks Optimization with Causal Message-passing

Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning

arXiv:2411.13623v1. Abstract: Representation learning of pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). This approach leads to slide representations highly tailored to a specific clinical task. Self-supervised learning (SSL) has been successfully applied to train histopathology foundation models (FMs) for patch embedding generation. However, generating patient or slide level embeddings remains challenging. Existing approaches for slide representation learning extend the principles of SSL from patch level learning to entire slides by aligning different augmentations of the slide or by utilizing multimodal data. By integrating tile embeddings from multiple FMs, we…