Viral News

SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism

[Submitted on 15 Nov 2024] By Priyansh Bhatnagar and 2 other authors. Abstract: Extensive efforts have been made to boost the performance in the domain of language models by introducing various attention-based transformers. However, the inclusion of linear layers with large dimensions contributes to significant computational and memory overheads. The escalating computational demands of these models necessitate the development of various compression techniques to ensure their deployment on devices, particularly in resource-constrained environments. In this paper, we propose a…
Read More
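The soft-thresholding mechanism named in the title can be illustrated generically. This is not the paper's method, just a minimal sketch of soft-thresholding singular values to obtain an adaptive low-rank factorization; the function name, the threshold `tau`, and the toy matrix are all assumptions for illustration.

```python
import numpy as np

def soft_threshold_factorize(W, tau):
    """Replace a dense weight matrix W with two low-rank factors by
    soft-thresholding its singular values: each value is shrunk toward
    zero by tau, and directions that reach zero are dropped, so the
    retained rank adapts to the threshold rather than being fixed."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # soft-thresholding operator
    keep = s_shrunk > 0                   # adaptive rank selection
    A = U[:, keep] * s_shrunk[keep]       # shape (m, r)
    B = Vt[keep, :]                       # shape (r, n)
    return A, B

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))             # stand-in for a linear layer
A, B = soft_threshold_factorize(W, tau=2.0)
rank = A.shape[1]                         # directions that survived tau
```

Storing `A` and `B` instead of `W` cuts parameters from `m*n` to `r*(m+n)`, which is where the memory savings for large linear layers would come from.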
TESGNN: Temporal Equivariant Scene Graph Neural Networks for Efficient and Robust Multi-View 3D Scene Understanding

arXiv:2411.10509v1 Abstract: Scene graphs have proven to be highly effective for various scene understanding tasks due to their compact and explicit representation of relational information. However, current methods often overlook the critical importance of preserving symmetry when generating scene graphs from 3D point clouds, which can lead to reduced accuracy and robustness, particularly when dealing with noisy, multi-view data. This work, to the best of our knowledge, presents the first implementation of an Equivariant Scene Graph Neural Network (ESGNN) to generate semantic scene graphs from 3D point clouds, specifically for enhanced scene understanding. Furthermore, a significant limitation…
Read More
HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings

arXiv:2411.10724v1 Abstract: One of the key tasks in modern applied computational linguistics is constructing word vector representations (word embeddings), which are widely used to address natural language processing tasks such as sentiment analysis, information extraction, and more. To choose an appropriate method for generating these word embeddings, quality assessment techniques are often necessary. A standard approach involves calculating distances between vectors for words with expert-assessed 'similarity'. This work introduces the first 'silver standard' dataset for such tasks in the Kyrgyz language, alongside training corresponding models and validating the dataset's suitability through quality evaluation metrics.
Read More
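The evaluation approach described above (comparing vector distances against expert similarity scores) is standard, and can be sketched generically. This is not the paper's dataset or code: the placeholder words, random "embeddings", and invented expert scores below are purely illustrative, and rank correlation (Spearman) is one common choice of quality metric.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks
    (assumes no tied values, which keeps this toy sketch simple)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / (np.linalg.norm(ra) * np.linalg.norm(rb)))

# Toy stand-ins: random 8-d embeddings for six placeholder words and
# made-up expert similarity scores for five word pairs.
rng = np.random.default_rng(1)
emb = {f"w{i}": rng.normal(size=8) for i in range(6)}
pairs = [("w0", "w1"), ("w1", "w2"), ("w2", "w3"), ("w3", "w4"), ("w4", "w5")]
expert = np.array([0.9, 0.1, 0.75, 0.3, 0.5])
model = np.array([cosine(emb[a], emb[b]) for a, b in pairs])
rho = spearman(expert, model)  # higher rho = model tracks expert judgments
```

A model whose similarity ranking matches the expert ranking scores `rho = 1`; random embeddings, as here, hover near zero.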
Evaluating Synthetic Activations composed of SAE Latents in GPT-2

[Submitted on 23 Sep 2024 (v1), last revised 18 Nov 2024 (this version, v2)] By Giorgi Giglemiani and 4 other authors. Abstract: Sparse Auto-Encoders (SAEs) are commonly employed in mechanistic interpretability to decompose the residual stream into monosemantic SAE latents. Recent work demonstrates that perturbing a model's activations at an early layer results in a step-function-like change in the model's final layer activations. Furthermore, the model's sensitivity to this perturbation differs between model-generated (real) activations and random activations. In our study,…
Read More
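The core objects in the abstract, SAE latents and activations composed from them, can be sketched generically. This is not the paper's setup or GPT-2's dimensions: the toy sizes, random weights, and the `synthetic_activation` helper are assumptions standing in for a trained SAE.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 16, 64  # toy sizes, far smaller than GPT-2's

# Toy SAE: random encoder/decoder weights stand in for trained parameters.
W_enc = rng.normal(size=(d_model, d_sae)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)

def encode(x):
    """Decompose an activation into sparse, non-negative SAE latents."""
    return np.maximum(x @ W_enc, 0.0)

def synthetic_activation(k):
    """Compose a synthetic activation as a non-negative combination of
    k randomly chosen decoder (latent) directions."""
    idx = rng.choice(d_sae, size=k, replace=False)
    coeffs = np.abs(rng.normal(size=k))
    return coeffs @ W_dec[idx]

x_real = rng.normal(size=d_model)      # stand-in for a model activation
x_synth = synthetic_activation(k=5)    # built purely from SAE latents
lat_real, lat_synth = encode(x_real), encode(x_synth)
```

Comparing how a model responds to `x_real`-like versus `x_synth`-like inputs is the kind of contrast the abstract describes between real and constructed activations.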
Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

[Submitted on 11 Nov 2024 (v1), last revised 17 Nov 2024 (this version, v2)] By Yanchen Wang and 8 other authors. Abstract: Neural decoding, the process of understanding how brain activity corresponds to different stimuli, has been a primary objective in cognitive sciences. Over the past three decades, advancements in functional Magnetic Resonance Imaging and machine learning have greatly improved our ability to map visual stimuli to brain activity, especially in the visual cortex. Concurrently, research has…
Read More
Structured Dialogue System for Mental Health: An LLM Chatbot Leveraging the PM+ Guidelines

arXiv:2411.10681v1 Abstract: The Structured Dialogue System, referred to as SuDoSys, is an innovative Large Language Model (LLM)-based chatbot designed to provide psychological counseling. SuDoSys leverages the World Health Organization (WHO)'s Problem Management Plus (PM+) guidelines to deliver stage-aware multi-turn dialogues. Existing methods for employing an LLM in multi-turn psychological counseling typically involve direct fine-tuning using generated dialogues, often neglecting the dynamic stage shifts of counseling sessions. Unlike previous approaches, SuDoSys considers the different stages of counseling and stores essential information throughout the counseling process, ensuring coherent and directed conversations. The system employs an LLM, a stage-aware instruction…
Read More
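The idea of tracking a counseling stage and carrying stored information forward across turns can be sketched generically. This is not SuDoSys itself: the stage labels below are illustrative placeholders, not the actual PM+ stage names, and the class is a hypothetical state container one might pass into an LLM prompt.

```python
# Illustrative stage labels only, not the official PM+ terminology.
STAGES = ["introduction", "problem_identification", "stress_management",
          "problem_solving", "social_support", "closing"]

class StageTracker:
    """Hypothetical stage-aware state: which stage the session is in,
    plus essential information recorded at each stage."""

    def __init__(self):
        self.stage_idx = 0
        self.notes = {s: [] for s in STAGES}

    @property
    def stage(self):
        return STAGES[self.stage_idx]

    def record(self, info):
        """Store essential information extracted from the current turn."""
        self.notes[self.stage].append(info)

    def advance(self):
        """Move to the next counseling stage, keeping earlier notes."""
        self.stage_idx = min(self.stage_idx + 1, len(STAGES) - 1)

    def prompt_context(self):
        """Assemble stage + accumulated notes for a stage-aware
        LLM instruction, so the conversation stays coherent."""
        return {"stage": self.stage, "notes": self.notes}

tracker = StageTracker()
tracker.record("client describes sleep problems")
tracker.advance()
ctx = tracker.prompt_context()
```

Because `notes` persists across `advance()` calls, information gathered early (e.g. during introduction) remains available to condition later stages, which is the property the abstract emphasizes.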
Tips on Building a Winning Data and AI Strategy from JPMC

With $274 billion in revenue last year and $3.3 trillion in assets under management, JPMorgan Chase has more resources than most to devote to building a winning data and AI strategy. But as James Massa, JPMorgan Chase’s senior executive director of software engineering and architecture, explained during his Solix Empower keynote last week, even the biggest companies in the world must pay close attention to the data and AI details in order to succeed. In his Solix Empower 2024 keynote address, titled “Data Quality and Data Strategy for AI, Measuring AI Value, Testing LLMs, and AI Use Cases,” Massa provided…
Read More
On the Privacy Risk of In-context Learning

Read More
DR-BFR: Degradation Representation with Diffusion Models for Blind Face Restoration

Read More
IntentGPT: Few-shot Intent Discovery with Large Language Models

Read More