Viral News

Analysis of Argument Structure Constructions in a Deep Recurrent Language Model

arXiv:2408.03062v1 Announce Type: new Abstract: Understanding how language and linguistic constructions are processed in the brain is a fundamental question in cognitive computational neuroscience. In this study, we explore the representation and processing of Argument Structure Constructions (ASCs) in a recurrent neural language model. We trained a Long Short-Term Memory (LSTM) network on a custom-made dataset consisting of 2000 sentences, generated using GPT-4, representing four distinct ASCs: transitive, ditransitive, caused-motion, and resultative constructions. We analyzed the internal activations of the LSTM model's hidden layers using Multidimensional Scaling (MDS) and t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the sentence representations. The…
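A minimal sketch of the analysis step described above, assuming a single-layer LSTM language model and using its final hidden state as the sentence representation; the model definition, dimensions, and plotting details are illustrative placeholders, not the study's actual code.

```python
# Sketch: extract LSTM hidden states for each sentence and project them to 2-D
# with MDS and t-SNE. Model, tokenization, and labels are illustrative.
import torch
import torch.nn as nn
from sklearn.manifold import MDS, TSNE
import matplotlib.pyplot as plt

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)   # next-word prediction

    def forward(self, token_ids):
        out, (h_n, _) = self.lstm(self.embed(token_ids))
        return self.head(out), h_n[-1]                  # logits, final hidden state

# sentences: LongTensor [n_sentences, seq_len]; labels: one of the four ASC classes
def visualize_hidden_states(model, sentences, labels):
    model.eval()
    with torch.no_grad():
        _, hidden = model(sentences)                    # [n_sentences, hidden_dim]
    hidden = hidden.cpu().numpy()
    for name, reducer in [("MDS", MDS(n_components=2)),
                          ("t-SNE", TSNE(n_components=2, perplexity=30))]:
        coords = reducer.fit_transform(hidden)
        plt.figure()
        plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=10)
        plt.title(f"{name} of LSTM sentence representations")
    plt.show()
```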
Read More
KAN based Autoencoders for Factor Models

arXiv:2408.02694v1 Announce Type: new Abstract: Inspired by recent advances in Kolmogorov-Arnold Networks (KANs), we introduce a novel approach to latent factor conditional asset pricing models. While previous machine learning applications in asset pricing have predominantly used Multilayer Perceptrons with ReLU activation functions to model latent factor exposures, our method introduces a KAN-based autoencoder which surpasses MLP models in both accuracy and interpretability. Our model offers enhanced flexibility in approximating exposures as nonlinear functions of asset characteristics, while simultaneously providing users with an intuitive framework for interpreting latent factors. Empirical backtesting demonstrates our model's superior ability to explain cross-sectional risk exposures.…
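One way to picture this architecture: a simplified KAN-style layer (learnable univariate functions on each input-output edge, parameterized here with a Gaussian RBF basis) serving as the exposure network of a conditional autoencoder. The basis choice, dimensions, and linear factor network are assumptions for illustration, not the paper's exact model.

```python
# Hedged sketch of a KAN-style exposure network inside a conditional
# autoencoder for latent factor asset pricing. Illustrative only.
import torch
import torch.nn as nn

class SimpleKANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, n_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers", torch.linspace(*grid_range, n_basis))
        self.width = (grid_range[1] - grid_range[0]) / n_basis
        # one coefficient per (input feature, output unit, basis function)
        self.coef = nn.Parameter(torch.randn(in_dim, out_dim, n_basis) * 0.1)

    def forward(self, x):                                   # x: [batch, in_dim]
        # evaluate the RBF basis on every input feature: [batch, in_dim, n_basis]
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # sum the learnable univariate functions over inputs: [batch, out_dim]
        return torch.einsum("bik,iok->bo", basis, self.coef)

class KANFactorAutoencoder(nn.Module):
    """Characteristics -> factor exposures (beta); returns -> latent factors (f)."""
    def __init__(self, n_assets, n_chars, n_factors):
        super().__init__()
        self.beta_net = SimpleKANLayer(n_chars, n_factors)  # KAN-based exposures
        self.factor_net = nn.Linear(n_assets, n_factors)    # placeholder factor extraction

    def forward(self, characteristics, returns):
        # characteristics: [n_assets, n_chars], returns: [n_assets]
        beta = self.beta_net(characteristics)               # [n_assets, n_factors]
        f = self.factor_net(returns)                        # [n_factors]
        return beta @ f                                     # reconstructed returns
```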
Read More
DaCapo: a modular deep learning framework for scalable 3D image segmentation

Read More
L3iTC at the FinLLM Challenge Task: Quantization for Financial Text Classification & Summarization

arXiv:2408.03033v1 Announce Type: new Abstract: This article details our participation (L3iTC) in the FinLLM Challenge Task 2024, focusing on two key areas: Task 1, financial text classification, and Task 2, financial text summarization. To address these challenges, we fine-tuned several large language models (LLMs) to optimize performance for each task. Specifically, we used 4-bit quantization and LoRA to determine which layers of the LLMs should be trained at a lower precision. This approach not only accelerated the fine-tuning process on the training data provided by the organizers but also enabled us to run the models on low GPU memory. Our…
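A hedged sketch of this setup using Hugging Face transformers, peft, and bitsandbytes; the base model name, LoRA rank, and target modules are placeholder choices, not the team's reported configuration.

```python
# Sketch: load an LLM in 4-bit precision and attach LoRA adapters so only a
# small set of parameters is trained, keeping GPU memory low.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"        # placeholder base LLM

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                          # store weights as 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,      # compute in bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],        # which projections get adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)      # only LoRA adapters are trainable
model.print_trainable_parameters()
```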
Read More
Attention is all you need for an improved CNN-based flash flood susceptibility modeling. The case of the ungauged Rheraya watershed, Morocco

arXiv:2408.02692v1 Announce Type: new Abstract: Effective flood hazard management requires evaluating and predicting flash flood susceptibility. Convolutional neural networks (CNNs) are commonly used for this task but face issues like gradient explosion and overfitting. This study explores the use of an attention mechanism, specifically the convolutional block attention module (CBAM), to enhance CNN models for flash flood susceptibility in the ungauged Rheraya watershed, a flood-prone region. We used ResNet18, DenseNet121, and Xception as backbone architectures, integrating CBAM at different locations. Our dataset included 16 conditioning factors and 522 flash flood inventory points. Performance was evaluated using accuracy, precision, recall,…
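A minimal CBAM sketch (channel attention followed by spatial attention) of the kind that can be inserted into such backbones; the reduction ratio and kernel size follow common defaults rather than the study's settings.

```python
# Sketch: Convolutional Block Attention Module applied to a CNN feature map.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                               # x: [B, C, H, W]
        avg = self.mlp(x.mean(dim=(2, 3)))              # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))               # global max pooling
        return torch.sigmoid(avg + mx)[:, :, None, None]

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)               # [B, 1, H, W]
        mx = x.amax(dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)                              # reweight channels
        return x * self.sa(x)                           # reweight spatial locations
```

In practice such a module would be dropped in after convolutional blocks of the ResNet18, DenseNet121, or Xception backbones, which is the kind of placement the study varies.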
Read More
Gaussian Mixture based Evidential Learning for Stereo Matching

arXiv:2408.02796v1 Announce Type: new Abstract: In this paper, we introduce a novel Gaussian mixture based evidential learning solution for robust stereo matching. Diverging from previous evidential deep learning approaches that rely on a single Gaussian distribution, our framework posits that individual image data adheres to a mixture-of-Gaussian distribution in stereo matching. This assumption yields more precise pixel-level predictions and more accurately mirrors the real-world image distribution. By further employing the inverse-Gamma distribution as an intermediary prior for each mixture component, our probabilistic model achieves improved depth estimation compared to its counterpart with the single Gaussian and effectively captures the model…
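For intuition, a hedged sketch of what a mixture-of-NIG evidential loss can look like, using the standard Normal-Inverse-Gamma negative log-likelihood from deep evidential regression combined over components with predicted mixture weights; this is one plausible formulation, not necessarily the paper's exact loss.

```python
# Sketch: negative log-likelihood for a mixture of Normal-Inverse-Gamma (NIG)
# evidential components, combined with predicted mixture weights.
import math
import torch

def nig_log_likelihood(y, gamma, nu, alpha, beta):
    """Log marginal likelihood of y under an NIG prior (Student-t predictive)."""
    omega = 2.0 * beta * (1.0 + nu)
    return (
        0.5 * (torch.log(nu) - math.log(math.pi))
        + alpha * torch.log(omega)
        - (alpha + 0.5) * torch.log(nu * (y - gamma) ** 2 + omega)
        + torch.lgamma(alpha + 0.5)
        - torch.lgamma(alpha)
    )

def mixture_evidential_nll(y, weight_logits, gamma, nu, alpha, beta):
    """y: [B]; weight_logits and NIG parameters: [B, K] for K mixture components."""
    log_lik = nig_log_likelihood(y.unsqueeze(-1), gamma, nu, alpha, beta)   # [B, K]
    log_mix = torch.logsumexp(weight_logits.log_softmax(-1) + log_lik, dim=-1)
    return -log_mix.mean()
```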
Read More
Fact Finder — Enhancing Domain Expertise of Large Language Models by Incorporating Knowledge Graphs

arXiv:2408.03010v1 Announce Type: new Abstract: Recent advancements in Large Language Models (LLMs) have showcased their proficiency in answering natural language queries. However, their effectiveness is hindered by limited domain-specific knowledge, raising concerns about the reliability of their responses. We introduce a hybrid system that augments LLMs with domain-specific knowledge graphs (KGs), aiming to enhance factual correctness through a KG-based retrieval approach. We focus on a medical KG to demonstrate our methodology, which includes (1) pre-processing, (2) Cypher query generation, (3) Cypher query processing, (4) KG retrieval, and (5) LLM-enhanced response generation. We evaluate our system on a curated dataset…
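A hedged sketch of the shape of this five-step pipeline using the Neo4j Python driver; the prompts and the `ask_llm` helper are hypothetical placeholders, not the authors' implementation.

```python
# Sketch: LLM generates a Cypher query, the query runs against a Neo4j
# knowledge graph, and the retrieved facts ground the final LLM answer.
from neo4j import GraphDatabase

def generate_cypher(question: str, schema: str, ask_llm) -> str:
    # Step 2: Cypher query generation (ask_llm is a placeholder LLM call)
    prompt = (
        "Given this graph schema:\n" + schema +
        "\nWrite a Cypher query answering:\n" + question
    )
    return ask_llm(prompt)

def retrieve_from_kg(driver, cypher: str):
    # Steps 3-4: run the query and collect records from the knowledge graph
    with driver.session() as session:
        return [record.data() for record in session.run(cypher)]

def answer_question(question: str, schema: str, driver, ask_llm) -> str:
    cypher = generate_cypher(question, schema, ask_llm)
    facts = retrieve_from_kg(driver, cypher)
    # Step 5: LLM-enhanced response generation, grounded in retrieved facts
    prompt = (
        f"Question: {question}\nRetrieved facts: {facts}\n"
        "Answer using only the retrieved facts."
    )
    return ask_llm(prompt)

# Usage (placeholder credentials and question):
# driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
# print(answer_question("Which drugs target gene BRCA1?", schema_text, driver, ask_llm))
```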
Read More
Symmetric Graph Contrastive Learning against Noisy Views for Recommendation

arXiv:2408.02691v1 Announce Type: new Abstract: Graph Contrastive Learning (GCL) leverages data augmentation techniques to produce contrasting views, enhancing the accuracy of recommendation systems by learning the consistency between contrastive views. However, existing augmentation methods, such as directly perturbing the interaction graph (e.g., node/edge dropout), may interfere with the original connections and generate poor contrasting views, resulting in sub-optimal performance. In this paper, we define views that share only a small amount of information with the original graph due to poor data augmentation as noisy views (i.e., the last 20% of the views with a cosine similarity value less than 0.1…
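A minimal sketch of the noisy-view idea: score each augmented view by its cosine similarity to the original graph's embeddings and flag views below the stated threshold, alongside a standard InfoNCE contrastive loss. The encoder and per-node averaging are illustrative assumptions, not the paper's exact criterion.

```python
# Sketch: flag "noisy" contrastive views by low cosine similarity to the
# original graph embedding, plus a standard InfoNCE objective.
import torch
import torch.nn.functional as F

def flag_noisy_views(orig_emb, view_embs, threshold=0.1):
    """orig_emb: [N, d] node embeddings of the original graph,
    view_embs: list of [N, d] embeddings from augmented views."""
    flags = []
    for v in view_embs:
        # mean per-node cosine similarity between view and original graph
        sim = F.cosine_similarity(orig_emb, v, dim=-1).mean()
        flags.append(sim.item() < threshold)      # True -> treat as a noisy view
    return flags

def info_nce(z1, z2, temperature=0.2):
    """Standard InfoNCE between two views' node embeddings ([N, d] each)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature            # [N, N] similarity matrix
    labels = torch.arange(z1.size(0))             # positives on the diagonal
    return F.cross_entropy(logits, labels)
```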
Read More
Lesion Elevation Prediction from Skin Images Improves Diagnosis

arXiv:2408.02792v1 Announce Type: new Abstract: While deep learning-based computer-aided diagnosis for skin lesion image analysis is approaching dermatologists' performance levels, there are several works showing that incorporating additional features such as shape priors, texture, color constancy, and illumination further improves the lesion diagnosis performance. In this work, we look at another clinically useful feature, skin lesion elevation, and investigate the feasibility of predicting and leveraging skin lesion elevation labels. Specifically, we use a deep learning model to predict image-level lesion elevation labels from 2D skin lesion images. We test the elevation prediction accuracy on the derm7pt dataset, and use the…
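A hedged sketch of one way to predict and leverage elevation labels: an auxiliary elevation head whose predicted probabilities are concatenated with image features for diagnosis. The backbone and label granularity are assumptions, not the paper's setup.

```python
# Sketch: CNN with an auxiliary elevation head feeding the diagnosis head.
import torch
import torch.nn as nn
from torchvision import models

class ElevationAwareDiagnosis(nn.Module):
    def __init__(self, n_elevation_classes=3, n_diagnosis_classes=5):
        super().__init__()
        backbone = models.resnet18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()                      # expose pooled features
        self.backbone = backbone
        self.elevation_head = nn.Linear(feat_dim, n_elevation_classes)
        # diagnosis head sees image features plus predicted elevation probabilities
        self.diagnosis_head = nn.Linear(feat_dim + n_elevation_classes,
                                        n_diagnosis_classes)

    def forward(self, images):
        feats = self.backbone(images)                    # [B, feat_dim]
        elev_logits = self.elevation_head(feats)         # [B, n_elevation_classes]
        elev_probs = elev_logits.softmax(dim=-1)
        diag_logits = self.diagnosis_head(torch.cat([feats, elev_probs], dim=-1))
        return elev_logits, diag_logits
```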
Read More
Empathy Level Alignment via Reinforcement Learning for Empathetic Response Generation

arXiv:2408.02976v1 Announce Type: new Abstract: Empathetic response generation, which aims to understand the user's situation and feelings and respond empathetically, is crucial to building human-like dialogue systems. Previous methods mainly use maximum likelihood estimation as the optimization objective for training response generation models, without accounting for the alignment of empathy levels between generated and target responses. To this end, we propose EmpRL, an empathetic response generation framework based on reinforcement learning. The framework designs an effective empathy reward function and generates empathetic responses by maximizing the expected reward through reinforcement learning. Given the powerful text generation capability of pre-trained…
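A hedged sketch of an empathy-alignment reward of the kind such a framework could maximize: a separate empathy-level classifier scores generated and target responses, and the reward penalizes the gap between the two. The classifier and scaling are placeholder assumptions, not the paper's exact reward design.

```python
# Sketch: reward is high when generated and target responses have similar
# predicted empathy-level distributions.
import torch

def empathy_alignment_reward(empathy_classifier, generated, target):
    """empathy_classifier maps a batch of texts to empathy-level logits [B, L]."""
    with torch.no_grad():
        gen_level = empathy_classifier(generated).softmax(-1)    # [B, L]
        tgt_level = empathy_classifier(target).softmax(-1)       # [B, L]
    # total-variation-like gap between empathy-level distributions
    gap = (gen_level - tgt_level).abs().sum(dim=-1)
    return 1.0 - 0.5 * gap                                       # reward in [0, 1]
```

This reward would then be maximized by fine-tuning the pre-trained generator with a policy-gradient method.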
Read More