Viral News

Step Out and Seek Around: On Warm-Start Training with Incremental Data

Read More
MoralBench: Moral Evaluation of LLMs

arXiv:2406.04428v1. Abstract: In the rapidly evolving field of artificial intelligence, large language models (LLMs) have emerged as powerful tools for a myriad of applications, from natural language processing to decision-making support systems. However, as these models become increasingly integrated into societal frameworks, the imperative to ensure they operate within ethical and moral boundaries has never been more critical. This paper introduces a novel benchmark designed to measure and compare the moral reasoning capabilities of LLMs. We present the first comprehensive dataset specifically curated to probe the moral dimensions of LLM outputs, addressing a wide range of ethical…
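The excerpt does not describe the benchmark's format or scoring rule, so the following is only a rough sketch of a harness for scoring a model on such a benchmark, assuming a hypothetical JSON schema with `prompt` and `expected` fields and exact-match scoring:

```python
# Hypothetical harness; the item schema and exact-match scoring rule are
# assumptions, not MoralBench's actual protocol.
import json

def score_model(model_fn, benchmark_path):
    """model_fn: callable mapping a prompt string to the model's answer string."""
    with open(benchmark_path) as f:
        items = json.load(f)  # assumed schema: [{"prompt": ..., "expected": ...}]
    correct = sum(
        model_fn(item["prompt"]).strip().lower() == item["expected"].strip().lower()
        for item in items
    )
    return correct / len(items)  # fraction of items answered as expected
```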
Read More
Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension

arXiv:2406.04421v1. Abstract: The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding…
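The abstract's key design point, an encoder trained to match the fixed embedding plus an auxiliary head that reconstructs random-forest proximities, can be sketched as below. This is a minimal illustration under assumed layer sizes and loss weighting, not the paper's implementation; proximity here is the standard shared-leaf fraction.

```python
# Minimal sketch (not the paper's code): an encoder maps features to a
# precomputed embedding, while an auxiliary decoder reconstructs each
# sample's row of the random-forest proximity matrix.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

def rf_proximities(rf, X):
    """Proximity = fraction of trees in which two samples share a leaf."""
    leaves = rf.apply(X)  # (n_samples, n_trees) leaf indices
    return (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2).astype(np.float32)

def fit_extension(X, y, Z):
    """X: features, y: labels, Z: fixed supervised embedding (e.g. from RF-PHATE)."""
    rf = RandomForestClassifier(n_estimators=100).fit(X, y)
    P = torch.tensor(rf_proximities(rf, X))
    Xt = torch.tensor(X, dtype=torch.float32)
    Zt = torch.tensor(Z, dtype=torch.float32)
    enc = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(), nn.Linear(64, Z.shape[1]))
    dec = nn.Sequential(nn.Linear(Z.shape[1], 64), nn.ReLU(), nn.Linear(64, X.shape[0]))
    opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        Zhat = enc(Xt)
        # Embedding-matching loss plus proximity-reconstruction loss.
        loss = nn.functional.mse_loss(Zhat, Zt) + nn.functional.mse_loss(dec(Zhat), P)
        loss.backward()
        opt.step()
    return enc  # enc(x_new) embeds unseen points
```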
Read More
Evaluating Large Vision-Language Models’ Understanding of Real-World Complexities Through Synthetic Benchmarks

arXiv:2406.04470v1. Abstract: This study assesses the ability of Large Vision-Language Models (LVLMs) to differentiate between AI-generated and human-generated images. It introduces a new automated benchmark construction method for this evaluation. The experiment compared common LVLMs with human participants using a mixed dataset of AI- and human-created images. Results showed that LVLMs could distinguish between the image types to some extent but exhibited a rightward bias and performed significantly worse than humans. To build on these findings, we developed an automated benchmark construction process using AI. This process involved topic retrieval, narrative script generation, error embedding, and…
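The construction steps named at the end (topic retrieval, narrative script generation, error embedding) suggest a pipeline along the following lines. Every component here is a hypothetical stand-in injected as a callable, since the excerpt is cut off before the paper's actual components:

```python
# Schematic pipeline; generate_text, generate_image, and embed_error are
# hypothetical stand-ins for the paper's components.
def build_benchmark(topics, generate_text, generate_image, embed_error):
    items = []
    for topic in topics:
        script = generate_text(f"Write a short visual scene about: {topic}")
        flawed_script = embed_error(script)  # inject a deliberate inconsistency
        items.append({
            "topic": topic,
            "image": generate_image(flawed_script),  # AI-generated candidate
            "label": "ai-generated",
        })
    return items  # mixed with human-created images downstream
```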
Read More
Exploring the Latest LLMs for Leaderboard Extraction

Submitted on 6 Jun 2024, by Salomon Kabongo and 2 other authors. Abstract: The rapid advancements in Large Language Models (LLMs) have opened new avenues for automating complex tasks in AI research. This paper investigates the efficacy of different LLMs (Mistral 7B, Llama-2, GPT-4-Turbo, and GPT-4o) in extracting leaderboard information from empirical AI research articles. We explore three types of contextual inputs to the models: DocTAET (Document Title, Abstract, Experimental Setup, and Tabular Information), DocREC (Results, Experiments, and Conclusions), and DocFULL (entire document). Our…
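The DocTAET input, for instance, concatenates the title, abstract, experimental setup, and tabular information before prompting the model. A minimal illustration, with assumed field names and prompt wording:

```python
# Illustrative only: the paper's exact prompt and record format are not
# shown in this excerpt; field names here are assumptions.
def doc_taet(paper):
    """Assemble a DocTAET-style context from a paper record (a dict)."""
    parts = [paper["title"], paper["abstract"],
             paper["experimental_setup"], paper["tables"]]
    return "\n\n".join(parts)

def extract_leaderboard(llm, paper):
    """llm: callable mapping a prompt string to the model's text output."""
    prompt = (
        "From the following paper context, list every reported result as a "
        "(Task, Dataset, Metric, Score) tuple.\n\n" + doc_taet(paper)
    )
    return llm(prompt)
```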
Read More
TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification

arXiv:2406.04419v1. Abstract: Time series classification (TSC) on multivariate time series is a critical problem. We propose a novel multi-view approach integrating frequency-domain and time-domain features to provide complementary contexts for TSC. Our method fuses continuous wavelet transform spectral features with temporal convolutional or multilayer perceptron features. We leverage the Mamba state space model for efficient and scalable sequence modeling. We also introduce a novel tango scanning scheme to better model sequence relationships. Experiments on 10 standard benchmark datasets demonstrate our approach achieves an average 6.45% accuracy improvement over state-of-the-art TSC models.
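The excerpt omits architecture details, but the multi-view idea can be sketched as two branches fused per time step. In this toy PyTorch version an STFT spectrogram stands in for the paper's continuous wavelet transform, and the fused sequence is what a Mamba (state space) block would consume; dimensions are illustrative, not the paper's:

```python
# Toy two-view encoder: spectral branch (STFT as a stand-in for CWT)
# fused with a temporal-convolution branch. Not the paper's architecture.
import torch
import torch.nn as nn

class TwoViewEncoder(nn.Module):
    def __init__(self, n_fft=32, d_model=64):
        super().__init__()
        self.n_fft = n_fft
        self.spec_proj = nn.Linear(n_fft // 2 + 1, d_model)   # spectral view
        self.temp_conv = nn.Conv1d(1, d_model, kernel_size=5, padding=2)

    def forward(self, x):  # x: (batch, time), univariate toy case
        spec = torch.stft(x, self.n_fft, hop_length=1,
                          window=torch.hann_window(self.n_fft),
                          return_complex=True).abs()
        spec = self.spec_proj(spec.transpose(1, 2))            # (batch, frames, d)
        temp = self.temp_conv(x.unsqueeze(1)).transpose(1, 2)  # (batch, time, d)
        T = min(spec.shape[1], temp.shape[1])
        # Concatenated views per time step; a Mamba block would follow here.
        return torch.cat([spec[:, :T], temp[:, :T]], dim=-1)
```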
Read More
DeTra: A Unified Model for Object Detection and Trajectory Forecasting

arXiv:2406.04426v1. Abstract: The tasks of object detection and trajectory forecasting play a crucial role in understanding the scene for autonomous driving. These tasks are typically executed in a cascading manner, making them prone to compounding errors. Furthermore, there is usually a very thin interface between the two tasks, creating a lossy information bottleneck. To address these challenges, our approach formulates the union of the two tasks as a trajectory refinement problem, where the first pose is the detection (current time), and the subsequent poses are the waypoints of the multiple forecasts (future time). To tackle this unified…
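Concretely, the unified formulation treats each object as a single pose sequence. A sketch of that representation, where row 0 is the detection at the current time and rows 1..T are forecast waypoints (the refinement function below is a hypothetical placeholder, not the paper's model):

```python
# Illustration of the unified detection-plus-forecast representation.
import numpy as np

T = 6                                    # forecast horizon (assumed)
poses = np.zeros((T + 1, 3))             # (x, y, heading) per pose
poses[0] = np.array([12.0, -3.5, 0.1])   # row 0: detection at current time

def refine(poses, delta_fn):
    """Jointly update detection and forecast with one refinement step.

    delta_fn: learned residual update, a hypothetical stand-in here.
    """
    return poses + delta_fn(poses)
```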
Read More
Phased Instruction Fine-Tuning for Large Language Models

arXiv:2406.04371v1. Abstract: Instruction Fine-Tuning, a method enhancing pre-trained language models' capabilities from mere next-word prediction to complex instruction following, often employs a one-off training approach on a diverse instruction dataset. However, this method may not effectively enhance models' adherence to instructions due to the simultaneous handling of varying instruction complexities. To address this, we propose a novel phased instruction fine-tuning (Phased IFT) method, grounded in the hypothesis of progressive alignment, which posits that the transition of a pre-trained language model from simple next-word prediction to sophisticated instruction following is a gradual learning process. Specifically, we obtain the score…
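The excerpt is cut off before the paper's scoring method, but the phased idea itself is simple: rank instructions by a difficulty score, split them into ordered bands, and fine-tune on the bands sequentially. A sketch with an assumed scorer and phase count:

```python
# Minimal sketch; difficulty() and train_one_epoch() are assumed callables,
# and uniform three-way phasing is an illustrative choice.
def phased_ift(model, dataset, difficulty, train_one_epoch, n_phases=3):
    ranked = sorted(dataset, key=difficulty)       # easy -> hard
    size = len(ranked) // n_phases
    for p in range(n_phases):
        end = (p + 1) * size if p < n_phases - 1 else len(ranked)
        phase_data = ranked[p * size:end]          # one difficulty band
        train_one_epoch(model, phase_data)         # standard supervised fine-tuning
    return model
```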
Read More
Aligning Large Language Models with Self-generated Preference Data

Read More
Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning

arXiv:2406.04413v1. Abstract: Drawing upon StyleGAN's expressivity and disentangled latent space, existing 2D approaches employ textual prompting to edit facial images with different attributes. In contrast, 3D-aware approaches that generate faces at different target poses require attribute-specific classifiers, learning separate model weights for each attribute, and are not scalable for novel attributes. In this work, we propose an efficient, plug-and-play, 3D-aware face editing framework based on attribute-specific prompt learning, enabling the generation of facial images with controllable attributes across various target poses. To this end, we introduce a text-driven learnable style token-based latent attribute editor (LAE). The LAE…
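The excerpt cuts off mid-description of the latent attribute editor (LAE), so the following is only a conceptual sketch of the general pattern, a learnable per-attribute direction that shifts a generator's latent code, with illustrative names and update rule:

```python
# Conceptual sketch; the learnable directions and additive update rule are
# illustrative, not the paper's LAE.
import torch
import torch.nn as nn

class AttributeEditor(nn.Module):
    def __init__(self, latent_dim, n_attributes):
        super().__init__()
        # One learnable direction per attribute (e.g. "smile", "age").
        self.directions = nn.Parameter(torch.randn(n_attributes, latent_dim) * 0.01)

    def forward(self, w, attr_idx, strength=1.0):
        """Shift latent code w along the learned direction for one attribute."""
        return w + strength * self.directions[attr_idx]
```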
Read More