Viral News

Large Language Models for Power Scheduling: A User-Centric Approach

[Submitted on 29 Jun 2024 (v1), last revised 14 Nov 2024 (this version, v3)] By Thomas Mongaillard and 8 other authors. Abstract: While traditional optimization and scheduling schemes are designed to meet fixed, predefined system requirements, future systems are moving toward user-driven approaches and personalized services, aiming to achieve high quality-of-experience (QoE) and flexibility. This challenge is particularly pronounced in wireless and digitalized energy networks, where users' requirements have largely not been taken into consideration due to the lack of a…

Securing the Future: How AI Gateways Protect AI Agent Systems in the Era of Generative AI

The Future: From Rules Engines to Instruction-Following AI Agent Systems

In sectors such as banking and insurance, rules engines have long played a critical role in decision-making. Whether determining eligibility for opening a bank account or approving an insurance claim, these engines apply predefined rules to process data and make automated decisions. When these systems fail, human subject matter experts (SMEs) step in to handle exceptions. However, the emergence of instruction-following GenAI models is set to change the game. Instead of relying on static rules engines, these models can be trained on specific rule datasets to make complex decisions dynamically. For example,…

Sparse Bayesian Generative Modeling for Compressive Sensing

arXiv:2411.09483v1. Abstract: This work addresses the fundamental linear inverse problem in compressive sensing (CS) by introducing a new type of regularizing generative prior. Our proposed method utilizes ideas from classical dictionary-based CS and, in particular, sparse Bayesian learning (SBL), to integrate a strong regularization towards sparse solutions. At the same time, by leveraging the notion of conditional Gaussianity, it also incorporates the adaptability from generative models to training data. However, unlike most state-of-the-art generative models, it is able to learn from a few compressed and noisy data samples and requires no optimization algorithm for solving the inverse…

Beyond the Doors of Perception: Vision Transformers Represent Relations Between Objects

[Submitted on 22 Jun 2024 (v1), last revised 13 Nov 2024 (this version, v2)] By Michael A. Lepori and 5 other authors. Abstract: Though vision transformers (ViTs) have achieved state-of-the-art performance in a variety of settings, they exhibit surprising failures when performing tasks involving visual relations. This begs the question: how do ViTs attempt to perform tasks that require computing visual relations between objects? Prior efforts to interpret ViTs tend to focus on characterizing relevant low-level visual features. In…

Exploring the Potential of Multimodal LLM with Knowledge-Intensive Multimodal ASR

[Submitted on 16 Jun 2024 (v1), last revised 14 Nov 2024 (this version, v2)] By Minghan Wang and 4 other authors. Abstract: Recent advancements in multimodal large language models (MLLMs) have made significant progress in integrating information across various modalities, yet real-world applications in educational and scientific domains remain challenging. This paper introduces the Multimodal Scientific ASR (MS-ASR) task, which focuses on transcribing scientific conference videos by leveraging visual information from slides to enhance the accuracy of technical terminologies. Realized…

Introducing Structured Outputs for Batch and Agent Workflows

Many AI use cases now depend on transforming unstructured inputs into structured data. Developers are increasingly relying on LLMs to extract structured data from raw documents, build assistants that retrieve data from API sources, and create agents capable of taking action. Each of these use cases requires the model to generate outputs that adhere to a structured format. Today, we’re excited to introduce Structured Outputs on Mosaic AI Model Serving, a unified API for generating JSON objects that can optionally adhere to a provided JSON schema. This new feature supports all types of models, including open LLMs like Llama, fine-tuned models, and…

Dual-Segment Clustering Strategy for Hierarchical Federated Learning in Heterogeneous Wireless Environments

[Submitted on 15 May 2024 (v1), last revised 14 Nov 2024 (this version, v2)] By Pengcheng Sun and 8 other authors. Abstract: Non-independent and identically distributed (Non-IID) data adversely affects federated learning (FL) while heterogeneity in communication quality can undermine the reliability of model parameter transmission, potentially degrading wireless FL convergence. This paper proposes a novel dual-segment clustering (DSC) strategy that jointly addresses communication and data heterogeneity in FL. This is achieved by defining a new signal-to-noise ratio…

V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection

[Submitted on 25 Apr 2024 (v1), last revised 14 Nov 2024 (this version, v4)] By Xuanyu Zhang and 6 other authors. Abstract: AI-generated video has revolutionized short video production, filmmaking, and personalized media, making video local editing an essential tool. However, this progress also blurs the line between reality and fiction, posing challenges in multimedia forensics. To solve this urgent issue, V2A-Mark is proposed to address the limitations of current video tampering forensics, such as poor generalizability, singular function,…

Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders

[Submitted on 12 Nov 2024 (v1), last revised 13 Nov 2024 (this version, v2)] By Xiaofeng Zhu and 1 other author. Abstract: Although people are impressed by the content generation skills of large language models, the use of LLMs, such as ChatGPT, is limited by the domain grounding of the content. The correctness and groundedness of the generated content need to be based on a verified context, such as results from Retrieval-Augmented Generation (RAG). One important issue…

Building a Modern Clinical Trial Data Intelligence Platform

In an era where data is the lifeblood of medical advancement, the clinical trial industry finds itself at a critical crossroads. The current landscape of clinical data management is fraught with challenges that threaten to stifle innovation and delay life-saving treatments. As we grapple with an unprecedented deluge of information (a typical Phase III trial now generates a staggering 3.6 million data points, three times more than 15 years ago, and more than 4000 new trials are authorized each year), our existing data platforms are buckling under the strain. These outdated systems, characterized by data silos, poor integration, and overwhelming complexity,…