CybAI news

23 Nov

Decompose and Leverage Preferences from Expert Models for Improving Trustworthiness of MLLMs

stp2y | 0 Comments | Viral News

arXiv:2411.13697v1. Abstract: Multimodal Large Language Models (MLLMs) can enhance trustworthiness by aligning with human preferences. As human preference labeling is laborious, recent works employ evaluation models for assessing MLLMs' responses, using the model-based assessments to automate preference dataset construction. This approach, however, faces challenges with MLLMs' lengthy and compositional responses, which often require diverse reasoning skills that a single evaluation model may not fully possess. Additionally, most existing methods rely on closed-source models as evaluators. To address limitations, we propose DecompGen, a decomposable framework that uses an ensemble of open-sourced expert models. DecompGen breaks down each response…

23 Nov

Natural Language Reinforcement Learning

stp2y | 0 Comments | Viral News

arXiv:2411.14251v1. Abstract: Reinforcement Learning (RL) mathematically formulates decision-making with Markov Decision Process (MDP). With MDPs, researchers have achieved remarkable breakthroughs across various domains, including games, robotics, and language models. This paper seeks a new possibility, Natural Language Reinforcement Learning (NLRL), by extending traditional MDP to natural language-based representation space. Specifically, NLRL innovatively redefines RL principles, including task objectives, policy, value function, Bellman equation, and policy iteration, into their language counterparts. With recent advancements in large language models (LLMs), NLRL can be practically implemented to achieve RL-like policy and value improvement by either pure prompting or gradient-based training.…

23 Nov

Threads is testing out advanced search features and AI summaries for trending topics

stp2y | 0 Comments | News | gear, meta, news, threads

Threads is making more changes to address long-running complaints from users. This time, the company is testing out improvements to its search and trending topics feature in updates that Adam Mosseri described as “long-overdue improvements.” On search, Threads is testing the ability to search for posts within specific date ranges and account-specific searches. The changes are similar to some of X’s advanced search capabilities and could make it easier for users to look for a specific post they want to revisit. The lack of advanced search on Threads has long been frustrating and up to now, the most reliable way to…

23 Nov

Split Federated Learning Over Heterogeneous Edge Devices: Algorithm and Optimization

stp2y | 0 Comments | Viral News

arXiv:2411.13907v1. Abstract: Split Learning (SL) is a promising collaborative machine learning approach, enabling resource-constrained devices to train models without sharing raw data, while reducing computational load and preserving privacy simultaneously. However, current SL algorithms face limitations in training efficiency and suffer from prolonged latency, particularly in sequential settings, where the slowest device can bottleneck the entire process due to heterogeneous resources and frequent data exchanges between clients and servers. To address these challenges, we propose the Heterogeneous Split Federated Learning (HSFL) framework, which allows resource-constrained clients to train their personalized client-side models in parallel, utilizing different cut…

23 Nov

VQA$^2$: Visual Question Answering for Video Quality Assessment

stp2y | 0 Comments | Viral News

[Submitted on 6 Nov 2024 (v1), last revised 21 Nov 2024 (this version, v2)] By Ziheng Jia and 9 other authors. Abstract: The advent and proliferation of large multi-modal models (LMMs) have introduced new paradigms to computer vision, transforming various tasks into a unified visual question answering framework. Video Quality Assessment (VQA), a classic field in low-level visual perception, focused initially on quantitative video quality scoring. However, driven by advances in LMMs, it is now progressing toward more holistic visual quality understanding…

23 Nov

I wanted to have kids but my wife didn’t. I chose my relationship over becoming a mom.

stp2y | 0 Comments | RAG models

When we started dating eight years ago, my now-wife said she didn't want kids. I wanted to have kids, but I chose to put my focus on our relationship and it paid off. We enjoy our childfree marriage and have dogs that we put sweaters on. When starting a relationship, the topic of kids will come up at some point. And in certain situations, one person isn't interested in bringing children into their lives. That is exactly what happened with my wife and me. Eight years ago we started our journey together and the inevitable conversation made its way into…

23 Nov

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

stp2y | 0 Comments | Viral News

arXiv:2411.14257v1. Abstract: Hallucinations in large language models are a widespread problem, yet the mechanisms behind whether models will hallucinate are poorly understood, limiting our ability to solve this problem. Using sparse autoencoders as an interpretability tool, we discover that a key part of these mechanisms is entity recognition, where the model detects if an entity is one it can recall facts about. Sparse autoencoders uncover meaningful directions in the representation space; these detect whether the model recognizes an entity, e.g. detecting it doesn't know about an athlete or a movie. This suggests that models can have self-knowledge:…

23 Nov

Amazon doubles down on Anthropic, positioning itself as a key player in the AI arms race

stp2y | 0 Comments | Chat-GPT

The artificial intelligence arms race heated up Friday as Amazon announced an additional $4 billion investment in Anthropic, doubling its stake to $8 billion in a move that signals the cloud computing giant’s ambitious bid to compete with Microsoft and Google in the fast-evolving AI landscape. The deal, which maintains Amazon as a minority investor, establishes AWS as Anthropic’s primary cloud and training partner. Most significantly, it commits Anthropic to using Amazon’s custom-designed Trainium and Inferentia chips for training and deploying…

23 Nov

Schemato — An LLM for Netlist-to-Schematic Conversion

stp2y | 0 Comments | Viral News

arXiv:2411.13899v1. Abstract: Machine learning models are advancing circuit design, particularly in analog circuits. They typically generate netlists that lack human interpretability. This is a problem as human designers heavily rely on the interpretability of circuit diagrams or schematics to intuitively understand, troubleshoot, and develop designs. Hence, to integrate domain knowledge effectively, it is crucial to translate ML-generated netlists into interpretable schematics quickly and accurately. We propose Schemato, a large language model (LLM) for netlist-to-schematic conversion. In particular, we consider our approach in the two settings of converting netlists to .asc files for LTSpice and LaTeX files for…

23 Nov

Extending Video Masked Autoencoders to 128 frames

stp2y | 0 Comments | Viral News

[Submitted on 20 Nov 2024] Authors: Nitesh Bharadwaj Gundavarapu, Luke Friedman, Raghav Goyal, Chaitra Hegde, Eirikur Agustsson, Sagar M. Waghmare, Mikhail Sirotenko, Ming-Hsuan Yang, Tobias Weyand, Boqing Gong, Leonid Sigal. Abstract: Video understanding has witnessed significant progress with recent video foundation models demonstrating strong performance owing to self-supervised pre-training objectives; Masked Autoencoders (MAE) being the design of choice. Nevertheless, the majority of prior works that leverage MAE pre-training have focused on relatively short video representations (16…