stp2y

32916 Posts
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference

arXiv:2408.10284v1. Abstract: Mixture-of-Experts (MoE) models are designed to enhance the efficiency of large language models (LLMs) without proportionally increasing the computational demands. However, their deployment on edge devices still faces significant challenges due to high on-demand loading overheads from managing sparsely activated experts. This paper introduces AdapMoE, an algorithm-system co-design framework for efficient MoE inference. AdapMoE features adaptive expert gating and management to reduce the on-demand loading overheads. We observe the heterogeneity of experts loading across layers and tokens, based on which we propose a sensitivity-based strategy to adjust the number of activated experts dynamically. Meanwhile, we…

The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks

arXiv:2408.10446v1. Abstract: The rapid advancement of text-to-image generation systems, exemplified by models like Stable Diffusion, Midjourney, Imagen, and DALL-E, has heightened concerns about their potential misuse. In response, companies like Meta and Google have intensified their efforts to implement watermarking techniques on AI-generated images to curb the circulation of potentially misleading visuals. However, in this paper, we argue that current image watermarking methods are fragile and susceptible to being circumvented through visual paraphrase attacks. The proposed visual paraphraser operates in two steps. First, it generates a caption for the given image using KOSMOS-2, one of the latest…

Putting People in LLMs’ Shoes: Generating Better Answers via Question Rewriter

arXiv:2408.10573v1. Abstract: Large Language Models (LLMs) have demonstrated significant capabilities, particularly in the domain of question answering (QA). However, their effectiveness in QA is often undermined by the vagueness of user questions. To address this issue, we introduce single-round instance-level prompt optimization, referred to as question rewriter. By enhancing the intelligibility of human questions for black-box LLMs, our question rewriter improves the quality of generated answers. The rewriter is optimized using direct preference optimization based on feedback collected from automatic criteria for evaluating generated answers; therefore, its training does not require costly human annotations. The experiments across…

Fine-tune Meta Llama 3.1 models for generative AI inference using Amazon SageMaker JumpStart | Amazon Web Services

Fine-tuning Meta Llama 3.1 models with Amazon SageMaker JumpStart enables developers to customize these publicly available foundation models (FMs). The Meta Llama 3.1 collection represents a significant advancement in the field of generative artificial intelligence (AI), offering a range of capabilities to create innovative applications. The Meta Llama 3.1 models come in various sizes, with 8 billion, 70 billion, and 405 billion parameters, catering to diverse project needs. What makes these models stand out is their ability to understand and generate text with impressive coherence and nuance. Supported by context lengths of up to 128,000 tokens, the Meta Llama 3.1…

Jamie Dimon’s name keeps being floated for a spot in the White House — by both parties

JPMorgan Chase CEO Jamie Dimon's name has been floated for a Cabinet role in the White House by former President Donald Trump — and now, according to a report, folks in Vice President Kamala Harris' orbit. In June, Donald Trump told Bloomberg in an interview that he would consider Dimon for a potential role as Treasury Secretary. Dimon also appears to be an option for Harris' cabinet. A source familiar with the matter told CNBC that among names floated, Harris' orbit has mentioned Dimon for the same role in conversations that took place during the Democratic National Convention this week in Chicago. A spokesperson…

RFK Jr. Says He Constantly Gets Fooled by Fake AI Content

AI misinformation, meet your target audience. Brain Worms: Conspiracy theorist and avowed brain worm haver Robert F. Kennedy Jr. says that he gets fooled by fake AI content "all the time," as quoted by Mother Jones. We believe him! Per Mother Jones, Kennedy — who, yes, is still running as a third-party candidate in the 2024 race — made the statements last week during a Zoom-powered campaign event titled "The AI Revolution: Live Conversation on the Perils and Promise of AI." During the panel — which was moderated by a guy who has falsely equated the COVID-19 vaccine with "gene therapy" and, as Mother Jones…

A new AI support chatbot is available for hacked YouTube channels

YouTube added a new AI assistant feature that allows users who have been hacked to recover their accounts and safeguard them from future invasions. An announcement for the new help feature appeared earlier today on Google’s support page for YouTube. The new “hacked channel assistant,” available on YouTube, will allow “eligible creators” a way to troubleshoot their accounts when they’ve been hacked. The feature can be accessed in the YouTube Help Center. The assistant will ask a series of questions to help affected users secure their Google login, undo anything the hacker may have done to their channel and secure their channel…

Putin ally’s machine-gun Cybertruck may look ‘super cool,’ but it’s ‘useless’ on the battlefield, military expert says

Chechen warlord Ramzan Kadyrov sparked social media buzz this week after posting a video of himself driving a Tesla Cybertruck affixed with what appeared to be a machine gun on top. But while the decked-out electric vehicle might look like something out of "Star Wars" — and may boast some features that could be useful in rugged terrain — the futuristic EV is likely to prove effectively useless on the actual battlefield, a military expert told Business Insider. "Where do you recharge this thing on the battlefield? There are no Tesla outlets on the front lines in the Donbas," said Mark Cancian,…

NoRA: Nested Low-Rank Adaptation for Efficient Fine-Tuning Large Models

arXiv:2408.10280v1. Abstract: In this paper, we introduce Nested Low-Rank Adaptation (NoRA), a novel approach to parameter-efficient fine-tuning that extends the capabilities of Low-Rank Adaptation (LoRA) techniques. Vanilla LoRA overlooks pre-trained weight inheritance and still requires fine-tuning numerous parameters. To address these issues, our NoRA adopts a dual-layer nested structure with Singular Value Decomposition (SVD), effectively leveraging original matrix knowledge while reducing tunable parameters. Specifically, NoRA freezes the outer LoRA weights and utilizes an inner LoRA design, providing enhanced control over model optimization. This approach allows the model to more precisely adapt to specific tasks while maintaining a…