stp2y

33170 Posts
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts

arXiv:2407.09590v1. Abstract: By increasing model parameters but activating them sparsely when performing a task, the Mixture-of-Experts (MoE) architecture significantly improves the performance of Large Language Models (LLMs) without increasing the inference cost. However, the memory consumption due to the growing number of experts presents a challenge to the deployment of these models in many real-world settings. Our empirical study reveals that some experts encode redundant knowledge during pre-training. We thus propose a method of grouping and pruning similar experts to improve the model's parameter efficiency. We validate the effectiveness of our method by pruning two…
Comparing Embedded Systems and Desktop Systems

Embedded systems and desktop systems, though both integral parts of our modern technological landscape, serve vastly different purposes and operate under distinct principles. This blog post delves into the differences in non-volatile memory usage, overall system design, and the unique advantages of various embedded system architectures.

Non-Volatile Memory Differences

Non-volatile memory in embedded systems, such as Flash memory, is used to store firmware and application code that must be retained even when the system is powered off. This type of memory is essential for embedded systems due to their specific, constrained environments that require reliability and longevity. In contrast, desktop…
Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI

In response to the suits, defendants such as Meta, OpenAI, and Bloomberg have argued that their actions constitute fair use. A case against EleutherAI, which originally scraped the books and made them public, was voluntarily dismissed by the plaintiffs. Litigation in remaining cases remains in the early stages, leaving the questions surrounding permission and payment unresolved. The Pile has since been removed from its official download site, but it’s still available on file-sharing services. “Technology companies have run roughshod,” said Amy Keller, a consumer protection attorney and partner at the firm DiCello Levitt who has brought lawsuits on behalf of creatives whose…
Latin American Startup Funding Rebounds After Deep Slide

Funding to Latin American startups rose in the second quarter, led by a resurgence in late-stage dealmaking. Altogether, companies in South and Central America raised $791 million in reported seed- through growth-stage financing in Q2 of 2024, per Crunchbase data. That’s a rise of 25% from the prior quarter and 17% from a year ago. Recent gains follow an unusually weak Q1, in which funding to the region hit a multi-year low. And while the latest numbers look stronger, we’re still far below the heights reached during a record-breaking run in 2021, as charted below: Meanwhile, reported deal counts were…
Windows Updates will be smaller after the release of Windows 11 version 24H2 – gHacks Tech News

Microsoft announced a fundamental change to Windows Updates. The built-in service delivers updates to millions of Windows devices throughout the world. Starting with the release of this year's feature update, most Windows updates will become considerably smaller. To better understand the change, it is important to look at the current situation. Cumulative updates for Windows 11 include all the changes since the last RTM release of the operating system. This means that they contain fixes and changes that may have been installed already on a device. Good to know: the planned change applies to Windows 11 version 24H2 and Windows…
Seq-to-Final: A Benchmark for Tuning from Sequential Distributions to a Final Time Point

arXiv:2407.09642v1. Abstract: Distribution shift over time occurs in many settings. Leveraging historical data is necessary to learn a model for the last time point when limited data is available in the final period, yet few methods have been developed specifically for this purpose. In this work, we construct a benchmark with different sequences of synthetic shifts to evaluate the effectiveness of 3 classes of methods that 1) learn from all data without adapting to the final period, 2) learn from historical data with no regard to the sequential nature and then adapt to the final period, and…

Fujitsu, Cohere Launch Enterprise-Focused Generative AI Partnership

The companies will develop an innovative Japanese LLM for private cloud usage through Fujitsu Kozuchi AI services.

Today, Fujitsu announced a strategic partnership with Cohere Inc., a security and data privacy-focused enterprise AI company headquartered in Toronto and San Francisco. The strategic partnership will focus on developing and providing large language models (LLMs) that will enable enterprises to leverage industry-leading Japanese language capabilities that deliver improved experiences for customers and employees. In addition, Fujitsu has made a significant investment and entered into a strategic partnership between the two companies. As part of the partnership, Fujitsu will become the exclusive provider of jointly developed services on the…