AI

Efficient and cost-effective multi-tenant LoRA serving with Amazon SageMaker | Amazon Web Services

In the rapidly evolving landscape of artificial intelligence (AI), the rise of generative AI models has ushered in a new era of personalized and intelligent experiences. Organizations are increasingly using the power of these language models to drive innovation and enhance their services, from natural language processing to content generation and beyond. Using generative AI models in the enterprise environment, however, requires taming their intrinsic power and enhancing their skills to address specific customer needs. In cases where an out-of-the-box model is missing knowledge of domain- or organization-specific terminologies, a custom fine-tuned model, also called a domain-specific large language model…
Read More
A snapshot of bias, the human mind, and AI

Introducing bias & the human mind: The human mind is a landscape filled with curiosity and, at times, irrationality, motivation, confusion, and bias. The latter results in levels of complexity in how both the human and, more recently, the artificial slant affect artificial intelligence systems from concept to scale. Bias is something that in many cases appears unintentionally, whether in human decision-making or in the dataset, but its impact on output can be sizeable. With several cases over the years highlighting the social, political, technological, and environmental impact of bias, this piece will explore this important topic and…
Read More
Research Papers in January 2024

2023 was the year when the potential and complexity of Large Language Models (LLMs) grew rapidly. Looking at the open-source and research advancements in 2024, it seems we are entering a welcome phase of making models better (and smaller) without increasing their size. In this month's article, I am highlighting four recent papers consistent with this theme: 1. Weight averaging and model merging allow us to combine multiple LLMs into a single, better one without the typical drawbacks of traditional ensembles, such as increased resource requirements. 2. Proxy-tuning to boost the performance of an existing large LLM, using two small…
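To make the first idea concrete, here is a minimal sketch of uniform weight averaging between two checkpoints that share an architecture; the tiny network and the 0.5/0.5 weighting are illustrative stand-ins, not the setup from any of the highlighted papers:

```python
# Minimal sketch of weight averaging: merge two models of the same architecture
# into one by averaging their parameters element-wise. The tiny network is only
# a stand-in for a full LLM; in practice the two state dicts would come from
# separate fine-tuning runs of the same pretrained model.
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

model_a, model_b = make_model(), make_model()      # pretend: two fine-tuning runs
state_a, state_b = model_a.state_dict(), model_b.state_dict()

# Uniform average; both state dicts share identical keys and shapes.
merged = {k: 0.5 * state_a[k] + 0.5 * state_b[k] for k in state_a}

merged_model = make_model()
merged_model.load_state_dict(merged)
```

Unlike a traditional ensemble, the merged model has the same size and inference cost as a single member.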
Read More
Asymmetric Certified Robustness via Feature-Convex Neural Networks

Asymmetric Certified Robustness via Feature-Convex Neural Networks TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce feature-convex classifiers, which produce closed-form and deterministic certified radii on the order of milliseconds. Figure 1. Illustration of feature-convex classifiers and their certification for sensitive-class inputs. This architecture composes a Lipschitz-continuous feature map $\varphi$ with a learned convex function $g$. Since $g$ is convex, it is globally underapproximated by its tangent plane at $\varphi(x)$, yielding certified norm balls in the feature space. Lipschitzness of $\varphi$…
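As a rough sketch of how the tangent-plane bound yields a certificate, under simplifying assumptions not stated in the excerpt (the sensitive class is predicted exactly when $g(\varphi(x)) > 0$, $\varphi$ is $L$-Lipschitz from the input norm to the feature-space norm, and $\nabla g$ denotes any subgradient): convexity gives the global lower bound

$g(y) \ge g(\varphi(x)) + \nabla g(\varphi(x))^{\top} (y - \varphi(x))$ for all $y$,

and since an input perturbation $\delta$ moves the feature vector by at most $L\|\delta\|$, the classifier's output is certified to stay in the sensitive class whenever

$\|\delta\| < \dfrac{g(\varphi(x))}{L \, \|\nabla g(\varphi(x))\|_{*}}$,

where $\|\cdot\|_{*}$ is the dual of the feature-space norm. The radius is closed-form, which is why certification takes only milliseconds.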
Read More
Accelerate Mixtral 8x7B pre-training with expert parallelism on Amazon SageMaker | Amazon Web Services

Mixture of Experts (MoE) architectures for large language models (LLMs) have recently gained popularity due to their ability to increase model capacity and computational efficiency compared to fully dense models. By utilizing sparse expert subnetworks that process different subsets of tokens, MoE models can effectively increase the number of parameters while requiring less computation per token during training and inference. This enables more cost-effective training of larger models within fixed compute budgets compared to dense architectures. Despite their computational benefits, training and fine-tuning large MoE models efficiently present some challenges. MoE models can struggle with load balancing if the tokens…
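A minimal sketch of the sparse-routing idea, reduced to a single MoE feed-forward layer with top-k token routing; the dimensions, expert count, and plain softmax gate are illustrative and are not the Mixtral 8x7B configuration or the SageMaker expert-parallel training code:

```python
# Minimal sketch of a sparse MoE feed-forward layer with top-k routing:
# each token is processed only by the k experts its router scores highest.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)            # token-to-expert gate
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                       # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)                # routing probabilities
        weights, idx = gate.topk(self.top_k, dim=-1)            # keep top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize kept weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

tokens = torch.randn(8, 64)
print(MoELayer()(tokens).shape)                                 # torch.Size([8, 64])
```

Total parameters grow with the number of experts, but each token only pays the compute cost of its top-k experts, which is the efficiency the excerpt describes; the load-balancing issue arises when the router sends most tokens to the same few experts.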
Read More
Key role of AI accelerators in economic growth & social development

In today's rapidly evolving digital landscape, artificial intelligence (AI) stands as a transformative force with the potential to revolutionize industries, spur innovation, and drive economic growth. However, unlocking the full potential of AI requires overcoming significant computational challenges. This is where AI accelerators come into play. AI accelerators, specialized hardware designed to optimize AI workloads, play a crucial role in accelerating AI adoption, powering economic growth, and fostering social development. Acceleration of AI adoption: AI accelerators serve as catalysts for the widespread adoption of AI technologies across various sectors. By enhancing the speed and efficiency of AI computations, these specialized hardware solutions…
Read More
Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch

Low-rank adaptation (LoRA) is a machine learning technique that modifies a pretrained model (for example, an LLM or vision transformer) to better suit a specific, often smaller, dataset by adjusting only a small, low-rank subset of the model's parameters. This approach is important because it allows for efficient finetuning of large models on task-specific data, significantly reducing the computational cost and time required for finetuning. Last week, researchers proposed DoRA: Weight-Decomposed Low-Rank Adaptation, a new alternative to LoRA, which may outperform LoRA by a large margin. To understand how these methods work, we will implement both LoRA and DoRA in PyTorch from scratch…
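Before the full from-scratch walkthrough, here is a minimal sketch of the LoRA side: a linear layer whose pretrained weight stays frozen while only two low-rank factors are trained. The dimensions, rank, and initialization are illustrative choices, not the article's exact implementation:

```python
# Minimal sketch of a LoRA linear layer: the pretrained weight W is frozen and
# the trainable update is the low-rank product B @ A, scaled by alpha / rank.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)                  # freeze pretrained weight
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))  # zero init: no change at start
        self.scaling = alpha / rank

    def forward(self, x):
        # W x + (alpha / rank) * B A x  -- only A and B receive gradients
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

layer = LoRALinear(128, 64)
print(layer(torch.randn(4, 128)).shape)                         # torch.Size([4, 64])
```

Because only A and B (about 1,500 parameters here) are updated, gradients and optimizer state shrink accordingly, which is where the finetuning savings come from.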
Read More
How 20 Minutes empowers journalists and boosts audience engagement with generative AI on Amazon Bedrock | Amazon Web Services

This post is co-written with Aurélien Capdecomme and Bertrand d’Aure from 20 Minutes. With 19 million monthly readers, 20 Minutes is a major player in the French media landscape. The media organization delivers useful, relevant, and accessible information to an audience that consists primarily of young and active urban readers. Every month, nearly 8.3 million 25–49-year-olds choose 20 Minutes to stay informed. Established in 2002, 20 Minutes consistently reaches more than a third (39 percent) of the French population each month through print, web, and mobile platforms. As 20 Minutes’s technology team, we’re responsible for developing and operating the organization’s web and mobile…
Read More