AWS unveiled a slew of new updates to its AI tools during its re:Invent conference today, including enhancements to its SageMaker HyperPod AI model training environment, as well as to Bedrock, its environment for building generative AI applications using foundation models.
The GenAI revolution officially entered its third year during re:Invent 2024, which drew 65,000 AWS customers, vendors, and press to venues across much of the Las Vegas Strip. OpenAI ignited the GenAI firestorm with the launch of ChatGPT on November 30, 2022, and it's been raging ever since.
AWS has already brought many GenAI capabilities to its cloud, and the rollout continued this week. The company unveiled several enhancements to SageMaker HyperPod, which it first launched a year ago to speed the training of foundation models.
Different AI teams have different training needs. Some teams may need a large amount of accelerated compute for a short amount of time, while others may need smaller amounts over a longer period. With the new task governance capability unveiled today, AI development teams can create flexible training plans that SageMaker HyperPod will then execute using EC2 capacity blocks.
The new capability will dynamically allocate workloads to enable customers to get more useful work out of their large clusters at certain times, such as when data scientists and AI engineers go to sleep, said Rahul Pathak, VP of data and AI at AWS. "Normally you don't want these expensive systems sitting idle," he said during a press briefing at re:Invent on Tuesday.
AWS built task governance for itself to improve compute utilization, and decided to make it available to customers, Pathak said. The capability can drive compute utilization up to 90%, he said.
The company also unveiled new "recipes" that help customers start training popular models, such as Llama or Mistral, more quickly. AWS now offers more than 30 curated model training recipes.
It’s easier to switch SageMaker HyperPod to use different processor types, such as Nvidia GPUs or AWS’s own Trainium chips, thanks to the new flexible training plans that AWS unveiled today.
“In a few clicks, customers can specify their budget, desired completion date, and maximum amount of compute resources they need,” AWS said in a press release. “SageMaker HyperPod then automatically reserves capacity, sets up clusters, and creates model training jobs, saving teams weeks of model training time.”
AWS also made various announcements for Bedrock, the collection of tools it launched in April 2023 for building generative AI applications using its own pre-trained foundation models, such as Titan, as well as third-party models from AI21 Labs, Anthropic, and Stability AI, among others.
Bedrock customers can use the new Nova family of models that AWS announced on Tuesday, including Nova Micro, Nova Lite, Nova Pro, Nova Premier, Nova Canvas, and Nova Reel. Customers can also use foundation models from Poolside, Stability AI, and Luma AI, and dozens more via Bedrock Marketplace, which AWS also launched today. AWS says Bedrock Marketplace currently has more than 100 models.
AI prompts can be repetitive. To help save customers money when submitting the same prompt over and over, AWS unveiled a new Bedrock feature called prompt caching. According to Pathak, by automatically caching repetitive prompts, AWS can not only reduce costs by up to 90% for Bedrock users but also cut latency by up to 85%.
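The idea behind prompt caching can be shown with a toy sketch: if an identical prompt has been answered before, skip the expensive model call. Bedrock's caching happens server-side and is more sophisticated (it can reuse prompt prefixes, not just whole prompts); `call_model` here is a hypothetical stand-in for a real model invocation.

```python
import hashlib

class PromptCache:
    """Toy client-side cache illustrating why repeated prompts are cheap:
    a cache hit avoids the costly, slow model invocation entirely."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def invoke(self, prompt: str, call_model) -> str:
        k = self._key(prompt)
        if k in self.store:            # repeated prompt: served from cache
            self.hits += 1
            return self.store[k]
        self.misses += 1
        result = call_model(prompt)    # expensive model invocation
        self.store[k] = result
        return result

cache = PromptCache()
fake_model = lambda p: f"answer to: {p}"   # stand-in for a real LLM call
cache.invoke("Summarize Q3 earnings", fake_model)
cache.invoke("Summarize Q3 earnings", fake_model)  # second call is a hit
```

After the two calls above, the cache has one hit and one miss; only the first call paid for inference.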
AI models can be unpredictable; that’s the nature of probabilistic systems. To prevent some of the worst behaviors, AWS has supported guardrails on Bedrock, but only for language models. Today, it updated the guardrails to support multi-modal toxicity detection in images generated with Bedrock foundation models.
Bedrock Data Automation (BDA), another capability unveiled today, lets customers bring unstructured data, such as documents, images, and data held in tables, into their GenAI apps via Bedrock Knowledge Bases. The new Bedrock feature should make it easier for developers to build intelligent document processing, media analysis, and other multimodal data-centric automation solutions, AWS said.
"Getting that data into a form that it can be used … isn't straightforward," Pathak said. Bedrock Data Automation essentially is "LLM-powered ETL for unstructured data," he added. "It's really sophisticated and gives customers the ability to unlock the data for inference with a single API."
BDA is integrated with Bedrock Knowledge Bases, which should make it easier to incorporate the information from the multi-modal content for GenAI apps using retrieval-augmented generation (RAG) techniques.
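The RAG pattern mentioned above can be sketched in a few lines: score stored content against the query, then build an augmented prompt from the best match. This is an illustrative sketch, not the Bedrock Knowledge Bases API; real systems rank by vector-embedding similarity rather than the crude token overlap used here, and the documents are invented for the example.

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query and return the top k.
    A stand-in for the embedding-based retrieval a real RAG system uses."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

docs = [
    "SageMaker HyperPod speeds foundation model training.",
    "Bedrock Marketplace lists more than 100 models.",
    "Prompt caching reduces cost for repeated prompts.",
]
query = "How does prompt caching reduce cost?"
context = retrieve(query, docs)

# Retrieval-augmented generation: the retrieved passage is prepended to the
# question, grounding the model's answer in the customer's own data.
prompt = f"Context: {context[0]}\n\nQuestion: {query}"
```

The value of Knowledge Bases is doing this at scale, handling the chunking, embedding, and retrieval so the developer only supplies the data and the question.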
GenAI models are trained primarily on unstructured data, such as text and images. But customers have a ton of structured data stored in business applications, as well as data warehouses, data lakes, and lakehouses. To help customers utilize that information in their GenAI apps, AWS announced support for structured data in Bedrock Knowledge Bases.
AWS also announced GraphRAG support in Bedrock Knowledge Bases. GraphRAG is an increasingly popular approach to developing GenAI apps that uses a graph database to find the most contextually relevant data and feed it into a RAG workflow. AWS says GraphRAG helps to improve the quality of the output and reduce hallucinations even more than RAG by itself.
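The core idea of GraphRAG can be illustrated with a plain adjacency-list graph standing in for a graph database: starting from entities matched in the query, a bounded traversal collects related facts that keyword or embedding search alone might miss, and those facts then augment the RAG prompt. The node names below are invented for this example.

```python
from collections import deque

# Toy knowledge graph: each node maps to the entities it is related to.
# A real GraphRAG system would query a graph database instead.
graph = {
    "Bedrock": ["Knowledge Bases", "Guardrails"],
    "Knowledge Bases": ["GraphRAG", "structured data"],
    "GraphRAG": [],
    "Guardrails": [],
    "structured data": [],
}

def expand(seeds: list[str], hops: int = 2) -> set[str]:
    """Breadth-first search: collect every entity reachable from the
    seed entities within `hops` edges."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

# A query mentioning "Bedrock" pulls in two-hop context like "GraphRAG"
# and "structured data", which is then fed into the RAG prompt.
context = expand(["Bedrock"])
```

This graph expansion is what lets GraphRAG surface connected-but-not-literally-matching context, which is the mechanism behind the quality and hallucination claims.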
Related Items:
AWS Takes On Google Spanner with Atomic Clock-Powered Distributed DBs
AWS Unveils Hosted Apache Iceberg Service on S3, New Metadata Management Layer
New AWS Service Lets Businesses Upload Data to Cloud From Secure Terminals