Using task-specific models from AI21 Labs on AWS | Amazon Web Services

In this blog post, we will show you how to leverage AI21 Labs’ Task-Specific Models (TSMs) on AWS to enhance your business operations. You will learn the steps to subscribe to AI21 Labs in the AWS Marketplace, set up a domain in Amazon SageMaker, and utilize AI21 TSMs via SageMaker JumpStart.

AI21 Labs is a foundation model (FM) provider focusing on building state-of-the-art language models. AI21 Task Specific Models (TSMs) are built for answering questions, summarization, condensing lengthy texts, and so on. AI21 TSMs are available in Amazon SageMaker Jumpstart.

Here are the AI21 TSMs that can be accessed and customized in SageMaker JumpStart: AI21 Contextual Answers, AI21 Summarize, AI21 Paraphrase, and AI21 Grammatical Error Correction.

AI21 FMs (Jamba-Instruct, AI21 Jurassic-2 Ultra, AI21 Jurassic-2 Mid) are available in Amazon Bedrock and can be used for large language model (LLM) use cases. We used AI21 TSMs available in SageMaker Jumpstart for this post. SageMaker Jumpstart enables you to select, compare, and evaluate available AI21 TSMs.

AI21’s TSMs

Foundation models can solve many tasks, but not every task is unique. Some commercial tasks are common across many applications. AI21 Labs’ TSMs are specialized models built to solve a particular problem. They’re built to deliver out-of-box value, cost effectiveness, and higher accuracy for the common tasks behind many commercial use-cases. In this post, we will explore three of AI21 Labs’ TSMs and their unique capabilities.

Foundation models are built and trained on massive datasets to perform a variety of tasks. Unlike FMs, TSMs are trained to perform unique tasks.

When your use case is supported by a TSM, you quickly realize benefits such as improved refusal rates when you don’t want the model to provide answers unless they’re grounded in actual document content.

Paraphrase: This model is used to enhance content creation and communication by generating varied versions of text while maintaining a consistent tone and style. This model is ideal for creating multiple product descriptions, marketing materials, and customer support responses, improving clarity and engagement. It also simplifies complex documents, making information more accessible.
Summarize: This model is used to condense lengthy texts into concise summaries while preserving the original meaning. This model is particularly useful for processing large documents, such as financial reports, legal documents, and technical papers, making critical information more accessible and comprehensible.
Contextual answers: This model is used to significantly enhance information retrieval and customer support processes. This model excels at providing accurate and relevant answers based on specific document contexts, making it particularly useful in customer service, legal, finance, and educational sectors. It streamlines workflows by quickly accessing relevant information from extensive databases, reducing response times and improving customer satisfaction.

Prerequisites

To follow the steps in this post, you must have the following prerequisites in place:

AWS account setup

Completing the labs in this post requires an AWS account and SageMaker environments set up. If you don’t have an AWS account, see Complete your AWS registration for the steps to create one.

AWS Marketplace opt-in

AI21 TSMs can also be accessed through Amazon Marketplace for subscription. Using AWS Marketplace, you can subscribe to AI21 TSMs and deploy SageMaker endpoints.

To do these exercises you must subscribe to the following offerings in the AWS Marketplace

Service quota limits

To use some of the GPU’s required to run AI21’s task specific models, you must have the required service quota limits. You can request a service quota limit increase in the AWS Management Console. Limits are account and resource specific.

To create a service request, search for service quotas in the console search bar. Select the service to land go to the dashboard and enter the name of the GPU (for example, ml.g5.48xlarge). Ensure the quota is for endpoint usage

Estimated cost

The following is the estimated cost to walk through the solution in this post.

Contextual answers:

We used an ml.g5.48xlarge
- By default, AWS accounts don’t have access to this GPU. You must request a service quota limit increase (see the previous section: Service Quota Limits).
The notebook runtime was approximately 15 minutes.
The cost was $20.41 (billed on an hourly basis).

Summarize notebook

We used an ml.g4dn.12xlarge GPU.
- You must request a service quota limit increase (see the previous section: Service Quota Limits).
The notebook runtime was approximately 10 minutes.
The cost was $4.94 (billed on an hourly basis).

Paraphrase notebook

We used the ml.g4dn.12xlarge GPU.
- You must request a service quota limit increase (see the previous section: Service Quota Limits).
The notebook runtime approximately 10 minutes.
The cost was $4.94 (billed on an hourly basis).

Total cost: $30.29 (1 hour charge for each deployed endpoint)

Using AI21 models on AWS

Getting started

In this section, you will access AI21 TSMs in SageMaker Jumpstart. These interactive notebooks contain code to deploy TSM endpoints and will also provide example code blocks to run inference. These first few steps are pre-requisites to deploying the same notebooks. If you already have a SageMaker domain and username set up, you may skip to Step 7.

Use the search bar in the AWS Management Console to navigate to Amazon SageMaker , as shown in the following figure.

If you don’t already have one set up, you must create a SageMaker domain. A domain consists of an associated Amazon Elastic File System (Amazon EFS) volume; a list of authorized users, and a variety of security, application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations.

Users within a domain can share notebook files and other artifacts with each other. For more information, see Learn about Amazon SageMaker domain entities and statuses. For today’s exercises, you will use Quick Set-Up to deploy an environment.

Choose Create a SageMaker domain as shown in the following figure.
Select Quick setup. After you choose Set up the domain will begin creation
After a moment, your domain will be created.
Choose Add user.
You can keep the default user profile values.
Launch Studio by choosing Launch button and then selecting Studio.
Choose JumpStart in the navigation pane as shown in the following figure.

Here you can see the model providers for our JumpStart notebooks.

You will see the model providers for JumpStart notebooks.

Select AI21 Labs to see their available models.

Each of AI21’s models has an associated model card. A model card provides key information about the model such as its intended use cases, training, and evaluation details. For this example, you will use the Summarize, Paraphrase, and Contextual Answers TSMs.

Start with Contextual Answers. Select the AI21 Contextual Answers model card.

A sample notebook is included as part of the model. Jupyter Notebooks are a popular way to interact with code and LLMs.

Choose Notebooks to explore the notebook.
To run the notebook’s code blocks, choose Open in JupyterLab.
If you do not already have an existing space, choose Create new space and enter an appropriate name. When ready, choose Create space and open notebook.

It can take up to 5 minutes to open your notebook.
SageMaker Spaces are used to manage the storage and resource needs of some SageMaker Studio applications. Each space has a 1:1 relationship with an instance of an application.

After the notebook opens, you will be prompted to select a kernal. Ensure Python 3 is selected and choose Select.

Navigating the notebook exercises

Repeat the preceding process to import the remaining notebooks.

Each AI21 notebook demonstrates required code imports, version checks, model selection, endpoint creation, and inferences showcasing the TSM’s unique strengths through code blocks and example prompts

Each notebook will have a clean up step at the end to delete your deployed endpoints. It’s important to terminate any running endpoints to avoid additional costs.

Contextual Answers JumpStart Notebook

AWS customers and partners can use AI21 Labs’s Contextual Answers model to significantly enhance their information retrieval and customer support processes. This model excels at providing accurate and relevant answers based on specific context, making it useful in customer service, legal, finance, and educational sectors.

The following are code snippets from AI21’s Contextual Answers TSM through JumpStart. Notice that there is no prompt engineering required. The only input is the question and the context provided.

Input:

financial_context = """In 2020 and 2021, enormous QE — approximately $4.4 trillion, or 18%, of 2021 gross domestic product (GDP) — and enormous fiscal stimulus (which has been and always will be inflationary) — approximately $5 trillion, or 21%, of 2021 GDP — stabilized markets and allowed companies to raise enormous amounts of capital. In addition, this infusion of capital saved many small businesses and put more than $2.5 trillion in the hands of consumers and almost $1 trillion into state and local coffers. These actions led to a rapid decline in unemployment, dropping from 15% to under 4% in 20 months — the magnitude and speed of which were both unprecedented. Additionally, the economy grew 7% in 2021 despite the arrival of the Delta and Omicron variants and the global supply chain shortages, which were largely fueled by the dramatic upswing in consumer spending and the shift in that spend from services to goods. Fortunately, during these two years, vaccines for COVID-19 were also rapidly developed and distributed.
In today's economy, the consumer is in excellent financial shape (on average), with leverage among the lowest on record, excellent mortgage underwriting (even though we've had home price appreciation), plentiful jobs with wage increases and more than $2 trillion in excess savings, mostly due to government stimulus. Most consumers and companies (and states) are still flush with the money generated in 2020 and 2021, with consumer spending over the last several months 12% above pre-COVID-19 levels. (But we must recognize that the account balances in lower-income households, smaller to begin with, are going down faster and that income for those households is not keeping pace with rising inflation.)
Today's economic landscape is completely different from the 2008 financial crisis when the consumer was extraordinarily overleveraged, as was the financial system as a whole — from banks and investment banks to shadow banks, hedge funds, private equity, Fannie Mae and many other entities. In addition, home price appreciation, fed by bad underwriting and leverage in the mortgage system, led to excessive speculation, which was missed by virtually everyone — eventually leading to nearly $1 trillion in actual losses.
"""
question = "Did the economy shrink after the Omicron variant arrived?"
response = client.answer.create(
    context=financial_context,
    question=question,
)

print(response.answer)

Output:

No, the economy did not shrink after the Omicron variant arrived. In fact, the economy grew 7% in 2021, despite the arrival of the Delta and Omicron variants and the global supply chain shortages, which were largely fueled by the dramatic upswing in consumer spending and the shift in that spend from services to goods.

As mentioned in our introduction, AI21’s Contextual Answers model does not provide answers to questions outside of the context provided. If the prompt includes a question unrelated to 2020/2021 economy, you will get a response as shown in the following example.

Input:

irrelevant_question = "How did COVID-19 affect the financial crisis of 2008?"

response = client.answer.create(
context=financial_context,
question=irrelevant_question,
)

print(response.answer)

Output:

None

When finished, you can delete your deployed endpoint by running the final two cells of the notebook.



model.sagemaker_session.delete_endpoint(endpoint_name)
model.sagemaker_session.delete_endpoint_config(endpoint_name)
model.delete_model()

You can import the other notebooks by navigating to SageMaker JumpStart and repeating the same process you used to import this first notebook.
Summarize JumpStart Notebook
AWS customers and partners can uses AI21 Labs’ Summarize model to condense lengthy texts into concise summaries while preserving the original meaning. This model is particularly useful for processing large documents, such as financial reports, legal documents, and technical papers, making critical information more accessible and comprehensible.
The following are highlight code snippets from AI21’s Summarize TSM using JumpStart. Notice that the  input must include the full text that the user wants to summarize.
Input:

text = """The error affected a number of international flights leaving the terminal on Wednesday, with some airlines urging passengers to travel only with hand luggage.
Virgin Atlantic said all airlines flying out of the terminal had been affected.
Passengers have been warned it may be days before they are reunited with luggage.
An airport spokesperson apologised and said the fault had now been fixed.
Virgin Atlantic said it would ensure all bags were sent out as soon as possible.
It added customers should retain receipts for anything they had bought and make a claim to be reimbursed.
Passengers, who were informed by e-mail of the problem, took to social media to vent their frustrations.
One branded the situation "ludicrous" and said he was only told 12 hours before his flight.
The airport said it could not confirm what the problem was, what had caused it or how many people had been affected."""

response = client.summarize.create(
    source=text,
    source_type=DocumentType.TEXT,
)

print("Original text:")
print(text)
print("================")
print("Summary:")
print(response.summary)

Output:


Original text:
The error affected a number of international flights leaving the terminal on Wednesday, with some airlines urging passengers to travel only with hand luggage.
Virgin Atlantic said all airlines flying out of the terminal had been affected.
Passengers have been warned it may be days before they are reunited with luggage.
An airport spokesperson apologised and said the fault had now been fixed.
Virgin Atlantic said it would ensure all bags were sent out as soon as possible.
It added customers should retain receipts for anything they had bought and make a claim to be reimbursed.
Passengers, who were informed by e-mail of the problem, took to social media to vent their frustrations.
One branded the situation "ludicrous" and said he was only told 12 hours before his flight.
The airport said it could not confirm what the problem was, what had caused it or how many people had been affected.
================
Summary:
A number of international flights leaving the terminal were affected by the error on Wednesday, with some airlines urging passengers to travel only with hand luggage. Passengers were warned it may be days before they are reunited with their luggage.


Paraphrase JumpStart Notebook
AWS customers and partners can use AI21 Labs’s Paraphrase TSM through JumpStart to enhance content creation and communication by generating varied versions of text.
The following are highlight code snippets from AI21’s Paraphrase TSM using JumpStart. Notice that there is no extensive prompt engineering required. The only input required is the full text that the user wants to paraphrase and a chosen style, for example casual, formal, and so on.
Input:

text = "Throughout this page, we will explore the advantages and features of the Paraphrase model."

response = client.paraphrase.create(
text=text,
style="formal"

)

print(response.suggestions) Output: 

[Suggestion(text="We will examine the advantages and features of the Paraphrase model throughout this page."), Suggestion(text="The purpose of this page is to examine the advantages and features of the Paraphrase model."), Suggestion(text="On this page, we will discuss the advantages and features of the Paraphrase model."), Suggestion(text="This page will provide an overview of the advantages and features of the Paraphrase model."), Suggestion(text="In this article, we will examine the advantages and features of the Paraphrase model."), Suggestion(text="Here we will explore the advantages and features of the Paraphrase model."), Suggestion(text="The purpose of this page is to describe the advantages and features of the Paraphrase model."), Suggestion(text="In this page, we will examine the advantages and features of the Paraphrase model."), Suggestion(text="The Paraphrase model will be reviewed on this page with an emphasis on its advantages and features."), Suggestion(text="Our goal on this page will be to explore the benefits and features of the Paraphrase model.")]


Input:

print("Original sentence:")
print(text)
print("============================")
print("Suggestions:")
print("n".join(["- " + x.text for x in response.suggestions]))

Output:

Original sentence:
Throughout this page, we will explore the advantages and features of the Paraphrase model.
============================
Suggestions:
- We will examine the advantages and features of the Paraphrase model throughout this page.
- The purpose of this page is to examine the advantages and features of the Paraphrase model.
- On this page, we will discuss the advantages and features of the Paraphrase model.
- This page will provide an overview of the advantages and features of the Paraphrase model.
- In this article, we will examine the advantages and features of the Paraphrase model.
- Here we will explore the advantages and features of the Paraphrase model.
- The purpose of this page is to describe the advantages and features of the Paraphrase model.
- In this page, we will examine the advantages and features of the Paraphrase model.
- The Paraphrase model will be reviewed on this page with an emphasis on its advantages and features.
- Our goal on this page will be to explore the benefits and features of the Paraphrase model.

Less prompt engineering
A key advantage of AI21’s task-specific models is the reduced need for complex prompt engineering compared to foundation models. Let’s consider how you might approach a summarization task using a foundation model compared to using AI21’s specialized Summarize TSM.
For a foundation model, you might need to craft an elaborate prompt template with detailed instructions:

python prompt_template = "You are a highly capable summarization assistant. Concisely summarize the given text while preserving key details and overall meaning. Use clear language tailored for human readers.nnText: 

[INPUT_TEXT]nnSummary:" ``` To summarize text with this foundation model, you'd populate the template and pass the full prompt: ```python input_text = "Insert text to summarize here..." prompt = prompt_template.replace("[INPUT_TEXT]", input_text) summary = model(prompt)

 In contrast, using AI21's Summarize TSM is more straightforward:


python input_text = "Insert text to summarize here..." summary = summarize_model(input_text)

That’s it! With the Summarize TSM, you pass the input text directly to the model; there’s no need for an intricate prompt template.
Lower cost and higher accuracy
By using TSMs, you can achieve lower costs and higher accuracy. As demonstrated previously in the Contextual Notebook, TSMs have a higher refusal rate than most mainstream models, which can lead to higher accuracy. This characteristic of TSMs is beneficial in use cases where wrong answers are less acceptable.
Conclusion
In this post, we explored AI21 Labs’s approach to generative AI using task-specific models (TSMs). Through guided exercises, you walked through the process of setting up a SageMaker domain and importing sample JumpStart Notebooks to experiment with AI21’s TSMs, including Contextual Answers, Paraphrase, and Summarize.
Throughout the exercises, you saw the potential benefits of task-specific models compared to foundation models. When asking questions outside the context of the intended use case, the AI21 TSMs refused to answer, making them less prone to hallucinating or generating nonsensical outputs beyond their intended domain—a critical factor for applications that require precision and safety. Lastly, we highlighted how task-specific models are designed from the outset to excel at specific tasks, streamlining development and reducing the need for extensive prompt engineering and fine-tuning, which can them a more cost-effective solution.
Whether you’re a data scientist, machine learning practitioner, or someone curious about AI advancements, we hope this post has provided valuable insights into the advantages of AI21 Labs’s task-specific approach. As the field of generative AI continues to evolve rapidly, we encourage you to stay curious, experiment with various approaches, and ultimately choose the one that best aligns with your project’s unique requirements and goals. Visit AWS GitHub for other example use cases and codes to experiment in your own environment.
Additional resources

About the Authors
Joe Wilson is a Solutions Architect at Amazon Web Services supporting nonprofit organizations. He has core competencies in data analytics, AI/ML and GenAI. Joe background is in data science and international development. He is passionate about leveraging data and technology for social good.
Pat Wilson is a Solutions Architect at Amazon Web Services with a focus on AI/ML workloads and security. He currently supports Federal Partners. Outside of work Pat enjoys learning, working out, and spending time with family/friends.
Josh Famestad is a Solutions Architect at Amazon Web Services. Josh works with public sector customers to build and execute cloud based approaches to deliver on business priorities.