Introducing SageMaker Core: A new object-oriented Python SDK for Amazon SageMaker

We’re excited to announce the release of SageMaker Core, a new Python SDK from Amazon SageMaker designed to offer an object-oriented approach for managing the machine learning (ML) lifecycle. The new SDK streamlines data processing, training, and inference, and it features resource chaining, intelligent defaults, and enhanced logging capabilities. With SageMaker Core, managing ML workloads on SageMaker becomes simpler and more efficient. The SageMaker Core SDK comes bundled as part of the SageMaker Python SDK version 2.231.0 and above.

In this post, we show how the SageMaker Core SDK simplifies the developer experience while providing APIs for seamlessly executing the various steps of a typical ML lifecycle. We also discuss the main benefits of using this SDK and share relevant resources to learn more about it.

Traditionally, developers have had two options when working with SageMaker: the AWS SDK for Python, also known as Boto3, or the SageMaker Python SDK. Although both provide comprehensive APIs for ML lifecycle management, they often rely on loosely typed constructs such as hard-coded constants and JSON dictionaries, mimicking a REST interface. For instance, to create a training job, Boto3 offers a create_training_job API, but retrieving job details requires a separate describe_training_job API.

When using Boto3, developers face the challenge of remembering and crafting lengthy JSON dictionaries while making sure every key is accurately placed. Let’s take a closer look at the create_training_job method from Boto3:

response = client.create_training_job(
    TrainingJobName="string",
    HyperParameters={
        "string": "string"
    },
    AlgorithmSpecification={
        # ...
    },
    RoleArn="string",
    InputDataConfig=[
        {
            # ...
        },
    ],
    OutputDataConfig={
        # ...
    },
    ResourceConfig={
        # ...
    },
    VpcConfig={
        # ...
    },
    # ... (not all arguments/fields are shown, for brevity)
)

If we observe carefully, arguments such as AlgorithmSpecification, InputDataConfig, OutputDataConfig, ResourceConfig, and VpcConfig all require verbose, deeply nested JSON dictionaries. With so many string keys in a long dictionary, it’s very easy to introduce a typo or omit a required key, and no type checking is possible: as far as the interpreter is concerned, everything is just a string.
Similarly, the SageMaker Python SDK requires us to create an estimator object and invoke its fit() method. Although these constructs work well, they aren’t intuitive to the developer experience: it’s hard for developers to map the concept of an estimator to something that can be used to train a model.
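
For reference, here is a minimal sketch of that classic estimator pattern (the image URI, role, and S3 paths are illustrative placeholders):

from sagemaker.estimator import Estimator

# Classic SageMaker Python SDK pattern: configure an Estimator, then call fit()
estimator = Estimator(
    image_uri="<training-image-uri>",  # placeholder
    role="<execution-role-arn>",       # placeholder
    instance_count=1,
    instance_type="ml.m5.xlarge",
)
estimator.fit({"training": "s3://my-bucket/training-data"})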

Introducing SageMaker Core SDK

The SageMaker Core SDK solves this problem by replacing those long dictionaries with object-oriented interfaces. Developers work with object-oriented abstractions, and SageMaker Core takes care of converting those objects to dictionaries and executing the actions on the developer’s behalf.

The following are the key features of SageMaker Core:

  • Object-oriented interface – It provides object-oriented classes for tasks such as processing, training, and deployment. Such an interface enforces stronger type checking, makes the code more maintainable, and promotes reusability, so developers benefit from the full feature set of object-oriented programming.
  • Resource chaining – Developers can seamlessly pass SageMaker resources as objects by supplying them as arguments to other resources. For example, you can create a model object and pass it as an argument when setting up the endpoint; with Boto3, you would instead supply ModelName as a string argument (see the sketch after this list).
  • Abstraction of low-level details – It automatically handles resource state transitions and polling logic, freeing developers from managing these intricacies and allowing them to focus on higher-value tasks.
  • Support for intelligent defaults – It supports SageMaker intelligent defaults, allowing developers to set default values for parameters such as AWS Identity and Access Management (IAM) roles and virtual private cloud (VPC) configurations. This streamlines the setup process, and the SageMaker Core API picks up the default settings automatically from the environment.
  • Auto code completion – It enhances the developer experience by offering real-time suggestions and completions in popular integrated development environments (IDEs), reducing the chance of syntax errors and speeding up the coding process.
  • Full parity with SageMaker APIs, including generative AI – It provides access to SageMaker capabilities, including generative AI, through the core SDK, so developers can seamlessly use SageMaker Core without worrying about feature parity with Boto3.
  • Comprehensive documentation and type hints – It provides robust, comprehensive documentation and type hints so developers can understand the functionality of the APIs and objects, write code faster, and make fewer errors.
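
As a quick illustration of resource chaining, the following sketch contrasts the two styles. The resource names are illustrative, and the exact SageMaker Core signatures may vary slightly by version:

import boto3
from sagemaker_core.resources import Model, EndpointConfig
from sagemaker_core.shapes import ProductionVariant

# Boto3: the model is referenced by a loosely typed string
boto3.client("sagemaker").create_endpoint_config(
    EndpointConfigName="my-endpoint-config",
    ProductionVariants=[{
        "VariantName": "variant-1",
        "ModelName": "my-model",  # just a string; a typo fails only at call time
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
    }],
)

# SageMaker Core: pass the Model resource object itself
model = Model.get(model_name="my-model")
endpoint_config = EndpointConfig.create(
    endpoint_config_name="my-endpoint-config",
    production_variants=[
        ProductionVariant(
            variant_name="variant-1",
            model_name=model,  # resource chaining: an object, not a string
            instance_type="ml.m5.xlarge",
            initial_instance_count=1,
        )
    ],
)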

For this walkthrough, we use a straightforward generative AI lifecycle involving data preparation, fine-tuning, and deployment of Meta’s Llama-3-8B LLM. We use the SageMaker Core SDK to execute all the steps.

Prerequisites

To get started with SageMaker Core, make sure Python 3.8 or greater is installed in the environment. There are two ways to get started with SageMaker Core:

  1. If not using SageMaker Python SDK, install the sagemaker-core SDK using the following code example.
    %pip install sagemaker-core

  2. If you’re already using the SageMaker Python SDK, upgrade it to version 2.231.0 or later. Any version at or above 2.231.0 has SageMaker Core preinstalled. The following code example shows the command for upgrading the SageMaker Python SDK.
    %pip install --upgrade "sagemaker>=2.231.0"

Solution walkthrough

To manage your ML workloads on SageMaker using SageMaker Core, use the steps in the following sections.

Data preparation

In this phase, you prepare the training and test data for the LLM using the publicly available Stanford Question Answering Dataset (SQuAD). The following code creates a ProcessingJob object using the static create method, specifying the script path, instance type, and instance count. Intelligent defaults fetch the SageMaker execution role, which further simplifies the developer experience. You don’t need to provide the input and output data locations because those are also supplied through intelligent defaults. For information on how to set up intelligent defaults, check out Configuring and using defaults with the SageMaker Python SDK.

from sagemaker_core.resources import ProcessingJob

# Initialize a ProcessingJob resource
processing_job = ProcessingJob.create(
    processing_job_name="llm-data-prep",
    script_path="s3://my-bucket/data-prep-script.py",
    role_arn=<<Execution Role ARN>>, # Intelligent default for execution role
    instance_type="ml.m5.xlarge",
    instance_count=1
)

# Wait for the ProcessingJob to complete
processing_job.wait()

Training

In this step, you take the pre-trained Llama-3-8B model and fine-tune it on the data prepared in the previous step. The following code snippet shows the training API. You create a TrainingJob object using the create method, specifying the training script, source directory, instance type, instance count, output path, and hyperparameters.

from sagemaker_core.resources import TrainingJob
from sagemaker_core.shapes import HyperParameters

# Initialize a TrainingJob resource
training_job = TrainingJob.create(
    training_job_name="llm-fine-tune",
    estimator_entry_point="train.py",
    source_dir="s3://my-bucket/training-code",
    instance_type="ml.g5.12xlarge",
    instance_count=1,
    output_path="s3://my-bucket/training-output",
    hyperparameters=HyperParameters(
        learning_rate=0.00001,
        batch_size=8,
        epochs=3
    ),
    role_arn=<<Execution Role ARN>>, # Intelligent default for execution role
    input_data=processing_job.output # Resource chaining
)

# Wait for the TrainingJob to complete
training_job.wait()

For hyperparameters, you create an object, instead of supplying a dictionary. Use resource chaining by passing the output of the ProcessingJob resource as the input data for the TrainingJob.

You also use intelligent defaults to get the SageMaker execution role. When the training job finishes, it produces a model artifact packaged as a tar.gz archive and stores it in the output_path provided in the preceding training API.
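
If you need the artifact location programmatically, you can read it off the resource object after completion. This is a minimal sketch; the refresh helper and the snake_case model_artifacts attribute are assumptions based on the DescribeTrainingJob response shape and may vary by sagemaker-core version:

# Refresh the resource to pull the latest state from SageMaker
training_job.refresh()

# S3 location of the packaged model artifact (model.tar.gz)
print(training_job.model_artifacts.s3_model_artifacts)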

Model creation and deployment

Deploying a model on a SageMaker endpoint consists of three steps:

  1. Create a SageMaker model object
  2. Create the endpoint configuration
  3. Create the endpoint

SageMaker Core provides an object-oriented interface for all three steps.

  1. Create a SageMaker model object

The following code snippet shows the model creation experience in SageMaker Core.

from sagemaker_core.shapes import ContainerDefinition
from sagemaker_core.resources import Model

# Create a Model resource
model = Model.create(
    model_name="llm-model",
    primary_container=ContainerDefinition(
        image="763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.29.0-tensorrtllm0.11.0-cu124",
        environment={"HF_MODEL_ID": "meta-llama/Meta-Llama-3-8B"}
    ),
    execution_role_arn=<<Execution Role ARN>>, # Intelligent default for execution role
    input_data=training_job.output # Resource chaining
)

Similar to the processing and training steps, you use a create method from the Model class. The container definition is now an object that specifies the large model inference (LMI) container image and the Hugging Face model ID. You can also observe resource chaining in action, where you pass the output of the TrainingJob as input data to the model.

  2. Create the endpoint configuration

Create the endpoint configuration. The following code snippet shows the experience in SageMaker Core.

from sagemaker_core.shapes import ProductionVariant
from sagemaker_core.resources import Model, EndpointConfig, Endpoint

# Create an EndpointConfig resource
endpoint_config = EndpointConfig.create(
    endpoint_config_name="llm-endpoint-config",
    production_variants=[
        ProductionVariant(
            variant_name="llm-variant",
            initial_instance_count=1,
            instance_type="ml.g5.12xlarge",
            model_name=model
        )
    ]
)

ProductionVariant is now an object in its own right.
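
Because these shapes are typed objects, many mistakes surface when the object is constructed instead of at API-call time. The following is a hypothetical illustration, assuming the shape classes validate field types the way typed data classes typically do:

from sagemaker_core.shapes import ProductionVariant

# A wrong type for initial_instance_count is rejected when the object is
# built, rather than surfacing as a service error much later
try:
    ProductionVariant(
        variant_name="llm-variant",
        initial_instance_count="one",  # wrong type on purpose
        instance_type="ml.g5.12xlarge",
        model_name="llm-model",
    )
except Exception as err:  # typically a validation error
    print(f"Caught invalid configuration early: {err}")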

  3. Create the endpoint

Create the endpoint using the following code snippet.

endpoint = Endpoint.create(
    endpoint_name="llm-endpoint",
    endpoint_config_name=endpoint_config,  # Pass the EndpointConfig object created above
)

This also uses resource chaining: instead of supplying just the endpoint_config_name string (as in Boto3), you pass the whole endpoint_config object.
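
Once the endpoint is in service, you can send requests to it. The following is a minimal sketch; wait_for_status and invoke are assumptions about the resource’s helper methods, and the exact signatures may differ across sagemaker-core versions:

# Block until the endpoint reaches the InService state (assumed helper)
endpoint.wait_for_status("InService")

# Send a JSON payload to the hosted Llama 3 model (assumed invoke helper)
response = endpoint.invoke(
    body=b'{"inputs": "What is Amazon SageMaker?"}',
    content_type="application/json",
)
print(response)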

As we have shown in these steps, SageMaker Core simplifies the development experience by providing an object-oriented interface for interacting with SageMaker resources. The use of intelligent defaults and resource chaining reduces the amount of boilerplate code and manual parameter specification, resulting in more readable and maintainable code.

Cleanup

Any endpoint created using the code in this post will incur charges. Shut down any unused endpoints by using the delete() method.
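
For example, a minimal cleanup sketch for the resources created in this walkthrough (assuming each resource class exposes a delete() method, as the post describes for endpoints):

# Delete the endpoint first, then its configuration and the model
endpoint.delete()
endpoint_config.delete()
model.delete()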

A note on existing SageMaker Python SDK

The SageMaker Python SDK will use SageMaker Core as its foundation and will benefit from the object-oriented interfaces created as part of SageMaker Core. Going forward, customers can choose to use this object-oriented approach within the SageMaker Python SDK.

Benefits

The SageMaker Core SDK offers several benefits:

  • Simplified development – By abstracting low-level details and providing intelligent defaults, developers can focus on building and deploying ML models without getting slowed down by repetitive tasks. It also relieves developers of the cognitive overhead of remembering long, complex, multilevel dictionaries; they can instead work in the object-oriented paradigm they are most comfortable with.
  • Increased productivity – Features like automatic code completion and type hints help developers write code faster and with fewer errors.
  • Enhanced readability – Dedicated resource classes and resource chaining result in more readable and maintainable code.
  • Lightweight integration with AWS Lambda – Because this SDK is lightweight (about 8 MB when unzipped), it is straightforward to build an AWS Lambda layer for SageMaker Core and use it for executing various steps in the ML lifecycle through Lambda functions.

Conclusion

SageMaker Core is a powerful addition to Amazon SageMaker, providing a streamlined and efficient development experience for ML practitioners. With its object-oriented interface, resource chaining, and intelligent defaults, SageMaker Core empowers developers to focus on building and deploying ML models without getting slowed down by complex orchestration of JSON structures. To get started with SageMaker Core today, check out the SageMaker Core documentation.


About the authors

Vikesh Pandey is a Principal GenAI/ML Specialist Solutions Architect at AWS, helping customers in the financial services industry design, build, and scale their GenAI/ML workloads on AWS. He has more than a decade and a half of experience across the ML and software engineering stack. Outside of work, Vikesh enjoys trying out different cuisines and playing outdoor sports.

Shweta Singh is a Senior Product Manager on the Amazon SageMaker Machine Learning (ML) platform team at AWS, leading the SageMaker Python SDK. She has worked in several product roles at Amazon for over 5 years. She has a Bachelor of Science degree in Computer Engineering and a Master of Science in Financial Engineering, both from New York University.


