This repository was archived by the owner on Nov 20, 2025. It is now read-only.


SageMaker Inference Toolkit


Serve machine learning models within a Docker container using Amazon SageMaker.

📚 Background

Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models.

Once you have a trained model, you can include it in a Docker container that runs your inference code. A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the container is deployed. Containerizing your model and code enables fast and reliable deployment of your model.

The SageMaker Inference Toolkit implements a model serving stack and can be easily added to any Docker container, making it deployable to SageMaker. This library's serving stack is built on Multi Model Server, and it can serve your own models or those you trained on SageMaker using machine learning frameworks with native SageMaker support. If you use a prebuilt SageMaker Docker image for inference, this library may already be included.

For more information, see the Amazon SageMaker Developer Guide sections on building your own container with Multi Model Server and using your own models.

🛠️ Installation

To install this library in your Docker image, add the following line to your Dockerfile:

```dockerfile
RUN pip3 install multi-model-server sagemaker-inference
```

Here is an example of a Dockerfile that installs SageMaker Inference Toolkit.
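
The linked example is not reproduced here. For rough orientation only, a minimal Dockerfile might look like the sketch below; the base image, the Java package (Multi Model Server requires a Java runtime), and the `entrypoint.py` path are assumptions rather than the official example.

```dockerfile
# Minimal sketch only -- the base image, Java package, and paths are assumptions.
FROM ubuntu:20.04

# Multi Model Server requires a Java runtime; the toolkit requires Python 3.
RUN apt-get update && \
    apt-get install -y --no-install-recommends openjdk-8-jdk-headless python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Install the model server and the SageMaker Inference Toolkit.
RUN pip3 install multi-model-server sagemaker-inference

# Copy the serving entrypoint described in the Usage section below (hypothetical file).
COPY entrypoint.py /usr/local/bin/entrypoint.py

ENTRYPOINT ["python3", "/usr/local/bin/entrypoint.py"]
```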

💻 Usage

Implementation Steps

To use the SageMaker Inference Toolkit, you need to do the following:

  1. Implement an inference handler, which is responsible for loading the model and providing input, predict, and output functions. (Here is an example of an inference handler. A sketch of the user-supplied model_fn that this handler expects appears after this list.)

     ```python
     import textwrap

     from sagemaker_inference import content_types, decoder, default_inference_handler, encoder, errors


     class DefaultPytorchInferenceHandler(default_inference_handler.DefaultInferenceHandler):

         def default_model_fn(self, model_dir, context=None):
             """Loads a model. For PyTorch, a default function to load a model cannot be provided.
             Users should provide a customized model_fn() in the inference script.

             Args:
                 model_dir: a directory where the model is saved.
                 context (obj): the request context (default: None).

             Returns: A PyTorch model.
             """
             raise NotImplementedError(textwrap.dedent("""
             Please provide a model_fn implementation.
             See documentation for model_fn at https://github.com/aws/sagemaker-python-sdk
             """))

         def default_input_fn(self, input_data, content_type, context=None):
             """A default input_fn that can handle JSON, CSV and NPZ formats.

             Args:
                 input_data: the request payload serialized in the content_type format
                 content_type: the request content_type
                 context (obj): the request context (default: None).

             Returns: input_data deserialized into torch.FloatTensor or torch.cuda.FloatTensor,
             depending on whether CUDA is available.
             """
             return decoder.decode(input_data, content_type)

         def default_predict_fn(self, data, model, context=None):
             """A default predict_fn for PyTorch. Calls a model on data deserialized in input_fn.
             Runs prediction on GPU if CUDA is available.

             Args:
                 data: input data (torch.Tensor) for prediction deserialized by input_fn
                 model: PyTorch model loaded in memory by model_fn
                 context (obj): the request context (default: None).

             Returns: a prediction
             """
             return model(data)

         def default_output_fn(self, prediction, accept, context=None):
             """A default output_fn for PyTorch. Serializes predictions from predict_fn to JSON, CSV or NPY format.

             Args:
                 prediction: a prediction result from predict_fn
                 accept: the content type that the output data needs to be serialized into
                 context (obj): the request context (default: None).

             Returns: output data serialized
             """
             return encoder.encode(prediction, accept)
     ```

     Note that passing `context` as an argument to the handler functions is optional. You can omit `context` from a function's declaration if it is not needed at runtime. For example, the following handler function declarations will also work:

     ```python
     def default_model_fn(self, model_dir)
     def default_input_fn(self, input_data, content_type)
     def default_predict_fn(self, data, model)
     def default_output_fn(self, prediction, accept)
     ```
  2. Implement a handler service that is executed by the model server. (Here is an example of a handler service.) For more information on how to define your HANDLER_SERVICE file, see the MMS custom service documentation.

     ```python
     from sagemaker_inference.default_handler_service import DefaultHandlerService
     from sagemaker_inference.transformer import Transformer

     from sagemaker_pytorch_serving_container.default_inference_handler import DefaultPytorchInferenceHandler


     class HandlerService(DefaultHandlerService):
         """Handler service that is executed by the model server.

         Determines specific default inference handlers to use based on the model being used.

         This class extends ``DefaultHandlerService``, which defines the following:
             - The ``handle`` method is invoked for all incoming inference requests to the model server.
             - The ``initialize`` method is invoked at model server start up.

         Based on: https://github.com/awslabs/multi-model-server/blob/master/docs/custom_service.md
         """

         def __init__(self):
             transformer = Transformer(default_inference_handler=DefaultPytorchInferenceHandler())
             super(HandlerService, self).__init__(transformer=transformer)
     ```
  3. Implement a serving entrypoint, which starts the model server. (Here is an example of a serving entrypoint; a fuller sketch of an entrypoint script appears after this list.)

     ```python
     from sagemaker_inference import model_server

     model_server.start_model_server(handler_service=HANDLER_SERVICE)
     ```
  4. Define the location of the entrypoint in your Dockerfile.

     ```dockerfile
     ENTRYPOINT ["python", "/usr/local/bin/entrypoint.py"]
     ```
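
As noted in step 1, the default PyTorch handler deliberately raises NotImplementedError for model_fn, so a user-supplied inference script must provide one. The sketch below is illustrative only: the TorchScript format, the `model.pt` file name, and CPU-only loading are assumptions, not requirements of the toolkit.

```python
# Illustrative sketch of a user-supplied model_fn.
# Assumes a TorchScript model saved as model.pt inside model_dir; the file name
# and serialization format are assumptions.
import os

import torch


def model_fn(model_dir, context=None):
    """Load and return the model object that predict_fn will receive."""
    model = torch.jit.load(os.path.join(model_dir, "model.pt"), map_location="cpu")
    model.eval()
    return model
```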
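
For step 3, HANDLER_SERVICE is the importable module path of the handler service implemented in step 2. A complete entrypoint script might look like the sketch below; the module name `my_container.handler_service` and the file name `entrypoint.py` are illustrative assumptions.

```python
# entrypoint.py -- illustrative sketch of a serving entrypoint.
# The module path below is an assumption; point it at the module that defines
# the HandlerService class from step 2.
from sagemaker_inference import model_server

HANDLER_SERVICE = "my_container.handler_service"

if __name__ == "__main__":
    model_server.start_model_server(handler_service=HANDLER_SERVICE)
```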

Complete Example

Here is a complete example demonstrating usage of the SageMaker Inference Toolkit in your own container for deployment to a multi-model endpoint.
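
The linked example is not reproduced here. For orientation only, the sketch below shows how an image built with this toolkit is commonly deployed with the SageMaker Python SDK to a single-model real-time endpoint (the linked example targets a multi-model endpoint, which uses the SDK's MultiDataModel class instead). The image URI, model artifact location, IAM role, and instance type are placeholder assumptions.

```python
# Illustrative deployment sketch using the SageMaker Python SDK (not part of this toolkit).
# All URIs, bucket names, and the IAM role below are placeholder assumptions.
from sagemaker.model import Model

model = Model(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-inference-image:latest",
    model_data="s3://my-bucket/model/model.tar.gz",
    role="arn:aws:iam::123456789012:role/MySageMakerExecutionRole",
)

# Create an endpoint backed by the container image above.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```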

📜 License

This library is licensed under the Apache 2.0 License. For more details, please take a look at the LICENSE file.

🤝 Contributing

Contributions are welcome! Please read our contributing guidelines if you'd like to open an issue or submit a pull request.
