AWS News Blog
Announcing TorchServe, An Open Source Model Server for PyTorch
PyTorch is one of the most popular open source libraries for deep learning. Developers and researchers particularly enjoy the flexibility it gives them in building and training models. Yet, this is only half the story, and deploying and managing models in production is often the most difficult part of the machine learning process: building bespoke prediction APIs, scaling them, securing them, etc.
One way to simplify the model deployment process is to use a model server, i.e. an off-the-shelf web application specially designed to serve machine learning predictions in production. Model servers make it easy to load one or several models, automatically creating a prediction API backed by a scalable web server. They’re also able to run preprocessing and postprocessing code on prediction requests. Last but not least, model servers also provide production-critical features like logging, monitoring, and security. Popular model servers include TensorFlow Serving and the Multi Model Server.
Today, I’m extremely happy to announce TorchServe, a PyTorch model serving library that makes it easy to deploy trained PyTorch models at scale without having to write custom code.
Introducing TorchServe
TorchServe is a collaboration between AWS and Facebook, and it’s available as part of the PyTorch open source project. If you’re interested in how the project was initiated, you can read the initial RFC on GitHub.
With TorchServe, PyTorch users can now bring their models to production quicker, without having to write custom code: on top of providing a low latency prediction API, TorchServe embeds default handlers for the most common applications such as object detection and text classification. In addition, TorchServe includes multi-model serving, model versioning for A/B testing, monitoring metrics, and RESTful endpoints for application integration. As you would expect, TorchServe supports any machine learning environment, including Amazon SageMaker, container services, and Amazon Elastic Compute Cloud (Amazon EC2).
Several customers are already enjoying the benefits of TorchServe.
Toyota Research Institute Advanced Development, Inc. (TRI-AD) is developing software for automated driving at Toyota Motor Corporation. Says Yusuke Yachide, Lead of ML Tools at TRI-AD: “we continuously optimize and improve our computer vision models, which are critical to TRI-AD’s mission of achieving safe mobility for all with autonomous driving. Our models are trained with PyTorch on AWS, but until now PyTorch lacked a model serving framework. As a result, we spent significant engineering effort in creating and maintaining software for deploying PyTorch models to our fleet of vehicles and cloud servers. With TorchServe, we now have a performant and lightweight model server that is officially supported and maintained by AWS and the PyTorch community”.
Matroid is a maker of computer vision software that detects objects and events in video footage. Says Reza Zadeh, Founder and CEO at Matroid Inc.: “we develop a rapidly growing number of machine learning models using PyTorch on AWS and on-premise environments. The models are deployed using a custom model server that requires converting the models to a different format, which is time-consuming and burdensome. TorchServe allows us to simplify model deployment using a single servable file that also serves as the single source of truth, and is easy to share and manage”.
Now, I’d like to show you how to install TorchServe, and load a pretrained model on Amazon Elastic Compute Cloud (Amazon EC2). You can try other environments by following the documentation.
Installing TorchServe
First, I fire up a CPU-based Amazon Elastic Compute Cloud (Amazon EC2) instance running the Deep Learning AMI (Ubuntu edition). This AMI comes preinstalled with several dependencies that I’ll need, which will speed up setup. Of course you could use any AMI instead.
TorchServe is implemented in Java, and I need the latest OpenJDK to run it.
sudo apt install openjdk-11-jdk
Next, I create and activate a new Conda environment for TorchServe. This will keep my Python packages nice and tidy (virtualenv works too, of course).
conda create -n torchserve
source activate torchserve
Next, I install dependencies for TorchServe.
pip install sentencepiece # not available as a Conda package
conda install psutil pytorch torchvision torchtext -c pytorch
If you’re using a GPU instance, you’ll need an extra package.
conda install cudatoolkit=10.1
Now that dependencies are installed, I can clone the TorchServe repository, and install TorchServe.
git clone https://github.com/pytorch/serve.git
cd serve
pip install .
cd model-archiver
pip install .
Setup is complete; let’s deploy a model!
Deploying a Model
For the sake of this demo, I’ll simply download a pretrained model from the PyTorch model zoo. In real life, you would probably use your own model.
wget https://download.pytorch.org/models/densenet161-8d451a50.pth
Next, I need to package the model into a model archive. A model archive is a ZIP file storing all model artefacts, i.e. the model itself (densenet161-8d451a50.pth), a Python script to load the state dictionary (matching tensors to layers), and any extra file you may need. Here, I include a file named index_to_name.json, which maps class identifiers to class names. This will be used by the built-in image_classifier handler, which is in charge of the prediction logic. Other built-in handlers are available (object_detector, text_classifier, image_segmenter), and you can implement your own.
torch-model-archiver --model-name densenet161 --version 1.0 \
--model-file examples/image_classifier/densenet_161/model.py \
--serialized-file densenet161-8d451a50.pth \
--extra-files examples/image_classifier/index_to_name.json \
--handler image_classifier
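As mentioned above, you can also implement your own handler and point the --handler option at your Python file instead of a built-in name. The sketch below shows the general shape of a module-style custom handler with a handle(data, context) entry point; the payload keys and the echo-style logic are illustrative assumptions on my part, so check the TorchServe custom handler documentation for the exact contract.

# my_handler.py -- a minimal sketch of a module-style custom handler.
# Illustrative only: the exact contract (context attributes, payload keys)
# is defined in the TorchServe custom handler documentation.

_model = None

def _load_model(context):
    # A real handler would read the model directory from the context and
    # load the serialized weights here; this sketch returns a placeholder.
    return object()

def handle(data, context):
    # TorchServe calls this entry point for each batch of requests.
    global _model
    if _model is None:
        _model = _load_model(context)
    if data is None:
        # The server may call the handler with no data when loading the model.
        return None
    # Each request body typically arrives under the "data" or "body" key.
    inputs = [row.get("data") or row.get("body") for row in data]
    # A real handler would preprocess, run inference, and postprocess here;
    # this sketch just reports the size of each payload.
    return [{"received_bytes": len(x) if x is not None else 0} for x in inputs]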
Next, I create a directory to store model archives, and I move the one I just created there.
mkdir model_store
mv densenet161.mar model_store/
Now, I can start TorchServe, pointing it at the model store and at the model I want to load. Of course, I could load several models if needed.
torchserve --start --model-store model_store --models densenet161=densenet161.mar
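At this point, I can check that the model was registered. TorchServe also exposes a management API, which listens on port 8081 by default, and its /models endpoint lists the loaded models. Here is a quick check sketched with the Python requests library, which is my own choice for illustration rather than something TorchServe requires.

import requests  # assumed available, e.g. via 'pip install requests'

# The management API listens on port 8081 by default;
# GET /models returns the registered models as JSON.
response = requests.get("http://127.0.0.1:8081/models")
print(response.json())  # should list an entry for "densenet161"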
Still on the same machine, I grab an image and easily send it to TorchServe for local serving using an HTTP POST request. Note the format of the URL, which includes the name of the model I want to use.
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -X POST http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg
The result appears immediately. Note that class names are visible, thanks to the built-in handler.
[{"tiger_cat": 0.4693356156349182},{"tabby": 0.46338796615600586},{"Egyptian_cat": 0.06456131488084793},{"lynx": 0.0012828155886381865},{"plastic_bag": 0.00023323005007114261}]
I then stop TorchServe with the ‘stop’ command.
torchserve --stop
As you can see, it’s easy to get started with TorchServe using the default configuration. Now let me show you how to set it up for remote serving.
Configuring TorchServe for Remote Serving
Let’s create a configuration file for TorchServe, named config.properties (the default name). This file defines which model to load, and sets up remote serving. Here, I’m binding the server to all public IP addresses, but you can restrict it to a specific address if you want to. As this is running on an EC2 instance, I need to make sure that ports 8080 and 8081 are open in the Security Group.
model_store=model_store
load_models=densenet161.mar
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
Now I can start TorchServe in the same directory, without having to pass any command line arguments.
torchserve --start
Moving back to my local machine, I can now invoke TorchServe remotely, and get the same result.
curl -X POST http://ec2-54-85-61-250.compute-1.amazonaws.com:8080/predictions/densenet161 -T kitten.jpg
You probably noticed that I used HTTP. I’m guessing a lot of you will require HTTPS in production, so let me show you how to set it up.
Configuring TorchServe for HTTPS
TorchServe can use either the Java keystore or a certificate. I’ll go with the latter.
First, I create a certificate and a private key with openssl.
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem
Then, I update the configuration file to define the location of the certificate and key, and I bind TorchServe to its default secure ports (don’t forget to update the Security Group).
model_store=model_store
load_models=densenet161.mar
inference_address=https://0.0.0.0:8443
management_address=https://0.0.0.0:8444
private_key_file=mykey.key
certificate_file=mycert.pem
I restart TorchServe, and I can now invoke it with HTTPS. As I use a self-signed certificate, I need to pass the ‘--insecure’ flag to curl.
curl --insecure -X POST https://ec2-54-85-61-250.compute-1.amazonaws.com:8443/predictions/densenet161 -T kitten.jpg
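If your application code needs to call the HTTPS endpoint while the certificate is self-signed, it has to skip certificate verification too. Here is the equivalent of the curl call above, sketched with the Python requests library; verify=False plays the same role as curl’s --insecure flag and should only be used in a test setup like this one.

import requests  # assumed available, e.g. via 'pip install requests'

# verify=False skips certificate validation (requests will print a warning),
# which is only acceptable with a self-signed certificate in a test setup.
with open("kitten.jpg", "rb") as f:
    response = requests.post(
        "https://ec2-54-85-61-250.compute-1.amazonaws.com:8443/predictions/densenet161",
        data=f.read(),
        verify=False,
    )

print(response.json())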
There’s a lot more to TorchServe configuration, and I encourage you to read its documentation!
Getting Started
TorchServe is available now at https://github.com/pytorch/serve.
Give it a try, and please send us feedback on GitHub.
- Julien