Today AWS released Model Server for Apache MXNet (MMS) v0.4, which adds support for packaging and serving Gluon models at scale. Gluon is an imperative, dynamic interface for MXNet that enables rapid model development while maintaining MXNet performance. In this blog post, we describe the v0.4 release in detail and walk through an example of serving a Gluon model.

What is Model Server for Apache MXNet (MMS)?

Model Server for Apache MXNet scalable architecture

MMS is an open-source model serving framework designed to simplify the task of serving deep learning models at scale. Here are some key advantages of MMS:

Provides a packaging tool to generate a model archive containing the neural network model artifacts needed to serve MXNet, Gluon, and ONNX neural network models.

Gives you the ability to customize every step in the inference execution pipeline using custom code packaged into the model archive, which enables overriding initialization, pre-processing, and post-processing.

Comes with a preconfigured serving stack, including REST API endpoints and an inference engine.

Provides prebuilt and optimized container images for scalable model serving.

Includes real-time operational metrics to monitor the server and its endpoints.
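To make the customization point above more concrete, here is a minimal sketch of an inference pipeline with overridable initialization, pre-processing, and post-processing steps. It is an illustration only and does not use the MMS API: the class `SketchService`, its method names, the `artifacts` dictionary, and the `toy_model` function are all hypothetical names chosen for this example, and the "model" is a stand-in function rather than a trained network.

```python
import json


class SketchService:
    """Hypothetical service with overridable pipeline hooks (not the MMS API)."""

    def __init__(self):
        self.model = None
        self.labels = []

    def initialize(self, artifacts):
        # Load whatever the archive shipped. Here the artifacts are assumed
        # to be a plain dict holding a callable model and a label list.
        self.model = artifacts["model"]
        self.labels = artifacts["labels"]

    def preprocess(self, request_body):
        # Decode the raw JSON request into the input the model expects.
        payload = json.loads(request_body)
        return payload["text"].lower()

    def inference(self, model_input):
        # Run the model on the prepared input.
        return self.model(model_input)

    def postprocess(self, scores):
        # Pick the highest-scoring label and encode the response as JSON.
        best = max(range(len(scores)), key=scores.__getitem__)
        return json.dumps({"label": self.labels[best], "score": scores[best]})

    def handle(self, request_body):
        # Chain the three stages for a single request.
        return self.postprocess(self.inference(self.preprocess(request_body)))


def toy_model(text):
    # Stand-in for a trained network: scores by counting two marker words.
    return [float(text.count("bad")), float(text.count("good"))]


service = SketchService()
service.initialize({"model": toy_model, "labels": ["negative", "positive"]})
response = service.handle(json.dumps({"text": "Good fit, good price"}))
print(response)
```

A subclass could override only `preprocess` or `postprocess` and leave the other stages unchanged, which is the kind of per-step customization described above. The actual base classes and method signatures to override are defined in the MMS documentation.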

What is Gluon?

Gluon is a clear, concise, and simple Python interface to MXNet. It enables engineers to write imperative code to build neural networks without losing the performance benefits of a symbolic implementation. Gluon can automatically generate optimized symbolic code from an imperative implementation. Get started with Gluon with our 60-minute crash course.

Serving a Gluon model in MMS

A step-by-step example of serving Xiang Zhang’s Character-level Convolutional Neural Network (char-CNN) is available in the GitHub repository and in this AWS AI blog post.

Illustration of char-CNN model (source)

Try it out!

Learn more and contribute

To learn more about MMS, start with our Single Shot Multi Object Detection (SSD) tutorial, which walks you through exporting and serving an SSD model. You can find more examples and documentation in the repository’s model zoo and documentation folder.

Output of the SSD object detection model

As we continue to develop MMS, we welcome community participation in the form of questions, requests, and contributions. If you are already using MMS, we welcome your feedback via the repository’s GitHub issues. Head over to awslabs/mxnet-model-server to get started. If you have further questions, post them on the MXNet discussion forum.

Dataset citation

R. He and J. McAuley. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. WWW, 2016.