Need an HTTP API for an AWS Sagemaker endpoint? Well, you could glue together:

  • AWS API Gateway
  • An AWS Lambda function
  • An appropriate IAM role+policy
  • Your Sagemaker Model Endpoint

…and have a basic web service for your ML model, as outlined in this AWS Machine Learning Blog post. But doesn’t that seem like a lot of services just to call invoke_endpoint from the outside world?

What if you could have an HTTP API for your AWS Sagemaker ML model without writing any code or creating a Rube Goldberg machine of AWS services? What if that API could run model inference on many records, not just a single record as on the AWS blog? What if developers could pass feature key/values rather than an ambiguous array of floats for inference? What if you could easily see a log of all predict results and error messages for each model invocation?


In this tutorial, I’ll swallow the red pill and show how to quickly create a proper HTTP API for your Sagemaker ML model with Booklet.

How Booklet works

Booklet creates an HTTP API for your Sagemaker model without any code or extra libraries to install. Here’s an overview of how Booklet works:

  1. Grant Booklet read-only access to a limited number of AWS Sagemaker actions.
  2. Choose the Sagemaker endpoints you’d like to integrate with Booklet in our UI.
  3. Booklet hosts an HTTP API endpoint for your AWS Sagemaker endpoint, and hosts the web app for your first model for free.

Read below for full details.

Sign up for Booklet

Booklet is free to sign up for, no credit card required. Sign up below:

Sign Up for Booklet  ▶

Create an AWS Sagemaker Endpoint


This tutorial assumes you’ve already deployed an ML model to AWS Sagemaker and created an endpoint for the model. See the AWS docs on hosting services for information on this process.

Grant access to AWS Sagemaker

You need to grant us read-only access to a limited number of Sagemaker actions via an IAM role that is associated with our AWS account.

Follow these steps to create a read-only IAM Role for Booklet:

  1. Create a new role in the AWS IAM Console.
  2. Select “Another AWS account” for the Role Type.
  3. Enter “256039543343” in the Account ID field (this is Booklet’s AWS account ID).
  4. Click the “Next: Permissions” button.
  5. Click the “Create Policy” button (opens a new window).
  6. Select the JSON tab. Copy and paste this JSON into the text area.
  7. Click “Review policy”.
  8. Name the policy “BookletAWSIntegrationPolicy”.
  9. Click “Create policy” and close the window.
  10. Back in the “Create role” window, refresh the list of policies and select the policy you just created.
  11. Click “Next: Review”.
  12. Give the role a name such as “BookletAWSIntegrationRole”. Click “Create Role”.
  13. Copy the Role ARN. It looks something like “arn:aws:iam::123456789012:role/BookletIntegrationRole”.
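For context on step 6: the policy JSON you paste is provided in the Booklet UI, so use that exact document. The sketch below is only illustrative of its general shape, and the specific action list here is my assumption (describe/list actions plus endpoint invocation), not Booklet’s published policy:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:DescribeEndpoint",
                "sagemaker:ListEndpoints",
                "sagemaker:InvokeEndpoint"
            ],
            "Resource": "*"
        }
    ]
}
```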

With the AWS Role created and the ARN on your clipboard, we’re almost there. In the Booklet settings, paste in the AWS Role ARN and click the “Save” button:


For more information, check out the docs.

Booklet and AWS are now integrated!

Create the web app for your Sagemaker endpoint

Click the “New Model” button within Booklet, choose the Sagemaker endpoint you’d like to wrap with a Booklet-hosted HTTP API, and click “Create”:

(Screenshot: choosing the Sagemaker endpoint name in the Booklet UI)

Believe it or not, you have an HTTP API for your Sagemaker model! Let’s try it.

Calling your Sagemaker HTTP API

In Booklet, switch to the ‘API’ section of the ML Model web app you just created. This displays a sample HTTP call to invoke your Sagemaker model via cURL and Python. The code samples will look a lot like this:

curl -XPOST -H "Content-Type: application/json" -d '{
    "data": [[5.4,2.1],[2.5,1.0]]
}' <your Booklet model API URL>

Try copying & pasting the command above into your terminal, substituting the API URL shown in your Booklet UI. It runs inference for a public Iris Model.

The command returns something like:

  {
    "results": ["Iris Virginica", "Iris Versicolor"],
    "log": "[UUID]"
  }

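The same call can be sketched in Python (the other language Booklet shows samples for). The URL here is a placeholder for the API URL in your model’s “API” tab, and the helper name is mine, not part of Booklet:

```python
import json
import urllib.request

# Placeholder -- substitute the API URL from your model's "API" tab.
BOOKLET_API_URL = "https://example.invalid/your-booklet-model-api-url"

def build_payload(rows):
    """Encode rows of ordered feature values as the JSON request body."""
    return json.dumps({"data": rows}).encode("utf-8")

payload = build_payload([[5.4, 2.1], [2.5, 1.0]])

# Uncomment to actually invoke the endpoint with the real URL:
# req = urllib.request.Request(
#     BOOKLET_API_URL, data=payload,
#     headers={"Content-Type": "application/json"})
# print(json.load(urllib.request.urlopen(req))["results"])
```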
You’re probably handing this API off to a development team that has no idea of the feature ordering your predict function expects. How can you describe the schema to the developers who need to use your model?

Create a custom HTTP API Schema

Rather than providing an ambiguous array of floats as inputs, you can make the expected inputs easier for developers to understand by providing a feature schema. When viewing your model, click “Settings” and scroll to the “Features schema” configuration setting.

Below is an example schema for an Iris prediction model. It’s just a JSON-formatted Array with information about each feature:

    [
        {
            "name": "Petal Length (cm)",
            "default": 5.4
        },
        {
            "name": "Petal Width (cm)",
            "default": 2.1
        }
    ]

With a features schema, your HTTP ML Model API is now a lot friendlier. You can pass an Array of objects with feature keys and values and not worry about ordering:

curl -XPOST -H "Content-Type: application/json" -d '{
    "data": [{
        "Petal Length (cm)": 5.4,
        "Petal Width (cm)": 2.1
    }]
}' <your Booklet model API URL>
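To see why this helps, here’s a hypothetical sketch of the kind of mapping a features schema enables. Booklet performs this server-side; the function below is purely illustrative:

```python
# The features schema from above, as a Python list of dicts.
schema = [
    {"name": "Petal Length (cm)", "default": 5.4},
    {"name": "Petal Width (cm)", "default": 2.1},
]

def to_ordered_row(features):
    """Map a dict of feature key/values onto the ordered float array the
    model's predict function expects, using schema defaults for gaps."""
    return [features.get(f["name"], f["default"]) for f in schema]

# Key order no longer matters:
row = to_ordered_row({"Petal Width (cm)": 1.0, "Petal Length (cm)": 6.2})
print(row)  # [6.2, 1.0]
```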

Now, that’s the happy path with valid input data. What if we send an incorrectly encoded feature?

Logging & Error Monitoring

Inevitably, bad data will be sent to your model. How does Booklet handle this? Let’s see in this example:

curl -XPOST -H "Content-Type: application/json" -d '{
    "data": [[5.4,"about as long as my finger"]]
}' <your Booklet model API URL>

Which returns something like:

  {
    "error": true,
    "results": [],
    "log": "[UUID]"
  }

You can look up the log referenced by the returned UUID for more information on the error.
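That response shape makes failures easy to catch in client code. A minimal sketch, assuming a parsed JSON response like the example above (the UUID value is made up):

```python
# A parsed error response, shaped like the example above.
response = {"error": True, "results": [], "log": "3b1d-example-uuid"}

if response.get("error"):
    # Surface the log UUID so the failing invocation can be looked up.
    message = f"Inference failed; see Booklet log {response['log']}"
else:
    message = f"Predictions: {response['results']}"

print(message)
```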

API Authentication

You can choose to make your ML Model HTTP API public (anyone can view the model) or secret (share with a hard-to-guess URL):



Tada! You have a proper HTTP API for your Sagemaker ML Model without having to tie together a number of AWS services, and Booklet hosts your first model for free. Sign up for early access to learn more.
