The Problem

You get a FileNotFoundError when trying to load the .pkl file for your model in the init() function of an inference script (score.py) in an Azure ML container.

TL;DR

Jump straight to the solution here.

Introduction

Lately, I have been working with a team of data scientists to develop some innovative machine learning models. My role is to “productionize” these models: building Docker images using Azure Machine Learning and publishing them as services so they can be consumed by our clients. This has been a lot of fun because I am getting the opportunity to work with Azure Machine Learning and to learn some Python along the way.

One of the challenges we have encountered has been using the azureml Python libraries to publish these models to the Azure Container Instances (ACI) service. The challenge is that we have two models, so we registered both models separately using Model.register(). That turned into kind of a mess for reasons that will become clear later in this post. We ended up doing a deep dive into how Azure ML structures the directories in the container, where the .pkl files get dropped and how to locate them at run time.

How Azure ML Organizes Files in the Container

When Azure ML creates the container, it will put the code for your service in the /var/azureml-app folder. Inside that, there will be an azureml-models folder. The container will have an environment variable, AZUREML_MODEL_DIR, that points at the directory the models have been placed in.

Or does it?

This is where things get tricky. The .pkl files get dropped somewhere in a directory structure inside the azureml-models directory, and that structure changes shape depending on what gets passed to the Model.register() and Model.deploy() methods. There may be more than one model registered, each registration may contain multiple files, and Azure ML adds a version number to the registration (which I cannot find any way to manage via the API). To make matters worse, where AZUREML_MODEL_DIR points within that structure depends on all of those variables. Whew!

Ugh, my head is spinning. I just want to load my model. What is the best/easiest way to deploy the model so I can consume it in a straightforward way?

The easiest way is to drop all the models (.pkl files) into a single folder and register them together.

On your development machine, create a subdirectory below the working directory of your Jupyter notebook and put your models in that directory. The actual location of the directory isn’t really important; it can be more or less anywhere. The important thing is the name of the directory. Take note of that because you will need it when loading the models on the container.

Below is a minimal layout for your source code. It has a Jupyter notebook and the model files in a child “models” directory, along with the score.py inference file and the Python environment configuration file (config.yaml) for the target Docker image.
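Something like this, where deploy.ipynb and the .pkl file names are placeholders:

```
.
├── deploy.ipynb
├── score.py
├── config.yaml
└── models
    ├── model_a.pkl
    └── model_b.pkl
```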

Inside the Jupyter notebook, register the model with code that looks like this:
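A sketch of the registration, assuming a workspace loaded from config; the registration name my_models is a placeholder:

```python
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()

# Registering the folder (not an individual file) bundles every .pkl
# in "models" into one model registration.
model = Model.register(
    workspace=ws,
    model_name="my_models",   # placeholder name; pick your own
    model_path="models",      # the local folder holding the .pkl files
)
```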

This will register all the files in the “models” directory. This is probably what you want because they will all end up in the same directory in the container.

Deploy the model with code that looks something like this:
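A sketch of an ACI deployment, assuming config.yaml is a conda environment spec; the environment and service names are placeholders:

```python
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

env = Environment.from_conda_specification(name="my-env", file_path="config.yaml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(
    workspace=ws,
    name="my-service",
    models=[model],           # one registration keeps the container paths simple
    inference_config=inference_config,
    deployment_config=deployment_config,
)
service.wait_for_deployment(show_output=True)
```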

Note that the models parameter accepts an array of model registrations; however, if you pass more than one, the directory structure created in azureml-models becomes really complicated. Again, keep it simple: just pass a single model registration in the array. All of the files in your local models folder will be deployed.

After the container is deployed, you will have an environment variable AZUREML_MODEL_DIR that is set to something like this (the registration name my_models and version 1 are illustrative):
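```
/var/azureml-app/azureml-models/my_models/1
```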

The azureml-models directory tree will look something like this (again with placeholder names):
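```
/var/azureml-app/azureml-models
└── my_models
    └── 1
        └── models
            ├── model_a.pkl
            └── model_b.pkl
```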

As far as I can tell, you do not have any control over how the version number is selected. That is why it is important for the AZUREML_MODEL_DIR to point to a path that includes the version number. If more than one model is deployed, AZUREML_MODEL_DIR will point to a directory above the version number, which makes locating the model challenging.

Loading the Models in the score.py init() Method

With all of the models in the same folder, and AZUREML_MODEL_DIR pointing at a folder close to the actual model location, you can write some very simple Python code to load the model.
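A sketch of init(), assuming the models were saved with joblib and keeping the placeholder file names from above:

```python
import os
import joblib  # swap in pickle if that is how the models were saved

model_a = None
model_b = None

def init():
    global model_a, model_b
    # AZUREML_MODEL_DIR points at azureml-models/<name>/<version>;
    # the registered "models" folder sits directly beneath it.
    model_dir = os.getenv("AZUREML_MODEL_DIR")
    model_a = joblib.load(os.path.join(model_dir, "models", "model_a.pkl"))
    model_b = joblib.load(os.path.join(model_dir, "models", "model_b.pkl"))
```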

Troubleshooting Tips

First, inside the init() method of the inference file, you can print diagnostic messages with the Python print statement. Of course, you can’t directly see the output of those print statements because they are being executed in a container running in Azure. To view what the print statements are outputting, use the following code snippet after Model.deploy().
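Something along these lines, using the service object returned by Model.deploy():

```python
# Dump the container's stdout/stderr, including anything init() printed.
print(service.get_logs())
```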

Second, inspecting the azureml-models directory structure is made much easier with the Linux “tree” command. The tree package isn’t installed on the image by default, but it can be installed from the inference file using the following Python code.
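A sketch, assuming the container runs with enough privileges to install packages (these images typically run as root):

```python
import os

# Install tree, then print the model directory structure; the output
# lands in the container logs retrieved with service.get_logs().
os.system("apt-get update && apt-get install -y tree")
os.system("tree /var/azureml-app/azureml-models")
```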

The Python os.system() method runs an arbitrary command on the host. Here, we use it to run apt-get and install the tree package.

Third, you can configure your container to send telemetry to Azure Application Insights. Add the following to your score.py file.
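Rather than code inside score.py, one route (shown here as an assumption) is the SDK’s enable_app_insights flag on the deployed service; once it is on, print output from the scoring script is forwarded to Application Insights:

```python
# Run from the notebook after deployment; turns on Application Insights
# telemetry for the service so stdout from score.py is captured.
service.update(enable_app_insights=True)
```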

Now, whatever you print using the Python print statement will be sent to the Azure Application Insights environment associated with your Azure Machine Learning Workspace. You can view what gets printed out by running a simple query in the Application Insights workspace. Go to the Application Insights workspace for your Azure Machine Learning resource group, click into the Logs tab and create a new query. It should look something like this.
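A query along these lines works; the STDOUT filter is an assumption about how the container tags its log messages, so loosen it if nothing comes back:

```
traces
| where message contains "STDOUT"
| order by timestamp desc
| project timestamp, message, customDimensions
```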

Summary

In this post, we looked at how to overcome common pitfalls when trying to load .pkl files in the init() method of a score.py inference file running on Azure Container Instances. We also looked at some easy-to-implement troubleshooting tips. Finally, we saw how to wire the container up to Azure Application Insights and a query that lets you view that telemetry.