In my last post, I introduced Amazon SageMaker, Amazon’s fully managed service for building and deploying machine learning models in production. We took a high-level look at SageMaker’s architecture, examining how different AWS services, like EC2, ECR, and S3, are tied together to form the machine learning platform. We also discussed the benefits of using SageMaker and introduced several ways to interact with the tool.
In this post, I’d like to present a tutorial on how to use SageMaker for model training. I’m really excited about this post since it’s my first video tutorial. If you find the tutorial valuable, please let me know in the comments so I can keep making more videos : )
Notes
- The first link I have open in the tutorial is the AWS documentation for creating a notebook instance on SageMaker: "Step 2: Create an Amazon SageMaker Notebook Instance".
- There is an error in the tutorial when testing the Docker image locally. The `serve` script expects incoming data to have a label in the first column (I’m sure this is an error in the tutorial). To fix this, I created a new file and inserted a NULL first column. You can do this by running the command `awk -v label="NULL" -v OFS=, '{print label, $0}' payload.csv > payload-fixed.csv`. Kudos to this StackOverflow question.
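
If you would rather apply the same fix from inside a notebook cell instead of the shell, here is a minimal Python sketch of what the awk command above does. The function name `prepend_label_column` is my own invention for illustration and is not part of the tutorial or of SageMaker. It assumes, like the awk version, that every line of the file is one record with no multi-line quoted fields.

```python
def prepend_label_column(src_path, dst_path, label="NULL"):
    """Copy src_path to dst_path, adding `label` as a new first column.

    Hypothetical helper (not from the tutorial). It works line by line,
    the same way the awk one-liner does, so it does not parse CSV quoting.
    """
    # newline="" keeps line endings untouched on both read and write.
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            # Drop the trailing newline (if any) and always write one back,
            # so a final line without a newline still ends cleanly.
            record = line.rstrip("\n")
            dst.write(f"{label},{record}\n")
```

Calling `prepend_label_column("payload.csv", "payload-fixed.csv")` should produce the same `payload-fixed.csv` as the awk command, which you can then send to the locally running container.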