Services for Running Machine Learning
This section will focus on the different services AWS provides for creating, training, and running your own machine learning models. AWS also provides a number of services that make use of prebuilt models, which we’ll focus on later.
SageMaker—AWS’s Machine Learning Suite
SageMaker is AWS’s fully managed service for creating and training machine learning models. It’s designed to replace all the manual work involved with configuring servers for training and inference. You’ll still need to pay the EC2 costs associated with your models, but the process itself is managed for you.
Running machine learning on SageMaker is highly recommended, and the cost savings alone are worth it. With SageMaker, you pay only for the time your training jobs actually run, not for any downtime when your server sits idle. This is a big deal, especially considering that the instances typically used to train machine learning models can be equipped with eight Tesla V100 Tensor Core GPUs and cost a fortune per hour. SageMaker also makes it easy to use Spot Instances, which usually saves around 60% over on-demand costs.
SageMaker has no separate usage fee, but its instances do cost slightly more per hour than the equivalent EC2 instances (similar to RDS’s pricing model). They get you in the fine print, though it’s still cheaper and easier than a manually created, Spot Instance-based training fleet.
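To make that trade-off concrete, here is a rough sketch of the arithmetic. Every figure in it is a made-up placeholder rather than real AWS pricing: the $24 hourly rate, the 25% SageMaker markup, and the 40 training hours are assumptions chosen for illustration. Only the 60% Spot discount comes from the text above.

```python
def always_on_cost(hourly_rate, hours_in_period):
    """Cost of a self-managed server that bills whether or not it is training."""
    return hourly_rate * hours_in_period


def managed_training_cost(hourly_rate, training_hours, markup=0.25, spot_discount=0.0):
    """Cost when you are billed only for training time, at a marked-up hourly rate.

    markup and spot_discount are fractions (0.25 means 25%); both are assumptions.
    """
    effective_rate = hourly_rate * (1 + markup) * (1 - spot_discount)
    return effective_rate * training_hours


# Hypothetical month: a $24/hour GPU instance that trains for 40 of 720 hours.
ec2_always_on = always_on_cost(24.0, 720)
sagemaker_on_demand = managed_training_cost(24.0, 40)
sagemaker_spot = managed_training_cost(24.0, 40, spot_discount=0.60)

print(f"Always-on EC2:       ${ec2_always_on:,.2f}")
print(f"SageMaker on-demand: ${sagemaker_on_demand:,.2f}")
print(f"SageMaker with Spot: ${sagemaker_spot:,.2f}")
```

Even with a higher hourly rate, paying only for the hours spent training comes out far cheaper in this scenario, because the idle time dominates the always-on bill.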
SageMaker Ground Truth
Ground Truth is a feature of SageMaker that automates the process of creating datasets for models to train with. A common problem in machine learning is labeling data; say you’re building an AI to tell apples apart from oranges. If you have 1,000 photos of apples and oranges stored in S3, you’ll have to go through each one and mark it correctly.
Ground Truth makes this process easy by providing an interface for automated labeling and the option to outsource the labor. For a small dataset, you can probably spend an hour or so labeling it yourself. But if you’ve got a lot of content that needs labeling, you can set up users with IAM roles to let your employees handle it, or you can outsource it to Mechanical Turk workers. The price for the outsourcing option is $0.08 per object for the first 50,000 objects, and $0.04 per object after that. Otherwise, Ground Truth itself is free; you pay only in your own or your employees’ time.
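The tiered outsourcing price is easy to estimate up front. This is a minimal sketch that uses only the two per-object prices quoted above; the function name and its parameters are hypothetical, and it works in whole cents to avoid floating-point rounding.

```python
def labeling_cost_usd(num_objects, tier_limit=50_000, first_tier_cents=8, later_cents=4):
    """Estimate the outsourced labeling cost in dollars for a tiered per-object price."""
    if num_objects < 0:
        raise ValueError("num_objects must be non-negative")
    first_tier = min(num_objects, tier_limit)      # objects billed at the higher rate
    remainder = max(num_objects - tier_limit, 0)   # objects billed at the lower rate
    total_cents = first_tier * first_tier_cents + remainder * later_cents
    return total_cents / 100


print(labeling_cost_usd(1_000))    # the 1,000 apple and orange photos
print(labeling_cost_usd(60_000))   # a dataset that crosses into the cheaper tier
```

For the 1,000-photo example this works out to $80, and a 60,000-object dataset comes to $4,400 (50,000 objects at $0.08 plus 10,000 at $0.04).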
Elastic Inference—Running ML Models in Production
Training a machine learning model is largely a one-time cost, and it takes significantly more processing power than performing inference (making predictions using a trained model). Inference, however, is what you built the model to do, so it runs for as long as the model is in production, and you pay compute costs for it the whole time.
However, inference rarely uses a full GPU’s worth of power, especially not the gigantic Tesla V100 found in AWS’s P3 lineup, let alone eight of them. So, rather than renting expensive GPU-backed instances as your daily driver for inference, you can attach elastic accelerators to your EC2 instances, which give them a small amount of GPU power for a fraction of the cost.
These accelerators can be managed with EC2 Auto Scaling to scale up and down based on load. You’ll pay per hour based on the size of the accelerator you attach. You can view the types of accelerators on the Elastic Inference pricing page.
AI-Powered Services
AWS features a whole host of awesome services powered by AI. This section will focus on the services that use machine learning under the hood but don’t require any machine learning knowledge to use them. You’re free to use these services in your own applications and reap the benefits of machine learning without training your own models.
Speech Recognition and Text-to-Speech
AWS Polly is a service that generates realistic audio from text. Its neural voices are fantastic, and sound much more like Alexa than Microsoft Sam. The words flow together much as a human would say them, rather than like a computer interpreting the syllables.
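As a rough sketch of what calling Polly looks like from Python, the helper below wraps the boto3 synthesize_speech call and saves the result as an MP3. The function name, the output path, and the choice of the "Joanna" voice are assumptions made for illustration; the client is passed in so that the snippet itself needs no credentials.

```python
def synthesize_to_file(polly_client, text, out_path, voice_id="Joanna"):
    """Ask Polly for neural speech and write the MP3 bytes to out_path.

    polly_client is expected to behave like boto3.client("polly").
    Returns the number of audio bytes written.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,   # assumed voice; pick any that supports the neural engine
        Engine="neural",    # request a neural voice rather than a standard one
    )
    audio = response["AudioStream"].read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)
```

With AWS credentials configured, this would be called as something like synthesize_to_file(boto3.client("polly"), "Hello there", "hello.mp3").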
AWS Transcribe is a speech recognition service that transcribes audio into text. It works much like AWS Polly, but in reverse: give it an audio file and it will (to the best of its ability) output the words being spoken. It’s mostly useful for batch jobs like transcribing call recordings or automatically subtitling videos, though a streaming mode is also available.
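Transcription jobs are asynchronous: you start a job that points at an audio file, then check back for the result. The sketch below shows that flow with the boto3 start_transcription_job and get_transcription_job calls. The helper names, the MP3 format, and the en-US language code are assumptions, and the client is passed in rather than created here.

```python
def start_transcription(transcribe_client, job_name, media_uri,
                        media_format="mp3", language_code="en-US"):
    """Kick off a batch transcription job for an audio file stored in S3."""
    return transcribe_client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        MediaFormat=media_format,
        LanguageCode=language_code,
    )


def transcript_uri_if_done(transcribe_client, job_name):
    """Return the transcript file URI once the job has completed, else None."""
    job = transcribe_client.get_transcription_job(
        TranscriptionJobName=job_name
    )["TranscriptionJob"]
    if job["TranscriptionJobStatus"] == "COMPLETED":
        return job["Transcript"]["TranscriptFileUri"]
    return None  # still queued or in progress, or the job failed
```

In practice you would call the second helper on a timer, or react to a job-completion event, rather than checking once.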
AWS Lex is a service that combines the two to build chatbots like Alexa. Lex can transcribe spoken instructions, understand the intent behind them, and automatically run actions from Lambda based on the commands given to it. It’s a very powerful service.
Image Recognition
AWS Rekognition is a toolkit for running image recognition on video and images. It can identify all sorts of objects with varying confidence levels, analyze faces for emotion, track multiple people in a video, and scan for inappropriate content.
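For a sense of how the confidence levels are used, here is a minimal sketch built on the boto3 detect_labels call for an image stored in S3. The helper name, the 80% threshold, and the limit of ten labels are assumptions for illustration; the client is injected so the snippet runs without credentials.

```python
def labels_above(rekognition_client, bucket, key, min_confidence=80.0):
    """Return (label, confidence) pairs Rekognition reports for an S3 image.

    rekognition_client is expected to behave like boto3.client("rekognition").
    """
    response = rekognition_client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,                   # assumed cap on returned labels
        MinConfidence=min_confidence,   # the service drops labels below this score
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

Raising min_confidence trades recall for precision: fewer labels come back, but the ones that do are more likely to be right.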
AWS Textract is a tool for optical character recognition (OCR) that can read documents and output the info contained within them. Traditional OCR has existed for a while, but Textract takes it to the next level by using machine learning to understand the structure of the data automatically, rather than having to write custom code or fine-tune algorithms to support changes in page structure. The data can then be stored and searched with AWS Elasticsearch for later analysis.
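The simplest Textract call returns the text it found as a list of blocks. The sketch below uses the boto3 detect_document_text call and keeps only the line-level blocks; the helper name is hypothetical and the client is passed in. This covers plain text extraction only, not the structure-aware analysis described above.

```python
def extract_lines(textract_client, bucket, key):
    """Return the lines of text Textract detects in a document stored in S3.

    textract_client is expected to behave like boto3.client("textract").
    """
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Blocks also include PAGE and WORD entries; LINE blocks carry whole lines.
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]
```

For tables and form fields, the related analyze_document call accepts a FeatureTypes list such as ["TABLES", "FORMS"] and returns additional block types describing that structure.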
Translation
Language translation is an area where machine learning thrives, as understanding the complex relations between words is far too complicated to perfectly program in the traditional sense.
AWS Translate is a fairly simple translation service. You put in text, select the source and target languages, and it returns the translated text. It charges per character, in true AWS fashion, at a rate of $15.00 per million characters.
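Both halves of that description fit in a few lines. This sketch estimates the bill from the per-character rate quoted above and wraps the boto3 translate_text call. The function names and the English-to-Spanish defaults are assumptions, and the client is passed in rather than created here.

```python
def translation_cost_usd(num_characters, usd_per_million=15.0):
    """Estimate the cost of translating num_characters at a per-million rate."""
    return num_characters * usd_per_million / 1_000_000


def translate(translate_client, text, source="en", target="es"):
    """Translate text between two language codes and return the result.

    translate_client is expected to behave like boto3.client("translate").
    """
    response = translate_client.translate_text(
        Text=text,
        SourceLanguageCode=source,
        TargetLanguageCode=target,
    )
    return response["TranslatedText"]


print(f"${translation_cost_usd(200_000):.2f}")  # cost of a 200,000-character job
```

At this rate a 200,000-character job costs about $3, so the per-character pricing only adds up at very large volumes.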
Product Recommendations
One of the biggest motivators behind machine learning development, like most things in this world, is money and advertising. Finding better ways to advertise to people on a more personal level has always been the main challenge of the marketing industry.
Machine learning introduces algorithms tailored specifically to a user’s interests that can change to meet fluctuating demand. They’re crazy accurate—way better than traditional personalized ads—but are only really useful for things like online stores.
For another use case, YouTube uses an algorithm like this to recommend videos to you. The goal is to get you to watch for as long as possible, so you can think of videos with longer watch time as scoring higher. It uses lots of factors, such as video content, tags, title, the preferences of other people who liked similar content, as well as your own watch history to suggest new content to you.