SageMaker Built-in Algorithms Cheat Sheet
Linear Learner
Learns a linear function / linear threshold
function, and maps a high-dimensional vector x to
an approximation of the numeric label y.
● Regression
● Binary / multiclass classification
Use cases
● Predict a quantitative value based on given
numeric input
○ Estimate this year’s ROI based on the last
5 years’ ROI
● Discrete binary classification problems
○ Based on past customer response, should I
mail this customer or not?
● Discrete multiclass classification problems
○ Based on past customer response, how
should I reach the customer? Email, DM or
a phone call?
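A rough sketch of what a linear learner fits: a weighted sum of the inputs. Here is plain-Python least-squares fitting for one feature (SageMaker's Linear Learner trains with SGD over many features, but the fitted model has the same shape); the toy numbers are made up:

```python
# Fit the line w*x + b that minimizes squared error, via the
# closed-form least-squares solution for a single feature.

def fit_line(xs, ys):
    """Return slope w and intercept b minimizing sum((w*x + b - y)^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b

# Toy "ROI history": the label is exactly 2*x + 1, so the fit recovers it.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
w, b = fit_line(xs, ys)  # w ≈ 2.0, b ≈ 1.0
```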
Factorization Machines
Captures interaction between features within
high dimensional sparse datasets.
● Regression
● Binary classification
Use cases
● High dimensional sparse datasets
○ Use known information about the person
viewing the page, based on click-stream
data, to predict which ad the user will
click on
● Recommendation engine
○ What to recommend based on user’s
history?
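The core of a degree-2 factorization machine is its prediction formula: a linear model plus pairwise interaction terms, where each feature gets a small latent vector so interactions cost O(n·k) parameters instead of O(n²). The sketch below implements just that formula; the weights are made up for illustration (training would learn them):

```python
# FM prediction:  y(x) = w0 + sum_i w_i*x_i + sum_{i<j} dot(v_i, v_j)*x_i*x_j

def fm_predict(x, w0, w, v):
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            pairwise += dot * x[i] * x[j]
    return linear + pairwise

# Two sparse binary features (e.g. user=alice, item=ad_42), k=2 latents.
x = [1.0, 1.0]
w0, w = 0.1, [0.2, 0.3]
v = [[1.0, 0.0], [0.5, 0.5]]  # dot(v_0, v_1) = 0.5
score = fm_predict(x, w0, w, v)  # 0.1 + 0.2 + 0.3 + 0.5 = 1.1
```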
K-Nearest Neighbors
(KNN)
Regression: finds the K closest points to the
sample point and returns the average of their
label values as the predicted value.
Classification: queries the K points closest to
the sample point and returns the most frequent
label as the predicted label.
● Regression
● Classification
Use cases
● Credit ratings
○ Group people into credit-risk categories
based on attributes of known credit usage
they share with others
● Recommendation engine
○ Find recommendations based on similar
likes
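Both KNN modes fit in a few lines for one feature: classification takes a majority vote over the k nearest labels, regression averages them. This is only the core idea; SageMaker's KNN adds sampling and dimensionality reduction to make it scale. The data below is invented:

```python
from collections import Counter

def knn(train, query, k, mode):
    """train: list of (feature, label); mode: 'classify' or 'regress'."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    labels = [label for _, label in nearest]
    if mode == "classify":
        return Counter(labels).most_common(1)[0][0]  # majority vote
    return sum(labels) / len(labels)                 # average of labels

# Toy credit-risk style data: low feature value -> "good", high -> "bad".
train = [(1.0, "good"), (1.2, "good"), (0.8, "good"),
         (5.0, "bad"), (5.5, "bad"), (4.8, "bad")]
risk = knn(train, 1.1, k=3, mode="classify")   # "good"

scores = [(1.0, 700), (1.2, 710), (5.0, 550)]
avg = knn(scores, 1.1, k=2, mode="regress")    # (700 + 710) / 2 = 705.0
```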
XGBoost
Predicts a target variable by combining an
ensemble of estimates from a set of simpler and
weaker models.
● Regression
● Classification
● Ranking
Use cases
● Fraud detection
○ Map input transaction to the probability
that it is fraudulent based on dataset of
past transactions and information if they
were fraudulent
● Ranking
○ Return relevance scores for searched
products in an e-commerce system based on
search results, clicks, and past purchases
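Gradient boosting in miniature: each round fits a weak learner to the residuals of the ensemble so far, then adds it in. XGBoost follows this scheme with regularized trees and second-order gradients; the sketch below uses one-split "stumps" and squared loss (where the negative gradient is just the residual), on made-up data:

```python
def fit_stump(xs, residuals):
    """Best single-threshold split minimizing squared error."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, rounds=3, lr=1.0):
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]
        stump = fit_stump(xs, residuals)   # fit the leftover error
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Step-shaped data: the first stump nails the split, later rounds
# fit whatever error is left.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 1.0, 3.0, 3.0]
model = boost(xs, ys, rounds=2)
```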
K-Means
Finds discrete groupings within data, where
members of a group are as similar as possible to
one another and as different as possible from
members of other groups. Euclidean distance
between these points represents similarity of
observations.
● Clustering
Use cases
● Group similar objects/data together
○ Find high-, medium-, and low-spending
customers from their transaction histories
● Handwriting recognition
● Analog audio classification
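The clustering loop itself (Lloyd's algorithm) is simple: assign each point to the nearest centroid, recompute centroids as cluster means, repeat until stable. A one-dimensional sketch on invented spend amounts:

```python
def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster's mean.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Transaction totals separate into low and high spenders.
spend = [10.0, 12.0, 11.0, 95.0, 99.0, 102.0]
centroids = kmeans(spend, centroids=[0.0, 50.0])
# centroids settle near 11 (low spenders) and ~98.7 (high spenders)
```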
Random Cut Forest
Detects anomalous data points within a data set
and associates an anomaly score with each data
point. Low score values indicate that the data
point is considered "normal." High values indicate
the presence of an anomaly in the data.
● Anomaly detection
Use cases
● Fraud detection
○ Detect suspicious financial transactions by
unusual amount / time / location and flag
them for a closer look
● Quality control
○ Analyze an audio test pattern played by a
high-end speaker system for any unusual
frequencies
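Random Cut Forest itself builds an ensemble of trees over random cuts of the data; as a far simpler stand-in for the *interface* (every point gets a score, high scores flag anomalies), here is a z-score baseline on made-up transaction amounts. This is not RCF, only the scoring idea:

```python
import math

def anomaly_scores(values):
    """Score each value by its distance from the mean, in std deviations."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [abs(v - mean) / std for v in values]

amounts = [20.0, 25.0, 22.0, 21.0, 24.0, 500.0]  # one suspicious transaction
scores = anomaly_scores(amounts)
flagged = max(range(len(scores)), key=lambda i: scores[i])  # index 5
```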
Image Classification
Takes an image as input and outputs one or more
labels assigned to that image. Uses a
convolutional neural network (CNN) that can be
trained from scratch or trained using transfer
learning when a large number of training images
are not available.
● Multi-label classification
Use cases
● Label/tag an image based on the content of
the image
○ Alert about adult content in an image
Object Detection
Takes images as input and identifies all instances
of objects within the image scene. Each object is
categorized into one of the classes in a specified
collection, with a confidence score that it belongs
to the class. Its location and scale in the image are
indicated by a rectangular bounding box.
● Object detection and classification
Use cases
● Detect people and objects in an image
○ Police review a large photo gallery for a
missing person
Semantic Segmentation
Tags every pixel in an image with a class label
from a predefined set of classes.
● Computer vision
Use cases
● Computer vision
○ Self-driving cars to identify objects in their
way
○ Robot sensing
● Medical imaging diagnostics
Latent Dirichlet Allocation
(LDA)
Describes a set of observations as a mixture of
distinct categories to discover a user-specified
number of topics shared by documents within a
text corpus. The topics are not specified up front,
and are not guaranteed to align with how a
human may naturally categorize documents.
● Topic modeling
Use cases
● Article recommendations based on similarity
○ Recommend articles on similar topics
which you read or rated in the past
● Musical influence modelling
○ Explore which musical artists were truly
innovative over time, and which were
influenced by them
Neural Topic Model
(NTM)
Organizes a corpus of documents into topics that
contain word groupings based on their statistical
distribution. The semantics of topics are usually
inferred by examining the top ranking words they
contain. Only the number of topics, not the topics
themselves, are prespecified. The topics are not
guaranteed to align with how a human might
naturally categorize documents.
● Topic modeling
Use cases
● Classify or summarize documents based on
the topics detected
○ Tag a document as belonging to a medical
category based on the terms used in the
document
● Retrieve information or recommend content
based on topic similarities
Sequence To Sequence
(seq2seq)
Converts an input sequence of tokens (for
example, text or audio) into another sequence of
tokens.
● Machine translation
● Text summarization
● Speech-to-text
Use cases
● Machine translation
○ Convert text from Spanish to English
● Text summarization
○ Summarize a long text corpus: an abstract
for a research paper
● Speech-to-text
○ Convert audio files to text: transcribe call
center conversations for further analysis
BlazingText
(word2vec)
Used for natural language processing (NLP) tasks.
Maps words to high-quality distributed vectors
and captures the semantic relationships between
words.
Use cases
● Sentiment analysis
○ Evaluate customer comments based on
positive / negative sentiment
● Named entity recognition
● Machine translation
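Training word2vec is out of scope for a cheat sheet, but this shows what the learned vectors are *for*: cosine similarity between embeddings measures semantic closeness. The tiny 3-d vectors below are made up for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity: 1 for same direction, negative for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

embedding = {
    "good":  [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],   # near "good"
    "awful": [-0.9, 0.1, 0.2],  # far from "good"
}
pos = cosine(embedding["good"], embedding["great"])  # close to 1
neg = cosine(embedding["good"], embedding["awful"])  # negative
```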
BlazingText
(Text classification)
Useful for web searches, information retrieval,
ranking, and document classification.
Assigns a set of predefined categories to open-
ended text. Can be used to organize and
categorize almost any kind of text.
Use cases
● Document classification
○ Review a large collection of documents and
detect if they contain sensitive data like
personal information or trade secrets
○ Categorize books in a library into academic
disciplines
● Web searches
● Information retrieval
● Ranking
Object2Vec
Generalizes the Word2Vec embedding technique
beyond words. Converts high-dimensional objects
into a low-dimensional space while preserving the
semantics of the relationships between pairs in
the original embedding space.
Use cases
● Rating prediction
○ Predict movie popularity based on rating
similarity
● Document classification
○ What genre is the book, based on its
similarity to books of known genres?
○ Identify duplicate support tickets and find
the correct routing based on similarity of
text in the tickets
IP Insights
IP anomaly detection. Learns the usage patterns
of IPv4 addresses, capturing associations
between IP addresses and various entities such
as user IDs or account numbers.
Use cases
● Tiered authentication model
○ Dynamically trigger a 2-factor
authentication routine if a user logs in
from an anomalous IP
● Fraud detection / prevention
○ Permit only certain activities if the IP is
unusual
DeepAR
Forecasts scalar (one-dimensional) time series
using recurrent neural networks (RNN) trained
on multiple sets of historical data. Extrapolates
the time series into the future.
Use cases
● Forecasting new product sales
○ Predict sales on a new product based on
previous sales data from other products
● Predict labor needs for special events
○ Use labor utilization rates at another
distribution center to predict the required
level of staffing for a brand-new center
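DeepAR trains an RNN across many related series, which is far beyond a cheat-sheet snippet. As a stand-in for the *task* (given a history, extrapolate future points), here is the seasonal-naive baseline forecasters compare against, on invented sales figures. This is not DeepAR, only the forecasting interface:

```python
def seasonal_naive(history, season, horizon):
    """Forecast each future step with the value one season earlier."""
    out = []
    for h in range(horizon):
        out.append(history[len(history) - season + (h % season)])
    return out

# Two "seasons" of length 3; the forecast repeats the last season.
weekly_sales = [10, 12, 15, 11, 13, 16]
forecast = seasonal_naive(weekly_sales, season=3, horizon=3)  # [11, 13, 16]
```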
Principal Component
Analysis (PCA)
Attempts to reduce the dimensionality (number
of features) within a dataset while still retaining
as much information as possible. Finds a new set
of features called components, which are
composites of the original features that are
uncorrelated with one another.
Use cases
● Feature engineering: dimensionality
reduction
○ Replace a large set of correlated input
columns with a few uncorrelated
components: e.g. combine a car’s many size
and weight measurements when predicting
its mileage
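A minimal sketch of the idea, assuming 2-d data for simplicity: the first principal component is the dominant eigenvector of the covariance matrix, found here by power iteration (real PCA implementations use SVD). For points spread along the line y = x, the component comes out near [0.707, 0.707]:

```python
import math

def first_component(points, iters=100):
    """Top principal component of 2-d points, via power iteration."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(iters):
        # Multiply by the covariance matrix, then renormalize.
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.sqrt(w[0] ** 2 + w[1] ** 2)
        v = (w[0] / norm, w[1] / norm)
    return v

points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
v = first_component(points)  # roughly (0.71, 0.70)
```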
Reinforcement Learning
An area of machine learning concerned with how
intelligent agents ought to take actions in an
environment in order to maximize the notion of
cumulative reward.
Use cases
● Autonomous vehicles
○ A model can learn through iterations of
trial and error in a simulation. Once the
model is good enough, it can be tested in a
real vehicle on a test track
● Intelligent HVAC control system
○ The model learns the impact of sunlight
and equipment efficiency to optimize
temperature control for the lowest energy
consumption
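The reward-maximization loop in miniature, with made-up dynamics: a 4-state corridor where moving right eventually earns +1. Repeated Bellman backups over a Q-table (Q-value iteration, the deterministic core that sample-based Q-learning approximates) make "right" the best action everywhere:

```python
STATES, ACTIONS = 4, ("left", "right")
GAMMA = 0.9  # discount factor for future reward

def step(s, a):
    """Deterministic environment: reward 1 for entering the last state."""
    s2 = min(s + 1, STATES - 1) if a == "right" else max(s - 1, 0)
    reward = 1.0 if s2 == STATES - 1 and s != STATES - 1 else 0.0
    return s2, reward

Q = {(s, a): 0.0 for s in range(STATES) for a in ACTIONS}
for _ in range(50):  # Bellman backups until values settle
    for s in range(STATES - 1):  # last state is terminal
        for a in ACTIONS:
            s2, r = step(s, a)
            future = 0.0 if s2 == STATES - 1 else max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] = r + GAMMA * future

# Greedy policy: pick the highest-value action in each state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(STATES - 1)}
```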