Deep Learning with PyTorch
“What is deep learning?”
Machine learning is turning things (data)
into numbers and finding patterns in those
numbers.
The computer does this part.
How?
Code & math.
We’re going to be writing the code.
Machine Learning vs. Deep Learning
(diagram: nested fields. Deep Learning is a subset of Machine Learning, which is a subset of Artificial Intelligence.)
Traditional programming
Starts with: Inputs + Rules → Makes: Output

Machine learning algorithm
Starts with: Inputs + Output → Figures out: Rules

Rules (e.g. a recipe):
1. Cut vegetables
2. Season chicken
3. Preheat oven
4. Cook chicken for 30 minutes
5. Add vegetables
“Why use machine learning (or
deep learning)?”
Good reason: Why not?
Better reason: For a complex
problem, can you think of all the rules?
(probably not)
Source: 2020 Machine Learning Roadmap video.
“If you can build a simple rule-based system
that doesn’t require machine learning, do
that.”
— A wise software engineer… (actually rule 1 of Google’s Machine Learning Handbook)
(maybe not very simple…)
What deep learning is (typically) good for 🤖✅
• Problems with long lists of rules—when the traditional approach fails, machine learning/deep learning may help.
• Continually changing environments—deep learning can adapt (‘learn’) to new scenarios.
• Discovering insights within large collections of data—can you imagine trying to hand-craft rules for what 101 different kinds of food look like?
What deep learning is not good for 🤖🚫
• When you need explainability—the patterns learned by a deep
learning model are typically uninterpretable by a human.
• When the traditional approach is a better option — if you can
accomplish what you need with a simple rule-based system.
• When errors are unacceptable — since the outputs of a deep
learning model aren’t always predictable.
• When you don’t have much data — deep learning models
usually require a fairly large amount of data to produce great
results.
(though we’ll see how to get great results without huge amounts of data)
Machine Learning vs. Deep Learning
Machine Learning: typically used on structured data (example algorithm: gradient boosted machine)
Deep Learning: typically used on unstructured data (example algorithm: neural network)
Machine Learning vs. Deep Learning (common algorithms)
Machine learning:
• Random forest
• Gradient boosted models
• Naive Bayes
• Nearest neighbour
• Support vector machine
• …many more
(since the advent of deep learning, these are often referred to as “shallow algorithms”)

Deep learning:
• Neural networks
• Fully connected neural network
• Convolutional neural network
• Recurrent neural network
• Transformer
• …many more
What we’re focused on building
(with PyTorch)
(depending on how you represent your problem,
many algorithms can be used for both structured and unstructured data)
“What are neural networks?”
Neural Networks

Inputs → Numerical encoding → Learns representation (patterns/features/weights) → Representation outputs → Outputs

• Before data gets used with a neural network, it needs to be turned into numbers (a numerical encoding), e.g.
  [[116, 78, 15],
   [117, 43, 96],
   [125, 87, 23],
   …,
• Each of the nodes in the network is called a “hidden unit” or “neuron”.
• Choose the appropriate neural network for your problem.
• The representation outputs are numbers too, e.g.
  [[0.983, 0.004, 0.013],
   [0.110, 0.889, 0.001],
   [0.023, 0.027, 0.985],
   …,
  which get converted into human-understandable outputs (a human can understand these), such as “Ramen, Spaghetti”, “Not a disaster”, or an answer to “Hey Siri, what’s the weather today?”.
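As a small sketch of this pipeline in PyTorch, using the illustrative numbers from the diagram above (they aren’t real pixel values or model outputs):

```python
import torch

# Numerical encoding of an input (illustrative values from the diagram).
image_encoding = torch.tensor([[116, 78, 15],
                               [117, 43, 96],
                               [125, 87, 23]])

# Representation outputs (illustrative prediction probabilities per class).
prediction_probabilities = torch.tensor([[0.983, 0.004, 0.013],
                                         [0.110, 0.889, 0.001],
                                         [0.023, 0.027, 0.985]])

# The highest probability in each row gives the predicted class index.
print(prediction_probabilities.argmax(dim=1))  # tensor([0, 1, 2])
```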
Anatomy of Neural Networks
Overall architecture:
• Input layer (data goes in here): # units/neurons = 2
• Hidden layer(s) (learns patterns in data): # units/neurons = 3
• Output layer (outputs learned representation or prediction probabilities): # units/neurons = 1

Each layer is usually a combination of linear (straight line) and/or non-linear (not-straight line) functions.

Note: “patterns” is an arbitrary term; you’ll often hear “embedding”, “weights”, “feature representation”, “feature vectors” all referring to similar things.
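A minimal sketch of this 2 → 3 → 1 architecture in PyTorch; the choice of ReLU as the non-linear function is an assumption (the slide doesn’t name one):

```python
import torch
from torch import nn

# Input layer (2 units) -> hidden layer (3 units) -> output layer (1 unit)
model = nn.Sequential(
    nn.Linear(in_features=2, out_features=3),  # linear function
    nn.ReLU(),                                 # non-linear function
    nn.Linear(in_features=3, out_features=1),
)

x = torch.rand(5, 2)   # 5 samples, 2 features each
print(model(x).shape)  # torch.Size([5, 1])
```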
Types of Learning
• Supervised Learning
• Unsupervised & Self-supervised Learning
• Transfer Learning
We’ll be writing code to do these,
but the style of code can be adopted across learning paradigms.
“What is deep learning actually
used for?”
Source: 2020 Machine Learning Roadmap video.
(Some) Deep Learning Use Cases
• Recommendation
• Translation (sequence to sequence, seq2seq)
• Speech recognition, e.g. “Hey Siri, who’s the biggest big dog of them all?” (sequence to sequence, seq2seq)
• Computer Vision (classification/regression)
• Natural Language Processing (NLP) (classification/regression), e.g. spam detection:
  To: daniel@mrdbourke.com “Hey Daniel, This deep learning course is incredible! I can’t wait to use what I’ve learned!” → Not spam
  To: daniel@mrdbourke.com “Hay daniel… C0ongratu1ations! U win $1139239230” → Spam
“What is PyTorch?”
What is PyTorch?
• Most popular research deep learning framework*
• Write fast deep learning code in Python (able to run on a GPU/many GPUs)
• Able to access many pre-built deep learning models (Torch Hub/torchvision.models)
• Whole stack: preprocess data, model data, deploy model in your application/cloud
• Originally designed and used in-house by Facebook/Meta (now open-source and used by companies such as Tesla, Microsoft, OpenAI)
*Source: paperswithcode.com/trends February 2022
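As a sketch of the pre-built models bullet, loading a pretrained model from torchvision.models (the weights argument assumes torchvision 0.13 or newer):

```python
import torchvision

# Load a pretrained ResNet-18 from torchvision.models
# ("IMAGENET1K_V1" refers to weights pretrained on ImageNet).
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
```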
Why PyTorch?
Research favourite
Source: paperswithcode.com/trends February 2022
Why PyTorch?
(figure: tweet about PyTorch; source: @fchollet Twitter)
Why PyTorch?
What is a GPU/TPU?
TPU (Tensor Processing Unit)
GPU (Graphics Processing Unit)
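A small sketch of checking for a GPU in PyTorch and writing device-agnostic code (only the CUDA/GPU check is shown; TPUs need extra packages such as torch_xla, which isn’t covered here):

```python
import torch

# Check if an NVIDIA GPU is available to PyTorch.
print(torch.cuda.is_available())

# Device-agnostic setup: use the GPU if present, else the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.rand(3, 3).to(device)  # tensors/models can be moved to the device
print(x.device)
```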
“What is a tensor?”
Recall the neural network diagram from earlier:

Inputs → Numerical encoding → Learns representation (patterns/features/weights) → Representation outputs → Outputs

Before data gets used with an algorithm, it needs to be turned into numbers, and a human can understand the final outputs (choose the appropriate neural network for your problem). The numerical encodings on the input side, e.g.

[[116, 78, 15],
 [117, 43, 96],
 [125, 87, 23],
 …,

and the representation outputs on the other side, e.g.

[[0.983, 0.004, 0.013],
 [0.110, 0.889, 0.001],
 [0.023, 0.027, 0.985],
 …,

(which become outputs such as “Ramen, Spaghetti”, “Not spam”, or an answer to “Hey Siri, what’s the weather today?”): these are tensors!
“What are we going to
cover?”
Source: @elonmusk Twitter
What we’re going to cover
(broadly)
• Now:
• PyTorch basics & fundamentals (dealing with tensors and tensor operations)
• Later:
• Preprocessing data (getting it into tensors)
• Building and using pretrained deep learning models
• Fitting a model to the data (learning patterns)
• Making predictions with a model (using patterns)
• Evaluating model predictions
• Saving and loading models
• Using a trained model to make predictions on custom data
👩🍳 👩🔬
(we’ll be cooking up lots of code!)
What we’re going to cover
How: a PyTorch workflow (one of many)
“How should I approach
this course?”
How to approach this course
Motto #1: if in doubt, run the code!
Motto #2: Experiment, experiment, experiment!
Motto #3: Visualize, visualize, visualize!

1. Code along
2. Explore and experiment
3. Visualize what you don’t understand
4. Ask questions (including the “dumb” ones)
5. Do the exercises 🛠
6. Share your work 🤗
How not to approach this course
Avoid: 🔥🧠🔥 “I can’t learn ______”
This course:
• Course materials: https://www.github.com/mrdbourke/pytorch-deep-learning
• Course Q&A: https://www.github.com/mrdbourke/pytorch-deep-learning/discussions
• Course online book: https://learnpytorch.io

Resources:
• PyTorch website & forums: all things PyTorch
Let’s code!
Tensor dimensions

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])  →  torch.Size([1, 3, 3])

• dim=0: the outermost dimension (size 1)
• dim=1: the rows (size 3)
• dim=2: the columns (size 3)
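A short sketch of indexing into each dimension of that tensor:

```python
import torch

x = torch.tensor([[[1, 2, 3],
                   [3, 6, 9],
                   [2, 4, 5]]])

print(x.shape)     # torch.Size([1, 3, 3])
print(x[0])        # index dim=0 -> the whole 3x3 matrix
print(x[0][0])     # index dim=1 -> tensor([1, 2, 3])
print(x[0][0][0])  # index dim=2 -> tensor(1)
```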
Dot product

torch.matmul(
[[A, B, C],     [[J, K],      [[A*J + B*L + C*N, A*K + B*M + C*O],
 [D, E, F],  ,   [L, M],  ) =  [D*J + E*L + F*N, D*K + E*M + F*O],
 [G, H, I]]      [N, O]]       [G*J + H*L + I*N, G*K + H*M + I*O]]
   3x3            3x2              3x2

Numbers on the inside must match; the new size is the same as the outside numbers.

torch.matmul(
[[5, 0, 3],     [[4, 7],      [[ 44, 38],
 [3, 7, 9],  ,   [6, 8],  ) =  [126, 86],
 [3, 5, 2]]      [8, 1]]       [ 58, 63]]
   3x3            3x2             3x2

e.g. the first element: 5*4 + 0*6 + 3*8 = 20 + 0 + 24 = 44

For a live demo, check out www.matrixmultiplication.xyz
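The same example in PyTorch:

```python
import torch

# Reproducing the worked example above with torch.matmul.
A = torch.tensor([[5, 0, 3],
                  [3, 7, 9],
                  [3, 5, 2]])
B = torch.tensor([[4, 7],
                  [6, 8],
                  [8, 1]])

print(torch.matmul(A, B))
# tensor([[ 44,  38],
#         [126,  86],
#         [ 58,  63]])

# Inner dimensions must match: (3, 3) @ (3, 2) -> (3, 2)
print(A.shape, B.shape, torch.matmul(A, B).shape)
```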
Supervised learning (overview)

1. Initialise with random weights (only at beginning), e.g.
   [[0.092, 0.210, 0.415],
    [0.778, 0.929, 0.030],
    [0.019, 0.182, 0.555],
    …,
2. Show examples: inputs get a numerical encoding, e.g.
   [[116, 78, 15],
    [117, 43, 96],
    [125, 87, 23],
    …,
3. Update representation outputs, e.g.
   [[0.983, 0.004, 0.013],
    [0.110, 0.889, 0.001],
    [0.023, 0.027, 0.985],
    …,  → “Ramen, Spaghetti”
4. Repeat with more examples.

Inputs → Numerical encoding → Learns representation (patterns/features/weights) → Representation outputs → Outputs
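Sketched as toy PyTorch code; every concrete choice here (linear model, mean squared error loss, SGD, random data) is illustrative rather than taken from the slide:

```python
import torch
from torch import nn

X = torch.rand(100, 3)   # inputs (numerical encoding)
y = torch.rand(100, 1)   # known outputs (labels)

model = nn.Linear(3, 1)  # 1. initialised with random weights
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):      # 4. repeat with more examples
    y_pred = model(X)         # 2. show examples
    loss = loss_fn(y_pred, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()          # 3. update representation (weights)
```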
Tensor attributes
Attribute: Shape
Meaning: The length (number of elements) of each of the dimensions of a tensor.
Code: tensor.shape or tensor.size()

Attribute: Rank/dimensions
Meaning: The total number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix has rank 2, a tensor has rank n.
Code: tensor.ndim

Attribute: Specific axis or dimension (e.g. “1st axis” or “0th dimension”)
Meaning: A particular dimension of a tensor.
Code: tensor[0], tensor[:, 1]…
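A short sketch trying each attribute on a [1, 3, 3] tensor:

```python
import torch

x = torch.rand(1, 3, 3)

print(x.shape)   # torch.Size([1, 3, 3]) -- same as x.size()
print(x.ndim)    # 3 (rank / number of dimensions)
print(x[0])      # a particular dimension: index 0 of the 0th dimension
print(x[:, 1])   # all of dim 0, index 1 of dim 1
```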