Introduction to Deep Learning
Agenda
• What is Deep Learning?
• Key Concepts in Deep Learning
• Neural Networks: The Building Blocks
• Popular Deep Learning Architectures
• Applications of Deep Learning
• Tools and Frameworks
• Getting Started with Deep Learning
What is Deep Learning?
• A subset of machine learning that uses neural
networks with multiple layers to model complex
patterns in data.
• Key Features:
- Hierarchical feature learning
- Automatic feature extraction
- High accuracy in tasks like image recognition, NLP, etc.
• Difference from Machine Learning:
- Deep learning excels with large datasets and unstructured
data.
Key Concepts in Deep Learning
• Neurons: Basic units of computation.
• Layers: Input, hidden, and output layers.
• Activation Functions: ReLU, Sigmoid, Tanh.
• Weights and Biases: Parameters learned
during training.
• Loss Function: Measures model performance.
• Optimization: Gradient descent and
backpropagation.
Gradient descent is a mathematical technique that iteratively adjusts the weights and biases to find the model with the lowest loss.
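To make this concrete, here is a minimal sketch of gradient descent on a one-feature linear model; the toy data, learning rate, and iteration count are illustrative choices, not from any particular source:

```python
import numpy as np

# Toy data generated from y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(100)

w, b = 0.0, 0.0   # weight and bias to be learned
lr = 0.1          # learning rate (step size)

for step in range(200):
    y_pred = w * x + b                       # forward pass
    # Gradients of the mean squared error with respect to w and b.
    grad_w = np.mean(2.0 * (y_pred - y) * x)
    grad_b = np.mean(2.0 * (y_pred - y))
    # Step against the gradient to reduce the loss.
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should approach 2.0 and 1.0
```

Backpropagation is the same idea applied layer by layer: it uses the chain rule to compute these gradients efficiently through a whole network.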
Neural Networks: The Building Blocks
• Structure:
- Input Layer: Receives data.
- Hidden Layers: Perform computations.
- Output Layer: Produces predictions.
• Training Process:
- Forward Propagation: Compute output.
- Backpropagation: Adjust weights to minimize loss.
• Example: Simple feedforward neural network (see the sketch below).
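Here is a minimal NumPy sketch of forward propagation through one hidden layer; the layer sizes are arbitrary, and the random weights stand in for parameters that training would learn:

```python
import numpy as np

def relu(z):
    # Rectified linear unit: a common hidden-layer activation.
    return np.maximum(0.0, z)

def softmax(z):
    # Numerically stable softmax over each row.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))                    # 4 samples, 3 features

W1, b1 = rng.standard_normal((3, 5)), np.zeros(5)  # input -> hidden
W2, b2 = rng.standard_normal((5, 2)), np.zeros(2)  # hidden -> output

h = relu(X @ W1 + b1)          # hidden layer: perform computations
probs = softmax(h @ W2 + b2)   # output layer: produce predictions
print(probs)                   # one probability distribution per sample
```

Backpropagation would then run this computation in reverse, propagating the loss gradient from the outputs back to W2, b2, W1, and b1.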
Neuron
• The neural perspective on deep learning is generally motivated by two main ideas:
– One is that the human brain proves intelligent behavior is possible; by reverse engineering it, we should be able to build an intelligent system.
– Another is that building mathematical models of the brain, and of the principles underlying its intelligence, could shed light on fundamental scientific questions.
Deep Learning vs. Machine Learning
•Deep Learning can essentially do everything that
machine learning does, but not the other way
around.
•For instance, machine learning is useful when the dataset is small and well-curated, meaning the data has been carefully preprocessed.
•Data preprocessing requires human intervention. Conversely, when the dataset is large and complex, classical machine learning algorithms fail to extract enough information and underfit.
Deep Learning vs. Machine Learning
• Generally, machine learning is alternatively termed shallow learning because it is very effective for smaller datasets.
• Deep learning, on the other hand, is extremely powerful when the dataset is large.
• It can learn complex patterns from the data and draw accurate conclusions on its own. In fact, deep learning is so powerful that it can even process unstructured data - data that is not neatly arranged, such as a text corpus or social media activity.
• Furthermore, it can also generate new data samples and find anomalies that machine learning algorithms and human eyes can miss.
Why Deep Learning?
Popular Deep Learning Architectures
• Convolutional Neural Networks (CNNs):
- Used for image processing (see the sketch after this list).
- Layers: Convolutional, Pooling, Fully Connected.
• Recurrent Neural Networks (RNNs):
- Used for sequential data (e.g., time series, text).
- Variants: LSTM, GRU.
• Generative Adversarial Networks (GANs):
- Used for generating new data (e.g., images,
music).
- Components: Generator and Discriminator.
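For illustration, a minimal Keras sketch of a small CNN stacking the layer types named above (convolutional, pooling, fully connected); the filter counts and sizes are arbitrary, not a reference architecture:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A small CNN for 28x28 grayscale images and 10 classes.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),  # convolutional
    layers.MaxPooling2D(pool_size=2),                     # pooling
    layers.Conv2D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),                 # fully connected
    layers.Dense(10, activation="softmax"),               # class scores
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```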
Applications of Deep Learning
• Computer Vision: Image classification, object detection.
• Natural Language Processing (NLP): Sentiment analysis, machine translation.
• Speech Recognition: Voice assistants, transcription.
• Healthcare: Disease detection, drug discovery.
• Autonomous Vehicles: Self-driving cars.
Deep Learning: Limitations
• Data availability
• The complexity of the model
• Lacks global generalization
• Incapable of multitasking
• Hardware dependence
Deep Learning: Limitations
•Data availability
– Deep learning models require a lot of data to learn the
representation, structure, distribution, and pattern of
the data.
– If there isn't enough varied data available, then the
model will not learn well and will lack generalization (it
won't perform well on unseen data).
– The model can only generalize well if it is trained on
large amounts of data.
Deep Learning: Limitations
• The complexity of the model
– Designing a deep learning model is often a trial-and-error process.
– A simple model is likely to underfit, i.e., unable to extract information from the training set, while a very complex model is likely to overfit, i.e., unable to generalize well to the test dataset (illustrated in the sketch below).
– Deep learning models will perform well
when their complexity is appropriate to the
complexity of the data.
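A small illustration of this trade-off, fitting polynomials of increasing degree to noisy samples of a sine curve (the degrees and noise level are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # The true underlying function the models try to recover.
    return np.sin(2.0 * np.pi * x)

x_train = np.sort(rng.uniform(0.0, 1.0, 20))
x_test = np.sort(rng.uniform(0.0, 1.0, 20))
y_train = f(x_train) + 0.2 * rng.standard_normal(20)
y_test = f(x_test) + 0.2 * rng.standard_normal(20)

for degree in (1, 4, 15):  # too simple, about right, too complex
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

The degree-1 model underfits (high error on both sets), while the degree-15 model drives training error down but typically does worse on the test set.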
Deep Learning: Limitations
• Lacks global generalization
– Neural networks have a lot of parameters—sometimes
tens of thousands or more. Ideally, all these parameters
should work together to make the model perform well
on new data (not just the training data). This is called
global generalization.
– Due to its complex nature, it’s impossible to perfectly
adjust all parameters to minimize errors on unseen data.
Some parts of the model might overfit to the training data,
while others might not generalize well.
– As a result, deep learning models will always have some
level of error when making predictions, which can
sometimes lead to incorrect results.
Deep Learning: Limitations
• Incapable of Multitasking
– Deep neural networks are
incapable of multitasking.
– These models can only perform targeted
tasks, i.e., process data on which they are
trained. For instance, a model trained on
classifying cats and dogs will not classify
men and women.
– Furthermore, applications that require
reasoning or general intelligence are
completely beyond what the current
generation’s deep learning techniques can
do, even with large sets of data.
Deep Learning: Limitations
• Hardware dependence
– As mentioned before, deep learning models are computationally expensive.
– These models are so complex that an ordinary CPU cannot handle the computational load.
– Instead, multicore high-performance graphics processing units (GPUs) and tensor processing units (TPUs) are required to train these models in a reasonable time.
– Although these processors save time, they are expensive and consume large amounts of energy.
Tools and Frameworks
• TensorFlow: Open-source library by Google.
• PyTorch: Popular for research and
development.
• Keras: High-level API for building neural
networks.
• Other Tools: NumPy, Pandas, Matplotlib for
data manipulation and visualization.
Getting Started with Deep Learning
• Practice with Projects:
- Start with simple datasets (e.g., MNIST); a minimal sketch follows this list.
• Join Communities:
- Kaggle, GitHub, Stack Overflow.
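As a concrete starting point, a minimal end-to-end sketch on MNIST with Keras; the architecture and hyperparameters are unremarkable defaults, not tuned recommendations:

```python
from tensorflow import keras
from tensorflow.keras import layers

# MNIST: 60,000 training and 10,000 test images of handwritten digits.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

model = keras.Sequential([
    layers.Input(shape=(28, 28)),
    layers.Flatten(),                        # 28x28 image -> 784-vector
    layers.Dense(128, activation="relu"),    # one hidden layer
    layers.Dense(10, activation="softmax"),  # one output per digit class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_split=0.1)
print(model.evaluate(x_test, y_test))        # [test loss, test accuracy]
```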
Future of Deep Learning
• Advancements: Explainable AI, transfer
learning, reinforcement learning.
• Integration: Combining deep learning with
other technologies (e.g., IoT, blockchain).
• Ethical Considerations: Bias, fairness, and
privacy.
Lecture: Chapter 1 of Deep Learning
www.deeplearningbook.org
Ian Goodfellow
Representations Matter
Figure 1.1: the same data plotted in Cartesian coordinates (x, y) and in polar coordinates (r, θ); choosing the right representation can make a hard task easy.
Depth: Repeated Composition
Figure 1.2: depth as repeated composition. Each layer builds on the one below it:
• Visible layer (input pixels)
• 1st hidden layer (edges)
• 2nd hidden layer (corners and contours)
• 3rd hidden layer (object parts)
• Output (object identity: car, person, animal)
Computational Graphs
Figure 1.3: models drawn as computational graphs of elementary operations. For example, logistic regression combines inputs x1, x2 and weights w1, w2 through multiply (×) and add (+) nodes, followed by a logistic (sigmoid) node.
Machine Learning and AI
Figure 1.4: a Venn diagram of the field. Deep learning (example: MLPs) is a kind of representation learning (example: shallow autoencoders), which is a kind of machine learning (example: logistic regression), which is one approach to AI (example: knowledge bases).
Learning Multiple Components
Figure 1.5: flowcharts showing which parts of each approach are learned versus hand-designed:
• Rule-based systems: input → hand-designed program → output.
• Classic machine learning: input → hand-designed features → mapping from features → output.
• Representation learning: input → learned features → mapping from features → output.
• Deep learning: input → simple features → additional layers of more abstract features → mapping from features → output.
Tentative lecture topics (following Figure 1.6, the book's chapter map):
1. Introduction
Part I: Applied Math and Machine Learning Basics
2. Linear Algebra
3. Probability and Information Theory
4. Numerical Computation
5. Machine Learning Basics
Part II: Deep Networks: Modern Practices
6. Deep Feedforward Networks
7. Regularization
8. Optimization
9. CNNs
10. RNNs
11. Practical Methodology
12. Applications
Part III: Deep Learning Research
13. Linear Factor Models
14. Autoencoders
15. Representation Learning
16. Structured Probabilistic Models
17. Monte Carlo Methods
18. Partition Function
19. Inference
20. Deep Generative Models
Historical Trends: Growing Datasets
Figure 1.8 (Goodfellow 2016): dataset size (number of examples, log scale) versus year (1900-2015). Benchmark datasets have grown from about 10^2 examples (early letter-discrimination sets, Iris) through MNIST and CIFAR-10 (about 10^4-10^5) to SVHN, ImageNet (ILSVRC 2014), Sports-1M, and large corpora such as WMT and the Canadian Hansard (10^8-10^9).
The MNIST Dataset
Figure 1.9 (Goodfellow 2016): example inputs from the MNIST dataset of handwritten digits.
Connections per Neuron
Figure 1.10 (Goodfellow 2016): connections per neuron (log scale) in artificial networks versus year (1950-2015), with biological reference points: fruit fly, mouse, cat, and human (about 10^4 connections per neuron).
Number of Neurons
Figure 1.11 (Goodfellow 2016): total number of neurons (log scale) in artificial networks versus year (1950 to a projected 2056), with biological reference points from sponge and roundworm through leech, ant, bee, frog, and octopus up to the human brain (about 10^11 neurons).