AIS302: ANN (Artificial Neural Networks)
Lecture 1: Introduction
Spring 2025
Dr. Ensaf Hussein
Associate Professor, Artificial Intelligence,
School of Information Technology and Computer Science,
Nile University.
Course learning outcomes
• Understand the principles behind deep learning
• Have practical experience applying deep learning to novel situations
• By the end of the semester, you should be able to
• Look at a problem
• Identify if DL could be a solution
• Apply the appropriate DL model
Prerequisites
• AIS301: Machine Learning
• AIS302: Artificial Neural Networks
• AIS411: Natural Language Processing
• AIS462: Computational Intelligence
• AIS412: Deep Learning
• AIS421: NLP Applications
Course tentative schedule
• Deep Neural Networks and Hyperparameter tuning
• Regularization and Optimization
• Sequence Modeling: Recurrent and Recursive Nets
• Sequence models & attention mechanism
• Autoencoders / Applications
• Sequence-to-sequence models
• Attention / Multi-head attention
• Transformers
• Vision Transformers (ViT).
Grading Policy
This course is Project-Based Learning.
• Grading items:
❑ Classwork
   ❑ 10% Lab Tasks
   ❑ 10% Programming Assignments
   ❑ 10% Lecture Quizzes
   ❑ 20% Midterm Exam
   ❑ 10% Project (Follow-Up)
❑ Final assessment
   ❑ 10% Final Project
   ❑ 30% Final Exam
• Students who score less than 30% on the final exam will receive an F in the course.
• Students must attend at least 75% of lectures and labs to sit the final exam.
Resources:
Reference: https://www.deeplearningbook.org/
Textbook
• A great book!
• Very recent (and up-to-date!)
• Easy to read and follow
• Very nice figures!
• Neither overly theoretical nor overly focused on coding
• Freely available online from the author:
https://udlbook.github.io/udlbook/
• Many slides are adapted from content by Simon Prince, the book's author.
Topics
Deep neural networks
How to train them
How to measure their performance
How to make that performance better
Networks specialized to images
Image classification
Image segmentation
Networks specialized to text
Text generation
ChatGPT
Selective, based on time:
Generative learning (unsupervised)
Generating random cats!
RL using Deep Learning
Warm-Up
https://play.blooket.com/play?id=3123283
What is AI?
• AI aims to simulate intelligent behavior.
What is Machine Learning?
• ML is a subset of AI that learns from data to make decisions.
What is Deep Learning?
• Deep Learning (DL) is a type of ML based on deep neural networks.
What is a Deep Neural Network?
A Deep Neural Network (DNN) is a machine learning model with multiple layers of artificial neurons,
which enable it to learn complex patterns from data. It consists of an input layer, multiple hidden layers,
and an output layer, and it is trained with backpropagation and gradient descent. DNNs are widely used
in image recognition, speech processing, and natural language understanding, powering applications
such as self-driving cars, chatbots, and medical diagnostics.
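The layered structure described above can be sketched in a few lines of Python. This is only an illustration of the forward pass (input layer, two hidden layers, output layer); the weights are hand-picked, hypothetical values, and training with backpropagation and gradient descent is not shown.

```python
def relu(values):
    """Apply the ReLU activation element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through every layer; ReLU follows all layers except the last."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical hand-picked parameters: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.0, 1.0]),
    ([[1.0, 0.0, -1.0], [0.0, 2.0, 1.0]], [0.0, -1.0]),
    ([[1.0, 1.0]], [0.5]),
]

output = forward([1.0, 2.0], layers)
print(output)
```

Adding more `(weights, biases)` pairs to `layers` makes the network deeper without changing `forward`.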
Figures from http://udlbook.com
AI is all about deep learning
➢ Yes
➢ No
________ is fitting mathematical models to observed data
➢ AI
➢ Machine Learning
➢ Deep Learning
Deep learning == Machine learning?
➢ Yes
➢ No
________ is a type of machine learning
➢ Deep Neural Networks
➢ Supervised Learning
➢ Deep Learning
Supervised learning
• Define a mapping from input to output
• Learn this mapping from paired input/output (labeled) data examples
Real world input → Model input → Model output → Real world output
Simple example …
• Predict the height of a child given their age.
Age of child → Model input → Model output → Height of child
What is a supervised learning model?
Regression
• An equation relating input (age) to output (height)
• Search through family of possible equations to find one that fits training data well
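As a rough sketch of "searching through a family of equations", the code below tries many straight lines of the form height = intercept + slope × age and keeps the one with the smallest squared error. The age/height pairs are made up for illustration, and the coarse grid search is an assumption chosen for readability, not the method used in practice.

```python
# Made-up (age in years, height in cm) pairs, for illustration only.
training_data = [(2, 86.0), (4, 102.0), (6, 115.0), (8, 128.0), (10, 138.0)]


def predict(age, slope, intercept):
    """One member of the family of equations: height = intercept + slope * age."""
    return intercept + slope * age


def squared_error(slope, intercept, data):
    """Sum of squared differences between predictions and observed heights."""
    return sum((predict(age, slope, intercept) - height) ** 2 for age, height in data)


def grid_search(data):
    """Try every (slope, intercept) on a coarse grid and keep the best-fitting pair."""
    best = None
    for slope_step in range(0, 101):          # slopes 0.0, 0.1, ..., 10.0 cm per year
        for intercept in range(50, 101):      # intercepts 50, 51, ..., 100 cm
            slope = slope_step / 10
            error = squared_error(slope, intercept, data)
            if best is None or error < best[0]:
                best = (error, slope, intercept)
    return best


best_error, best_slope, best_intercept = grid_search(training_data)
print(f"height = {best_intercept} + {best_slope} * age (squared error {best_error:.1f})")
```

Each (slope, intercept) pair is one candidate equation; the training data decides which candidate is kept.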
What is a supervised learning model?
Deep neural networks are just a very flexible family of equations
Fitting deep neural networks = “Deep Learning”
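To make "a flexible family of equations" concrete, here is a minimal sketch of a tiny one-hidden-layer ReLU network with one input and one output. The function name `shallow_network` and both parameter settings are hypothetical; the point is only that changing the parameters selects a different equation from the same family.

```python
def relu(z):
    """ReLU activation: pass positive values through, clip negatives to zero."""
    return max(0.0, z)


def shallow_network(x, hidden_params, output_weights, output_bias):
    """y = output_bias + sum_i output_weights[i] * relu(bias_i + weight_i * x)."""
    hidden = [relu(bias + weight * x) for bias, weight in hidden_params]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))


# Two hypothetical parameter settings: same family, two different equations.
params_a = dict(hidden_params=[(0.0, 1.0), (-1.0, 1.0)], output_weights=[1.0, -2.0], output_bias=0.0)
params_b = dict(hidden_params=[(0.0, 1.0), (-1.0, 1.0)], output_weights=[0.5, 1.0], output_bias=1.0)

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
curve_a = [shallow_network(x, **params_a) for x in xs]
curve_b = [shallow_network(x, **params_b) for x in xs]
print(curve_a)
print(curve_b)
```

With `params_a` the curve rises and then falls; with `params_b` it rises and then steepens. Fitting means choosing the parameters whose curve matches the training data best.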
Regression
• Univariate regression problem (one output, real value)
• Fully connected network
Graph regression
• Multivariate regression problem (>1 output, real value)
• Graph neural network
Text classification
• Binary classification problem (two discrete classes)
• Transformer network
Music genre classification
• Multiclass classification problem (discrete classes, >2 possible values)
• Recurrent neural network (RNN)
Image classification
• Multiclass classification problem (discrete classes, >2 possible values)
• Convolutional network
Image segmentation
• Multivariate binary classification problem (many outputs, two discrete classes)
• Convolutional encoder-decoder network
Depth estimation
• Multivariate regression problem (many outputs, continuous)
• Convolutional encoder-decoder network
Pose estimation
• Multivariate regression problem (many outputs, continuous)
• Convolutional encoder-decoder network
Terms
• Regression = continuous numbers as output
• Classification = discrete classes as output
• Two-class and multiclass classification are treated differently
• Univariate = one output
• Multivariate = more than one output
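One common way these problem types differ in practice is how the raw model output is turned into a prediction. The sketch below assumes the usual choices (use the number directly for regression, a sigmoid for two classes, a softmax for more than two); the raw scores are hypothetical values chosen by hand.

```python
import math


def sigmoid(score):
    """Binary classification: squash one score into the probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-score))


def softmax(scores):
    """Multiclass classification: turn one score per class into probabilities summing to 1."""
    shifted = [s - max(scores) for s in scores]   # subtract the max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw model outputs, chosen by hand for illustration.
height_cm = 112.5                         # univariate regression: use the number directly
p_positive = sigmoid(0.0)                 # binary classification: one probability
class_probs = softmax([2.0, 1.0, 0.1])    # multiclass classification: one probability per class
predicted_class = class_probs.index(max(class_probs))
print(height_cm, p_positive, class_probs, predicted_class)
```

A multivariate problem simply produces several such outputs at once, for example one probability per pixel in image segmentation.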
“Given a news article, I want to predict whether it is about politics,
sports, or economics.”
What type of a problem is that?
➢ multivariate classification
➢ univariate regression
➢ binary classification
➢ multivariate regression
➢ multiclass classification
Other types of examples …
Translation
Deep learning model
Image captioning
Deep learning model
Image generation from text
Deep learning model
What do these examples have in common?
• Very complex relationship between input and output
• Sometimes there may be many valid answers
• But outputs (and sometimes inputs) obey rules
Language obeys grammatical rules
Natural images also have “rules”
Learn the “grammar” of the data from LOTS of unlabeled examples
Unsupervised Learning
• Learning from data without labels
• Clustering
• Finding outliers
• Generating new examples
• Filling in missing data
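As one small example of learning without labels, the sketch below groups unlabeled numbers with k-means clustering. It is a simplified one-dimensional version under stated assumptions: the data points and the two starting centers are made up, and the function name `k_means_1d` is hypothetical.

```python
def k_means_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for point in points:
            nearest = min(range(len(centers)), key=lambda i: abs(point - centers[i]))
            clusters[nearest].append(point)
        # Keep a center unchanged if no point was assigned to it.
        centers = [
            sum(cluster) / len(cluster) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


# Made-up unlabeled measurements forming two loose groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = k_means_1d(points, centers=[0.0, 6.0])
print(centers)
print(clusters)
```

No labels are given; the grouping comes only from which points lie close together.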
DeepCluster: Deep Clustering for Unsupervised Learning of Visual Features (Caron et al., 2018)
Unsupervised Learning
• Learning about a dataset without labels
• e.g., clustering
• Generative models can create examples
• e.g., generative adversarial networks
• Probabilistic generative models (PGMs) learn a distribution over the data
• e.g., variational autoencoders
• e.g., normalizing flows
• e.g., diffusion models
Generative models
Conditional synthesis
Panels: original image, removed regions, generated regions
ChatGPT
Reinforcement learning
• Build an agent which lives in a world and can perform certain actions
at each time step.
• Goal: take actions to change the state so that you receive rewards
• You don’t receive any data – you have to explore the environment
yourself to gather data as you go.
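The agent/world interaction above can be sketched as a simple loop. The `Corridor` world below is a hypothetical toy environment invented for this illustration, and the agent acts at random rather than learning; the sketch only shows how data is gathered by acting.

```python
import random


class Corridor:
    """Hypothetical toy world: positions 0..4, reward 1 for reaching position 4."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (new_state, reward, done)."""
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, rng, max_steps=100):
    """Let a random agent act, recording the (state, action, reward) data it gathers."""
    experience = []
    state = env.state
    for _ in range(max_steps):
        action = rng.choice([-1, 1])          # no policy yet: explore at random
        next_state, reward, done = env.step(action)
        experience.append((state, action, reward))
        state = next_state
        if done:
            break
    return experience


experience = run_episode(Corridor(), random.Random(0))
total_reward = sum(reward for _, _, reward in experience)
print(len(experience), total_reward)
```

A learning agent would use the recorded `experience` to prefer actions that led to reward.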
Example: chess
• States are valid states of the chess board
• Actions at a given time are valid possible moves
• Positive rewards for taking pieces, negative rewards for losing
them
Why is this difficult?
• Stochastic
• Make the same move twice, the opponent might not do the same thing
• Rewards also stochastic (opponent does or doesn’t take your piece)
• Temporal credit assignment problem
• Did we get the reward because of this move? Or because we made good
tactical decisions somewhere in the past?
• Exploration-exploitation trade-off
• If we have found a good opening, should we keep using it?
• Or should we try other things, hoping for something better?
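One simple, widely used way to balance exploration and exploitation is an epsilon-greedy rule: usually pick the option currently believed to be best, but occasionally try a random one. The sketch below assumes three chess openings with hypothetical estimated values; it illustrates the rule only and is not a full learning algorithm.

```python
import random


def epsilon_greedy(value_estimates, epsilon, rng):
    """With probability epsilon explore a random action; otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(value_estimates))
    return max(range(len(value_estimates)), key=lambda a: value_estimates[a])


# Hypothetical estimated values of three chess openings (higher is better).
opening_values = [0.2, 0.7, 0.5]
rng = random.Random(0)

always_exploit = [epsilon_greedy(opening_values, 0.0, rng) for _ in range(100)]
mostly_exploit = [epsilon_greedy(opening_values, 0.1, rng) for _ in range(1000)]
print(set(always_exploit), mostly_exploit.count(1) / len(mostly_exploit))
```

With `epsilon = 0.0` the agent never tries anything new; a small positive `epsilon` keeps a little exploration going.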
Landmarks in (Machine) Deep Learning
• 1958 Perceptron (Simple ‘neural’ model)
• 1986 Backpropagation (Practical Deep Neural networks)
• 1989 Convolutional networks (Supervised learning)
• 2012 AlexNet Image classification (Supervised learning)
• 2014 Generative adversarial networks (Unsupervised learning)
• 2014 Deep Q-Learning for Atari games (Reinforcement learning)
• 2016 AlphaGo (Reinforcement learning)
• 2017 Machine translation (Supervised learning)
• 2019 Language models ((Un)supervised learning)
• 2022 DALL·E 2 Image synthesis from text prompts ((Un)supervised learning)
• 2022 ChatGPT ((Un)supervised learning)
• 2023 GPT-4 Multimodal model ((Un)supervised learning)
2018 Turing Award winners: Yoshua Bengio, Geoffrey Hinton, and Yann LeCun
Where are we going?
• Supervised learning (overview with regression example)
• Shallow neural networks (a more flexible model)
• Deep neural networks (an even more flexible model)
• Loss functions (guiding the training)
• How to train neural networks (gradient descent and variants)
• How to measure performance of neural networks (generalization)