Deep Learning Jump Start
Michele Toni, 20 May 2017
Collegio Universitario Bertoni, Milano
Credits to Tommaso Matassini
About me
● M.Sc. in Computer Engineering for Business Administration at Università di Pisa
● 2 years as backend software developer.
● 1+ years as AI team member at Cynny.
More about Cynny: Morphcast site, Cynny Italian site, last public presentation
Today's programme
● Deep Learning Showcase
● What is deep learning and how it works
○ Definitions
○ Neural Networks
○ Deep Learning
● How to start with deep learning
● Live demo: image classification Dogs Vs Cats with NVidia Digits
Deep Learning Showcase
Deep Learning applications
Perception: recognizing what's in an image, what people are saying when they are
talking on their phone, helping robots explore the world and interact with it.
Images
Videos
Text
NLP
Music
Other
Image Classification
[Try this online with CaffeJS]
Object Detection - 1/2
[Source: Awesome Deep Vision]
Object Detection - 2/2
[Source: DetectNet by Nvidia]
Image Segmentation
[Source: SegNet Online Demo]
Image Captioning
[Source: MS COCO Captioning Challenge 2015]
Image Captioning - Facebook easy approach
[Source: The Verge article]
Neural Art - DeepDream 1/3
[Source: Google Inceptionism]
Neural Art - DeepDream 2/3
[Source: L’Altra Toscana: Garfagnana]
Neural Art - DeepDream 3/3
Neural Art - Style Transfer 1/2
Style
Content
Neural Art - Style Transfer 2/2
[Try this: DeepDream, Prisma, Vinci]
Neural Art - Deep Photo Style Transfer
[Source Code and details: DeepPhoto Github]
Text Generation - Hemingway style
[Source: Hemingway style and Super Mario level generation]
Sequence2Sequence - Language translation
[Source: Attention and Memory in Deep Learning and NLP]
Sequence2Sequence - Chat bot
[Source: ChatBots with seq2seq, seq2seq Github]
Music Generation - Google Magenta and others
[Google Magenta Song, Fake Beatles Song, DeepBach by Sony CSL Music]
Generative Adversarial Network - pix2pix
[Try this: pix2pix Demo]
Generative Adversarial Network - CycleGAN
[Source code and details: CycleGAN Github]
Reinforcement learning - Atari Breakout
[Atari Breakout Video, Flappy Bird, OpenAI Gym]
Image analysis (old) approach: Computer Vision
HOG Face detector
[Source: dlib site]
Local Binary Pattern
[Source]
Why we use Deep Learning
[Source: Nervana Systems]
What is Deep Learning and how it works
Definitions
AI Vs Machine Learning Vs Deep Learning
[Source: NVidia Blog]
Artificial Intelligence definition
Artificial intelligence (AI) is an area of computer science that emphasizes the
creation of intelligent machines that work and react like humans.
Some of the activities computers with artificial intelligence are designed for
include:
● Speech recognition
● Learning
● Planning
● Problem solving
[Source: Techopedia]
Machine Learning definition
Machine learning, according to Arthur Samuel in 1959, gives "computers the
ability to learn without being explicitly programmed."
It explores the study and construction of algorithms that can learn from and
make predictions on data – such algorithms overcome following strictly
static program instructions by making data-driven predictions or decisions,
through building a model from sample inputs.
[Source: Wikipedia]
Types of learning
● Supervised learning: learn to predict an output when given an input
vector. We know the correct matching between input and output.
● Reinforcement learning: learn to select an action to maximize payoff.
● Unsupervised learning: discover a good internal representation of the
input. There is no known matching between input and output.
[Source: Geoffrey Hinton Neural Networks Coursera Course]
Types of learning: Supervised Learning
Each training case consists of an input vector x and a target output t.
● Regression: The target is a real number, e.g. the value of a stock, the
temperature.
● Classification: the target is a class label. E.g. from a given image tell if it
represents a cat or a dog.
[Source: Geoffrey Hinton Neural Networks Coursera Course]
Types of learning: Reinforcement Learning
● The output is an action or a sequence of actions, and the only
supervisory signal is an occasional scalar reward. (No one tells the model
which action is correct at each step; this has to be learned.)
● The goal in selecting each action is to maximize the expected sum of
future rewards.
● Reinforcement learning is difficult, because the rewards can be delayed
and it is hard to know when we are wrong or right.
[Source: Geoffrey Hinton Neural Networks Coursera Course]
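A minimal sketch of the "sum of future rewards" idea, in plain Python. The discount factor `gamma` and the example episode are assumptions added for illustration; the slide only speaks of an expected sum of future rewards.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of future rewards, each weighted by gamma per step of delay.

    gamma is an illustrative assumption: gamma=1.0 gives the plain sum,
    smaller values make delayed rewards count less.
    """
    total = 0.0
    # Walk backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode with a delayed reward: nothing for three steps, then 1.
episode = [0.0, 0.0, 0.0, 1.0]
plain_sum = discounted_return(episode, gamma=1.0)
discounted = discounted_return(episode, gamma=0.5)
print(plain_sum, discounted)
```

The delayed reward is what makes the problem hard: the first three actions receive no signal of their own, so credit has to be propagated back from the final step.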
Types of learning: Unsupervised Learning 1/2
● Some people don’t count unsupervised learning among the machine
learning techniques because it isn’t trained with an input-output mapping.
● A typical example is clustering, used e.g. to better visualize the inputs,
to study a problem, or to prepare the data for a later phase that uses
supervised or reinforcement learning.
[Source: Geoffrey Hinton Neural Networks Coursera Course]
Types of learning: Unsupervised Learning 2/2
[Source: T-Sne Visualization]
Neural Networks
Biological neuron
[Source: Stanford CS231n Course]
Artificial neuron
[Source: Stanford CS231n Course]
Common non-linear activation functions
[Source: Machine Learning for artists]
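A minimal sketch of an artificial neuron with a few common non-linear activations, assuming the usual formulation (weighted sum of the inputs plus a bias, passed through an activation). The function names and the example numbers are illustrative, not taken from the slides.

```python
import math


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Passes positive values through and clips negative values to 0."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, then a non-linearity."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Illustrative values: the weighted sum is 1.0*0.5 + 2.0*(-0.25) = 0.
x = [1.0, 2.0]
w = [0.5, -0.25]
print(neuron(x, w, 0.0, sigmoid))
print(neuron(x, w, 1.0, relu))
print(neuron(x, w, 0.0, math.tanh))
```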
Examples of artificial neural networks
[Source: Stanford CS231n Course]
What is a supervised learning model
Model: y = f(x; W)
f is a way to use numerical parameters W (called weights), to map each input x
into a predicted output y.
Learning: the procedure that adjusts the parameters W to reduce the
discrepancy between y (model output) and t (target output) for each training
sample.
Example of error function (MSE):
[Source: Geoffrey Hinton Neural Networks Coursera Course]
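A minimal sketch of the error function named above, assuming the common form that averages the squared differences between model outputs y and targets t (the cited course may scale it differently, for example with a factor of 1/2). The sample values are made up.

```python
def mse(predictions, targets):
    """Mean squared error: average of (y - t)^2 over the training samples."""
    n = len(targets)
    return sum((y - t) ** 2 for y, t in zip(predictions, targets)) / n


# Made-up model outputs y and target outputs t.
y = [2.0, 4.0]
t = [1.0, 2.0]
print(mse(y, t))  # (1^2 + 2^2) / 2
```

Squaring makes every discrepancy count as positive and penalises large errors more than small ones, which is what learning then tries to reduce by adjusting W.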
Training example with linear regression
Problem: we want to predict the price of a house knowing the area.
We have this data (our training dataset):
[Source: Visual and Interactive view of the basics of neural networks]
Linear regression model
[Source: Visual and Interactive view of the basics of neural networks, Wikipedia]
Error (MSE):
Linear regression model training - 1/2
[Source: Visual and Interactive view of the basics of neural networks]
Manual
Descending the gradient of the error (simplified)
[Source: Quora question]
Descending the gradient of the error (reality)
[Source]
Linear regression model training - 2/2
[Source: Visual and Interactive view of the basics of neural networks]
Gradient Descent in action
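A minimal sketch of gradient descent on the house-price example, assuming a model of the form price = w * area + b and a mean squared error. The data points, learning rate and step count are invented for illustration; they are not the dataset shown on the slides.

```python
def train_linear_regression(areas, prices, learning_rate=0.05, steps=5000):
    """Fit price = w * area + b by descending the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(areas)
    for _ in range(steps):
        # Prediction error for every training sample with the current w, b.
        errors = [(w * x + b) - t for x, t in zip(areas, prices)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, areas))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient, i.e. downhill on the error surface.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Invented data that follows price = 2 * area + 1 exactly
# (think of areas in hundreds of square metres).
areas = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]
w, b = train_linear_regression(areas, prices)
print(round(w, 3), round(b, 3))
```

The learning rate is the design choice to watch: too large and the updates overshoot and diverge, too small and training needs many more steps.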
Deep Learning
Deep Learning definition
“A family of learning methods that use deep architectures to learn high-level
feature representations”.
Examples of deep learning:
● Convolutional neural networks (for images)
● LSTM networks (for sequences like text and music)
[Source: Neural Machine Translation by Jointly Learning to Align and Translate]
Convolutional Neural Network layers
[Source: Stanford CS231n Course, in browser demo]
Convolution - 1/2
● Doesn’t matter where the cat is
● Different position, same cat
● We can share the weights!
[Source: Udacity Deep Learning Course by Google]
Convolution - 2/2
[Source: Stanford CS231n Course]
Graphical Demo
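A minimal sketch of weight sharing in a convolution, in plain Python. It slides one small kernel over every position of the image (no padding, stride 1), as CNN frameworks usually do; the tiny image and the edge-detecting kernel are illustrative assumptions.

```python
def conv2d(image, kernel):
    """Slide one kernel over the image; the same weights are used everywhere."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the patch whose top-left corner is (i, j).
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        output.append(row)
    return output


# Hypothetical 1x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1]]

# The same edge at two different horizontal positions.
edge_right = [[0, 0, 1, 1]] * 3
edge_left = [[0, 1, 1, 1]] * 3

print(conv2d(edge_right, kernel))  # response in the middle column
print(conv2d(edge_left, kernel))   # same response, shifted one column left
```

Because the kernel weights are shared across positions, the edge is detected wherever it appears, which is the "different position, same cat" point made on the previous slide.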
Pooling
[Source: Stanford CS231n Course]
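A minimal sketch of pooling, assuming the common max-pooling variant with a 2x2 window and stride 2 (the slide does not say which variant it shows). The feature map values are made up.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep only the largest value in each size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Made-up 4x4 feature map; pooling halves each spatial dimension.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
print(max_pool2d(feature_map))
```

Pooling shrinks the feature map while keeping the strongest responses, so later layers work on fewer values.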
CNN architecture: AlexNet (2012)
[Source: Visualize Neurons from Deep Models]
Convolutional Neural Networks recap
[Source: Siraj Raval Youtube Channel]
How to start with Deep Learning
What you need to train a CNN
DL Framework
Image Dataset
Model structure
What you need to run a prediction
DL Framework
Model with trained weights
Datasets for specific tasks and challenges
Common models
View common model architectures online
Name | Year | ImageNet Top-5 Error | #Parameters
Alexnet | 2012 | 16.4% | 60 M
ZF | 2013 | 11.7% | 16 M
GoogleNet | 2014 | 6.7% | 7 M
VGG | 2014 | 7.3% | 138 M
ResNet | 2015 | 3.57% | 60 M
Inception-V4 | 2016 | 3.08% | 42 M
Squeezenet | 2016 | < 19.7% | ~ 1 M
Frameworks (very short list)
Name | First release | Main contributors | Languages
Caffe / Caffe2 | 2013 / 2017 | Caffe: Berkeley University; Caffe2: Facebook, Nvidia | Python, C++, Matlab
TensorFlow (+ Keras API) | 2015 | Google | Python, C++, Java, Go
Mxnet | 2015 | DMLC | Python, Scala, Matlab, C++, R, Julia, Go, JS
Torch | 2002 (v1), 2015 (v7) | Facebook, Twitter, Google | Lua, C++
Live demo: Dogs Vs Cats
Live demo: Dogs Vs Cats
● Dataset: Kaggle competition Dogs Vs Cats
● CNN Framework: Caffe + NVidia DIGITS
● Hardware: CUDA-capable notebook (NVidia GTX 970M video card)
● Model: Alexnet
Playground
Some additional AI playground links
DeepTraffic (simple self driving car training by MIT, with leaderboard)
AI Experiments with Google (e.g. AutoDraw, AI Duet)
TensorFlow Playground (try and visualize training)
Useful links to learn
Machine Learning is fun (intuitions behind how deep learning works)
Distill.pub (graphical and interactive paper publications)
Deep Learning Book by Ian Goodfellow, Yoshua Bengio and Aaron Courville
Neural Network Zoo (overview of neural network architectures)
Dev Blog Nvidia - Parallel For All
9 Deep Learning Papers you need to know about
Awesome Deep Vision (Github papers repository)
Courses
Machine Learning by Andrew Ng (Coursera)
Neural Networks for Machine Learning (Coursera)
Deep Learning by Google (Udacity)
Deep Learning Nanodegree Foundation (Udacity)
Convolutional Neural Networks for Visual Recognition (Stanford)
Creative applications of deep learning with Tensorflow (Kadenze)
Other
Kaggle (machine learning competitions)
OpenAI Gym (environments and challenges to train reinforcement learning
models)
Google.ai
Thank you!

Editor's Notes

  • #6 Deep learning is emerging as a central tool to solve perception problems in recent years. It's the state of the art having to do with computer vision and speech recognition. But there's more; increasingly, people are finding that deep learning is a much better tool to solve problems like discovering new medicines, understanding natural language, understanding documents, and for example, ranking them for search.
  • #26 HOG Image: https://www.researchgate.net/profile/Shadrokh_Samavi/publication/269074001/figure/fig1/AS:295521232146434@1447469160525/Fig-1-HOG-calculation-a-gradients-in-a-cell-b-histogram-of-gradients.png
  • #42 Example: rent price as the output; as inputs, square metres, distance from the centre, number of bathrooms. Flow from input to output with the correct answer through “magic” weights. Next slide: how to learn the values of the weights. Mention training, what I define at this point and what I have to provide.
  • #48 Udacity gradient descent video: https://www.youtube.com/watch?v=7sxA5Ap8AWM&t=3m
  • #51 Anticipate features of increasing/decreasing level in the network, e.g. for a cat you need ears, eyes, ovals, lines, … ResNet 152.
  • #57 Stop at dropout
  • #61 Imagenet, MS COCO, PASCAL, Places
  • #62 Alexnet, VGG, GoogleNet, ResNet, Inception V4, Squeezenet with year, ImageNet error, number of layers (or #parameters)