MACHINE LEARNING
INTRODUCTION TO MACHINE LEARNING
Last updated on 28/07/2020 09:07
Unless stated otherwise, images, code and text are based on the course book Deep Learning
with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann
©2020 by Manning Publications Co. All rights reserved.
INTRODUCTION TO ML
‣ Overview
‣ Deep supervised, unsupervised and
reinforcement learning
‣ Deep learning revolution
‣ Competitive landscape
‣ PyTorch overview
‣ Hardware and software requirements
INTRODUCTION TO ML
‣ Overview
What is Machine Learning?
Machine learning is a subfield of
artificial intelligence
Then... what is Artificial Intelligence?
Artificial intelligence (AI) is the ability of machines to perform tasks that are typically associated with human
intelligence, such as learning and problem-solving
✓ The term Artificial Intelligence was coined by John McCarthy in 1955
✓ It's the science of making machines act like humans
✓ AI encompasses many different approaches to this challenging science, but all
pursue the same goal
Artificial Intelligence classification
Based on its scope and functionality, we can divide AI into these blocks:
❑ By scope: Narrow, General, Strong
❑ By functionality: Reactive, Limited Memory, Theory of Mind, Self-awareness
Types of Artificial Intelligence
Type definitions
1. Artificial Narrow Intelligence: AI designed to complete very specific actions; unable to learn independently.
2. Artificial General Intelligence: AI designed to learn, think and perform at similar levels to humans.
3. Artificial Superintelligence: AI able to surpass the knowledge and capabilities of humans.
4. Reactive Machines: AI capable of responding to external stimuli in real time; unable to build memory or store
information for the future.
5. Limited Memory: AI that can store knowledge and use it to learn and train for future tasks.
6. Theory of Mind: AI that can sense and respond to human emotions, plus perform the tasks of limited memory
machines.
7. Self-aware: AI that can recognize others' emotions, plus has a sense of self and human-level intelligence; the final
stage of AI.
Machine Learning
But... where is Machine Learning in the
previous classification?
Machine Learning
Machine Learning is a set of tools that
allows computers to find by themselves
the best algorithm to solve a
specific task.
Machine Learning
Machine Learning is all about mathematical tools put together in the right order:
✓ Linear Algebra
✓ Calculus
✓ Statistics
✓ Numerical analysis
✓ Operations research (optimization algorithms)
Machine Learning main blocks in a nutshell
✓ Data representation as tensors (multidimensional
arrays)
✓ Computational graphs as a representation of
mathematical transformations of tensors.
✓ Training, loss function and optimizers
✓ Training data management
✓ Neural net architecture
✓ System architecture: hardware CPU/GPU/TPU and
distributed training
✓ Metrics evaluation
✓ Deployment of trained model
INTRODUCTION TO ML
‣ Overview
‣ Deep supervised, unsupervised and
reinforcement learning
Machine Learning
Machine Learning: algorithms that improve through experience and data
Machine learning: design and implementation of computer algorithms that improve automatically using:
❑ Data; and/or
❑ Experience: interaction with a real or simulated environment
Three broad categories:
❑ Supervised learning
❑ Unsupervised learning
❑ Reinforcement learning
We will see examples of deep learning for each of these categories.
Overview
Supervised learning
Learning a function that maps input to output based on a labeled dataset.
Training examples/observations: the input is a vector of real numbers, and the desired output is also a value (otherwise, we
need to develop a proper representation)
❖ Output is a real value: regression
❖ Output is one among a limited number of classes: classification
After training, the model can infer outputs for new, unseen examples (a minimal sketch follows the example applications below).
❑ Natural language processing: e.g., translation, automatic speech recognition
❑ Medical: image processing, e.g., detecting lung tumoral nodules
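As a complement (not from the book), here is a minimal sketch of supervised learning in PyTorch: a small classifier trained on synthetic, randomly labeled data. The 4 input features, the 3 classes, the network architecture and the hyperparameters are arbitrary choices for illustration only.

import torch
from torch import nn

# Synthetic labeled dataset: 100 examples, 4 real-valued features, 3 classes.
# In a real problem these would come from measurements and human annotation.
X = torch.randn(100, 4)
y = torch.randint(0, 3, (100,))

# A small classifier mapping an input vector to class scores (logits).
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
criterion = nn.CrossEntropyLoss()                 # classification loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(50):
    optimizer.zero_grad()
    loss = criterion(model(X), y)                 # discrepancy between predictions and labels
    loss.backward()                               # gradients via autograd
    optimizer.step()                              # incrementally improve the model

# Inference: predict the class of new, unseen examples.
with torch.no_grad():
    predictions = model(torch.randn(5, 4)).argmax(dim=1)

For regression, the last layer would output a single real value and nn.MSELoss() would replace the classification criterion.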
Deep supervised learning
Deep learning for automated scoring of liver fibrosis stages from microscopy images.
Unsupervised learning
Unsupervised learning: the data contains only inputs (no labels)
The goal is to find structure in the data
Examples of deep unsupervised learning:
- Autoencoder: efficient data encodings (representations). Given a dataset, learn a low dimensional
representation of the data by learning to ignore noise.
- Generative Adversarial Networks (GAN): learn to generate new examples "similar" to (i.e., hard to distinguish
from) the original dataset. Example: new synthetic features.
- Anomaly detection: identify samples that do not fit the pattern of the data.
Example of an autoencoder
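The autoencoder figure is not reproduced here; as a complement (not from the book), below is a minimal PyTorch sketch of an autoencoder. The 784-dimensional input (a flattened 28x28 image) and the 32-dimensional code are illustrative assumptions.

import torch
from torch import nn

class AutoEncoder(nn.Module):
    # The encoder compresses the input to a low-dimensional code; the decoder reconstructs it.
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(),
                                     nn.Linear(128, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
criterion = nn.MSELoss()                           # reconstruction error, no labels needed
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(64, 784)                            # stand-in for a batch of flattened images
for step in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), x)                  # the target is the input itself
    loss.backward()
    optimizer.step()

codes = model.encoder(x)                           # low-dimensional representation of the data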
Reinforcement learning
Reinforcement learning: an agent that takes actions in an
environment to maximize the (expected) cumulative reward.
Usually, the model of the decision process that represents the environment is not known (or it is too
complex to build as a Markov Decision Process). Learning happens through interaction.
Key difference with supervised learning: actions and data are coupled. Exploration vs. exploitation (a toy sketch follows the examples below).
Examples:
- Games (DOTA 2, Go, chess)
- Personalized Recommendations (adapting)
- Robotics
- Traffic light control
[Figure: agent-environment interaction loop; the agent observes states 𝑠𝑡, 𝑠𝑡+1, …, receives returns 𝑟𝑡, 𝑟𝑡+1, … and takes actions 𝑎𝑡, 𝑎𝑡+1, …]
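To make exploration vs. exploitation concrete, here is a toy sketch (not from the slides) of an epsilon-greedy agent on a 3-armed bandit. The reward probabilities are invented for illustration; real reinforcement-learning problems also involve states, which this toy omits.

import random

# Toy environment: 3 actions ("arms") with hidden success probabilities.
true_probs = [0.2, 0.5, 0.8]                       # unknown to the agent
def pull(action):
    return 1.0 if random.random() < true_probs[action] else 0.0

estimates = [0.0, 0.0, 0.0]                        # the agent's estimated value of each action
counts = [0, 0, 0]
epsilon = 0.1                                      # fraction of steps spent exploring
total_reward = 0.0

for t in range(10_000):
    if random.random() < epsilon:                  # explore: try a random action
        action = random.randrange(3)
    else:                                          # exploit: pick the best estimate so far
        action = max(range(3), key=lambda a: estimates[a])
    reward = pull(action)
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]  # running average
    total_reward += reward

print(estimates, total_reward)                     # estimates approach the hidden probabilities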
Deep reinforcement learning
Example: DOTA 2
OpenAI Five wins back-to-back games versus Dota 2
world champions OG at Finals, becoming the first AI to
beat the world champions in an esports game.
https://www.twitch.tv/videos/410533063?t=44m53s
#AI bots just beat humans at the video game Dota 2.
That’s a big deal, because their victory required
teamwork and collaboration – a huge milestone in
advancing artificial intelligence.
https://openai.com/blog/openai-five/
Dota 2 is a multiplayer online battle arena (MOBA) video game developed and published by Valve
INTRODUCTION TO ML
‣ Overview
‣ Deep supervised, unsupervised and
reinforcement learning
‣ Deep learning revolution
The deep learning revolution
Until the last decade, the broader class of systems that
fell under the label ML relied heavily on feature
engineering.
• Features are transformations on input data
that facilitate a downstream algorithm, like a
classifier, to produce correct outcomes on
new data.
DL finds the representations automatically, from
raw data
DL exchanges the need to handcraft features for an
increase in data and computational requirements.
What do we mean by training?
We want to obtain useful representations and make the machine produce desired outputs
During training, we use a criterion, a real-valued function of model outputs and reference data, to provide
a numerical score for the discrepancy between the desired and actual output of our model
• by convention, a lower score is typically better, and we use the term loss
Training consists of driving the criterion toward lower and lower scores by incrementally modifying our
deep learning machine until it achieves low scores, even on data not seen during training.
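As a minimal illustration (not from the book) of driving a criterion toward lower scores: a one-parameter model fitted by repeated small updates. The data-generating coefficient 3.0 and the learning rate are arbitrary choices.

import torch

# Synthetic data generated from y = 3x plus noise.
x = torch.linspace(-1, 1, 100)
y = 3.0 * x + 0.1 * torch.randn(100)

w = torch.zeros(1, requires_grad=True)             # the model: y_hat = w * x
learning_rate = 0.1

for step in range(200):
    loss = ((w * x - y) ** 2).mean()               # criterion: lower score is better
    loss.backward()                                # gradient of the loss w.r.t. w
    with torch.no_grad():
        w -= learning_rate * w.grad                # incrementally modify the model
        w.grad.zero_()

print(w.item())                                    # close to 3.0 once the loss is low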
INTRODUCTION TO ML
‣ Overview
‣ Deep supervised, unsupervised and
reinforcement learning
‣ Deep learning revolution
‣ Competitive landscape
The deep learning competitive landscape
Language:
- Python, by far (no doubt). Python is interpreted, but PyTorch is written in C++ and CUDA, a C++-like language from
NVIDIA, so the heavy lifting happens outside Python and on specialized hardware.
- C++, not the first choice. Only when running things very close to the hardware, like new code directly on GPUs.
For inference, it is possible to export a Python-trained model to a C++ runtime.
- Java, R (small on purpose)
Libraries on Python:
By 2021, the community largely consolidated behind either PyTorch or TensorFlow
• TensorFlow has a robust pipeline to production, an extensive industry-wide community, and massive
mindshare.
• PyTorch was initially adopted by the research and teaching communities thanks to its ease of use. It now has great
momentum in industry.
[Google Trends search interest, TensorFlow (red) vs. PyTorch (blue): last 5 years, last 12 months, and September 2021 to September 2022]
Why PyTorch
Most LLMs are based on PyTorch
❑ Simplicity
❑ Pythonic
❑ History:
- Static (graph) execution vs. Dynamic execution. Flexibility vs. Speed.
- Now both operation modes are possible with TF and PyTorch.
❑ Features of PyTorch and TF have mostly converged.
❑ Most LLMs are coded in PyTorch (Flan-T5, LLaMA, BERT)
INTRODUCTION TO ML
‣ Overview
‣ Deep supervised, unsupervised and
reinforcement learning
‣ Deep learning revolution
‣ Competitive landscape
‣ PyTorch overview
Computational graphs, automatic differentiation, optimization
PyTorch is all about tensor manipulation (a short sketch follows the list below)
• Tensors as Data:
• multidimensional arrays, or tensors and an extensive library of operations on them
• Both tensors and the operations on them can be used on the CPU or the GPU/TPU.
• Tensors as part of a Computational Graph:
• ability of tensors to keep track of the operations performed on them, building a CG
• Automatic differentiation and numerical optimization:
• compute derivatives of an output of a computation with respect to any of its inputs.
• used for numerical optimization; it is provided natively through the autograd engine under the
hood.
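A minimal sketch of these three ideas together (the specific values are arbitrary):

import torch

# Tensors as data: a 2x3 multidimensional array, placed on the GPU if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], device=device)

# Tensors as part of a computational graph: requires_grad records the operations applied.
x = torch.tensor([2.0, 3.0], requires_grad=True, device=device)
y = (x ** 2).sum() + a.sum()                       # the graph is built as this line executes

# Automatic differentiation: derivative of the output with respect to the input.
y.backward()
print(x.grad)                                      # dy/dx = 2x -> tensor([4., 6.])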
INTRODUCTION TO ML
‣ Overview
‣ Deep supervised, unsupervised and
reinforcement learning
‣ Deep learning revolution
‣ Competitive landscape
‣ PyTorch overview
‣ Hardware and software requirements
Hardware and software requirements
Training:
❖ For the simple models we will use in class: any recent laptop or personal computer
❖ For the more advanced models (optional):
• 2 x GPUs with 6-8 GB of memory each (for example, GTX 1660)
• 200 GB of disk space
(A short script to check the available hardware follows at the end of this section.)
Inference:
❖ Any recent laptop or personal computer
Google Colaboratory (https://colab.research.google.com)
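A short sketch for checking what hardware PyTorch can see, on your own machine or on Colab (the names and memory sizes printed depend entirely on your setup):

import torch

print(torch.__version__)
if torch.cuda.is_available():
    # Report the GPU that PyTorch will use for training.
    print("GPU:", torch.cuda.get_device_name(0))
    print("Memory (GB):", torch.cuda.get_device_properties(0).total_memory / 1e9)
else:
    print("No GPU found; using the CPU (sufficient for the simple in-class models).")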