
CSE 445: Machine Learning

Introduction

Instructor: Intisar Tahmid Naheen, North South University


Resources
▪ Slides provided in the course should be enough – but there is a plethora of fantastic resources available, so use them!
▪ Recommended Books:
▪ Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron (will be followed extensively in the course, with code examples from https://github.com/ageron/handson-ml )
▪ Pattern Recognition and Machine Learning by Christopher Bishop (an excellent resource for mathematical foundations)
▪ The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (a good reference)
▪ Additional Material:
▪ Andrew Ng’s course on Machine Learning available on Coursera
▪ CS 189, Berkeley
▪ CS 229, Stanford
Helpful Prerequisites
▪ MAT361 – Probability & Statistics
▪ Probability distributions, random variables, conditional probability, variance (some of the important concepts to recall)
▪ MAT125 – Linear Algebra
▪ Matrix Multiplication, Eigenvalues, Eigenvectors

▪ Basic programming background in Python (an OK understanding of Python syntax is all that's necessary – Géron's textbook has excellent code examples)
▪ None of these are compulsory – the material is just easier to grasp if you have completed them


Course Project
▪ Groups of up to 3 members (3 is a hard maximum)
▪ Video demo submission and in-person/online presentation at the end of the semester
▪ 4-6 page report due at semester end, IEEE format – must include a link to the GitHub repo

▪ Potential Topics (a few examples):
▪ Covid-19
▪ Computer Vision
▪ Natural Language Processing
▪ Reinforcement Learning
▪ Speech & Music Recognition
▪ Biomedical Imaging and Biosignals
What is Machine Learning?
Tom Mitchell (1998): a computer program is
said to learn from experience E with respect
to some class of tasks T and performance
measure P, if its performance at tasks in T, as
measured by P, improves with experience E.

Example:
Task: Playing Checkers
Experience (data): games played by the program (with itself)
Performance measure: winning rate

Image from Tom Mitchell's homepage
Definition of Machine Learning
Arthur Samuel (1959): Machine Learning is the
field of study that gives the computer the ability
to learn without being explicitly programmed.

Photos from Wikipedia


Traditional Programming

• Traditional Programming: writing a set of RULES to find ANSWERS from DATA
The ML Approach
Machine Learning: Use DATA and ANSWERS to learn the underlying set of RULES

Great for:
• Problems that require a lot of fine-tuning or a long list of rules
• Changing environments – ML systems can ADAPT
• Getting insights from large amounts of data
• Complex problems that yield no good solution with the traditional approach
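To make the contrast concrete, here is a minimal sketch of the two approaches (the spam example, its tiny made-up dataset, and the choice of CountVectorizer/MultinomialNB are all illustrative assumptions, not from the slides):

    # Traditional approach: a hand-written RULE produces ANSWERS from DATA
    def is_spam_rule(text):
        return "free money" in text.lower()   # brittle, hand-tuned rule

    # ML approach: learn the rule from DATA and ANSWERS (labels)
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    texts = ["free money now", "meeting at noon", "win free cash", "lunch tomorrow?"]
    answers = [1, 0, 1, 0]                    # 1 = spam, 0 = not spam (made-up data)

    vec = CountVectorizer()
    model = MultinomialNB().fit(vec.fit_transform(texts), answers)
    print(model.predict(vec.transform(["free cash offer"])))   # apply the learned "rule"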
Deep Learning
▪ Subset of ML - loosely mimics
structure/function of human brain
▪ Unlike traditional ML, does not require
manual feature extraction
▪ Keeps getting better with more data
(typically)
▪ Computer Vision (CNN, GAN)
▪ Natural Language Processing (RNN,
LSTM)
▪ Automatic Speech Recognition (RNN)
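As a minimal sketch of the "no manual feature extraction" point, here is a tiny Keras network (assuming TensorFlow is installed; the layer sizes and 28x28 input shape are arbitrary choices, not from the slides) whose hidden layer learns features from raw pixels:

    import tensorflow as tf

    # Raw 28x28 images go in; the hidden layer learns features automatically
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(64, activation="relu"),     # learned features
        tf.keras.layers.Dense(10, activation="softmax"),  # 10-class output
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])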
Summary – AI vs ML vs DL
▪ Nested subsets: DL ⊂ ML ⊂ AI
▪ 1950 – 1990: AI in the form of expert systems (airplane autopilot) and games (checkers, chess)
▪ 1990 – : statistical approaches with ML end the AI winter
▪ 2010 - : Deep Learning revolutionizes CV, NLP among
other applications
▪ Narrow AI
▪ Systems can do a few defined things (such as playing chess or driving a car) as well as, or better than, humans
▪ Can’t do EVERYTHING a human being can do – yet
▪ AI is not “taking over the world” anytime soon
▪ Tell your uncles to relax and stop using Whatsapp
What kind of ML system is it?
▪ Useful to classify ML systems based on the following criteria:
1. Does it require human supervision?
➢ Supervised Learning
➢ Semisupervised Learning
➢ Unsupervised Learning
➢ Reinforcement Learning
2. Can it learn incrementally on the fly?
➢ Online Learning
➢ Batch Learning
3. Does the system build a predictive model?
➢ Model-based Learning
➢ Instance-based Learning
• These categories are not exclusive – they can be combined
• e.g. a spam filter may learn on the fly with a deep neural network – an online, model-based, supervised learning system
Supervised Learning
▪ Training data fed to algorithm
includes the desired
answers/solutions (labels)
▪ Example algorithms:
▪ Linear Regression
▪ Logistic Regression
▪ SVM
▪ Decision Tree
▪ Neural Network
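A minimal supervised-learning sketch with scikit-learn (the iris dataset is an assumption, chosen only because it ships with the library): the training data comes paired with the desired answers y.

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)            # y holds the desired answers (labels)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print(clf.predict(X[:2]))                    # predicted labels for two samples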
Unsupervised Learning
▪ Training data is unlabeled
▪ System learns without direct human
supervision
▪ Widely used in:
▪ Clustering
▪ Anomaly detection
▪ Association mining
▪ Data preprocessing
▪ Example algorithms:
▪ K-means
▪ PCA
▪ SVD
▪ ICA
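A minimal unsupervised sketch using two of the listed algorithms (again on iris, an illustrative assumption, with the labels deliberately discarded):

    from sklearn.datasets import load_iris
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)            # labels ignored: data treated as unlabeled
    clusters = KMeans(n_clusters=3, n_init=10).fit_predict(X)   # clustering
    X2 = PCA(n_components=2).fit_transform(X)    # preprocessing: 4 features -> 2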
Semisupervised Learning
▪ Partially labeled data
▪ Unsupervised learning used
to cluster similar data
together
▪ Human input taken to label
the clusters
▪ e.g. Google Photos will
cluster similar faces, and ask
the user if they are the
same person
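A rough sketch of the cluster-then-label idea (the digits dataset and the one-label-per-cluster heuristic are assumptions for illustration, not a recipe from the slides): cluster the unlabeled data, ask a human to label one representative per cluster, then propagate.

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LogisticRegression

    X, y = load_digits(return_X_y=True)
    kmeans = KMeans(n_clusters=10, n_init=10).fit(X)     # unsupervised clustering
    # The "human input" step: label only the most central example of each cluster
    rep = [int(np.argmin(np.linalg.norm(X - c, axis=1))) for c in kmeans.cluster_centers_]
    cluster_label = y[rep]                               # 10 human-provided labels
    y_spread = cluster_label[kmeans.labels_]             # propagate to whole clusters
    clf = LogisticRegression(max_iter=5000).fit(X, y_spread)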
Reinforcement Learning
▪ The learning system (agent) can:
▪ Observe the environment
▪ Select and perform an action
▪ Get rewards/penalties as a result
▪ Learns what the best policy should be
▪ Policy defines what actions should be
chosen in a certain situation
▪ Very effective in controlled
environments (such as a game of chess)
▪ With the progress in deep learning, increasingly used in more complex tasks (such as driving the Mars rover) – see the sketch below
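A self-contained Q-learning sketch of the observe-act-reward loop (the five-state toy environment is made up purely for illustration): the agent observes a state, performs an action, receives a reward, and gradually improves its policy.

    import random

    # Toy environment: states 0..4 on a line; reaching state 4 pays +1 and ends
    def step(state, action):                 # action 0 = left, 1 = right
        nxt = max(0, state - 1) if action == 0 else min(4, state + 1)
        return nxt, (1.0 if nxt == 4 else 0.0), nxt == 4

    Q = [[0.0, 0.0] for _ in range(5)]       # Q[state][action] value estimates
    alpha, gamma, eps = 0.5, 0.9, 0.1        # learning rate, discount, exploration

    for episode in range(200):
        s, done = 0, False
        while not done:
            a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            # Move Q[s][a] toward the reward plus discounted best future value
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2

    print([max((0, 1), key=lambda x: Q[s][x]) for s in range(5)])  # learned policy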
Batch Learning vs Online Learning
▪ Batch Learning
▪ Not capable of learning after
deployment
▪ Must be retrained from scratch –
computationally expensive!
▪ Online Learning
▪ Can continue to learn after
deployment
▪ Can take advantage of parallel
computing – no down time
▪ Preferred choice in production
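An online-learning sketch with scikit-learn's SGDClassifier (the digits dataset and chunk size are assumptions): partial_fit lets the model keep learning from new chunks after deployment, instead of retraining from scratch.

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier

    X, y = load_digits(return_X_y=True)
    clf = SGDClassifier()
    classes = np.unique(y)                       # must be declared up front
    for start in range(0, len(X), 100):          # data arriving in chunks over time
        clf.partial_fit(X[start:start+100], y[start:start+100], classes=classes)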
Example ML Task: Does money make people happy?

• Life Satisfaction data from OECD
• GDP per capita data from IMF

What relationship
can we infer between
life satisfaction and
GDP per capita from
the graph?
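A sketch of the modeling step (the numbers below are made up, standing in for the OECD and IMF data): fit a line and read off the inferred relationship.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    gdp = np.array([[20000], [35000], [50000], [65000]])   # hypothetical GDP per capita
    satisfaction = np.array([5.5, 6.2, 6.8, 7.1])          # hypothetical life satisfaction

    model = LinearRegression().fit(gdp, satisfaction)
    print(model.coef_, model.intercept_)        # the learned linear "rule"
    print(model.predict([[40000]]))             # prediction for a new country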
Problems with Machine Learning
▪ 3 V’s of Big Data
▪ Volume, Variety, Velocity
▪ Problem #1: Training data!
▪ Insufficient quantity
▪ Nonrepresentative data
▪ Poor-quality data
▪ Problem #2: How “fit” is it?
▪ Overfitting data
▪ Underfitting data
▪ Problem #3: Which features should be used?
▪ Deep Learning automates feature selection
Overfitting
▪ Most common problem in ML – do not overgeneralize!
▪ The polynomial model fits the training data better than the linear model
▪ But how does it do on the test data? (see the sketch below)
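A runnable sketch of this point on synthetic data (an assumption, not the slide's figure): a degree-15 polynomial beats the linear model on the training set but typically does worse on the held-out test set.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, (60, 1))
    y = 0.5 * X.ravel() + rng.normal(0, 0.5, 60)          # noisy linear ground truth
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    linear = LinearRegression().fit(X_tr, y_tr)
    poly = make_pipeline(PolynomialFeatures(15), LinearRegression()).fit(X_tr, y_tr)
    print(poly.score(X_tr, y_tr) > linear.score(X_tr, y_tr))   # poly wins on training
    print(linear.score(X_te, y_te), poly.score(X_te, y_te))    # poly usually loses on test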
How to avoid overfitting
▪ Tip #1: REGULARIZATION – USE IT
▪ Constrain model to keep it simple – reduce risk of overfitting
▪ If you can stand on one leg, you’ll be able to stay balanced with two legs
▪ Hyperparameters – control level of regularization
▪ Tip #2: Get more training data, and reduce noise in it
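A small sketch of Tip #1 (same kind of synthetic data as above; the alpha value is an arbitrary assumption): Ridge adds a penalty on large weights, and the hyperparameter alpha controls how strongly the model is constrained.

    import numpy as np
    from sklearn.linear_model import Ridge, LinearRegression
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, (30, 1))
    y = 0.5 * X.ravel() + rng.normal(0, 0.5, 30)

    plain = make_pipeline(PolynomialFeatures(15), LinearRegression()).fit(X, y)
    # Same flexible model, but constrained: alpha sets the level of regularization
    regularized = make_pipeline(PolynomialFeatures(15), Ridge(alpha=10.0)).fit(X, y)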
Model Evaluation
▪ How good is your model?
▪ Test it on new data – data not seen by the model ever before!
▪ Keep 80% for training, set 20% for testing
▪ NEVER go below 10% test data – an honestly evaluated model is worth more than an inflated "accuracy" number
▪ How to tune regularization (and other hyperparameters)?
▪ Keep a portion of the training data held out for validation
▪ Alternatively, use cross-validation (many validation sets instead of one)
▪ Pick the hyperparameters that work best on validation, then evaluate the final model on the test dataset
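A sketch of this workflow (the iris dataset is again just an illustration): hold out 20% as a test set the model never sees, tune on cross-validated training data, and check the test set once at the end.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

    clf = LogisticRegression(max_iter=1000)
    print(cross_val_score(clf, X_tr, y_tr, cv=5).mean())   # many validation sets
    clf.fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))                           # final check on unseen data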
Ratios
▪ A great model
▪ trained with 60% training data, 20% validation data, and 20% testing data
▪ An okay model
▪ trained with 70% training data, 15% validation data, and 15% testing data
▪ A barely-acceptable model
▪ trained with 80% training data, 10% validation data, and 10% testing data
▪ Models with worse ratios are hacks
▪ Unless there are millions of instances in the dataset
▪ “No Free Lunch” theorem
▪ The only way to know for sure which model works best is to evaluate them
▪ Make reasonable assumptions about your data to select a model
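One way to produce the 60/20/20 split from this slide (two chained scikit-learn calls; the dataset is once more just an illustration):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    # First peel off 40%, then split that half-and-half into validation and test
    X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
    X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)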
