ITA6016 - MACHINE LEARNING
Dr. M. Revathi
Assistant Professor / SCOPE
VIT Chennai
m.revathi@vit.ac.in
Course Outcomes and Syllabus
1. Gain a good understanding of the fundamental issues and challenges of machine learning.
2. Analyze the strengths and weaknesses of many popular machine learning approaches.
3. Appreciate the underlying relationships within and across machine learning algorithms.
4. Characterize the paradigms of supervised, semi-supervised and unsupervised learning.
5. Recognize and implement various ways of selecting suitable model parameters for different machine learning techniques.
6. Understand how to evaluate machine learning algorithms and perform model selection.
7. Design and implement various machine learning algorithms.
• Syllabus
Reference Books
• “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, O'Reilly Media
Assessments & Attendance
• DA1
• DA2
• DA3
• Midterm examination: starts 11.07.2022
• Last instructional day: 02.08.2022
• Final Assessment Test: 03.08.2022 onwards
• 75% attendance is mandatory to appear for the examinations (Mid Term Tests and Final Assessment Tests)
Module 1: Machine Learning Foundations
Three types of Machine Learning, Supervised Learning, Reinforcement
Learning, Unsupervised Learning, Machine Learning Systems, Preprocessing,
Training and Choosing Predictive Models, Model Evaluation and Validation of
unseen data instances
Machine Learning
• Science (and art) of programming computers so they can learn from data
• Field of study that gives computers the ability to learn without being explicitly
programmed
[Figure: machine learning workflow, with stages labelled Trains, Analyses, Predicts and Output]
[Figures: the traditional approach and the Machine Learning approach]
Applications of Machine Learning
Types of Machine Learning
Supervised Learning
• The training set you feed to the algorithm includes the desired solutions, called
labels
• Regression: predict a target numeric value, such as the price of a car, given a set of features (mileage, age, brand, etc.) called predictors
• There are usually multiple input features, and sometimes multiple output values
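A regression task like the car-price example can be sketched with a single feature and ordinary least squares. This is a minimal illustration only: the mileage and price figures are made up, and the helper names `fit_line` and `predict` are hypothetical, not part of any library.

```python
# Hypothetical toy data (made up for illustration):
# mileage in thousands of km, price in thousands of currency units.
mileage = [10.0, 20.0, 30.0, 40.0, 50.0]
price = [18.0, 16.0, 14.0, 12.0, 10.0]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict the target value for a single feature value x."""
    return slope * x + intercept

# "Training": learn the parameters from labeled examples.
slope, intercept = fit_line(mileage, price)

# Prediction for an unseen car with 35 (thousand km) of mileage.
predicted_price = predict(35.0, slope, intercept)
print(slope, intercept, predicted_price)
```

With real data there would be several predictors (age, brand, and so on), and a library implementation of linear regression would normally be used instead of hand-written formulas.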
Some of the most important supervised learning algorithms:
• k-Nearest Neighbors
• Linear Regression
• Logistic Regression
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks
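As one concrete case from this list, k-Nearest Neighbors can be sketched in a few lines: a new instance receives the majority label of its k closest labeled training instances. The data, labels and the function name `knn_predict` below are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled training set: two numeric features per instance.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
                (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

label_a = knn_predict(train_points, train_labels, (1.2, 1.4))
label_b = knn_predict(train_points, train_labels, (6.4, 6.2))
print(label_a, label_b)
```

The labels are what make this supervised: without `train_labels` there would be nothing to vote on.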
Unsupervised Learning
• The training data is unlabeled
• The system tries to learn without a teacher
• For example, say you have a lot of data about your blog’s visitors. You may want to
run a clustering algorithm to try to detect groups of similar visitors
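The visitor-grouping idea can be sketched with a plain k-means loop. This is a simplified version under stated assumptions: the visitor features are invented, the first k points serve as initial centroids (real implementations usually pick smarter starting points), and `k_means` is a hypothetical helper name.

```python
import math

def k_means(points, k, iterations=10):
    """Plain k-means using the first k points as initial centroids."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignments

# Hypothetical visitor features: (visits per week, average minutes per visit).
visitors = [(1.0, 2.0), (9.0, 30.0), (1.5, 3.0),
            (8.0, 28.0), (2.0, 2.5), (10.0, 32.0)]

centroids, assignments = k_means(visitors, k=2)
print(centroids, assignments)
```

No labels are supplied at any point; the groups emerge only from the distances between visitors.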
• Anomaly detection: for example, detecting unusual credit card transactions to prevent fraud
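To make the idea concrete, here is a simple statistical stand-in, not a learned anomaly detector: it flags values whose modified z-score (based on the median and the median absolute deviation) is unusually large. The transaction amounts are invented, and the 3.5 threshold is a common rule of thumb rather than a value from the course material.

```python
import statistics

def find_anomalies(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread at all: nothing stands out
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]

# Hypothetical credit card transaction amounts.
amounts = [12.5, 8.0, 15.0, 9.9, 11.0, 14.2, 10.5, 950.0, 13.1]

anomalies = find_anomalies(amounts)
print(anomalies)
```

A real fraud system would look at many features per transaction, which is where dedicated anomaly detection algorithms become useful.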
Some of the most important unsupervised learning algorithms:
• Clustering
  • K-Means
  • DBSCAN
  • Hierarchical Cluster Analysis (HCA)
• Anomaly detection and novelty detection
  • One-class SVM
  • Isolation Forest
• Visualization and dimensionality reduction
  • Principal Component Analysis (PCA)
  • Kernel PCA
  • Locally Linear Embedding (LLE)
  • t-Distributed Stochastic Neighbor Embedding (t-SNE)
• Association rule learning
  • Apriori
  • Eclat
Reinforcement Learning
• The learning system, called an agent, can:
  • observe the environment,
  • select and perform actions, and
  • get rewards in return (or penalties in the form of negative rewards)
• It must then learn by itself what is the best strategy, called a policy, to get the
most reward over time
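The observe, act and reward loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm chosen here purely for illustration (the slides do not name a specific method). The environment is a made-up five-cell corridor with the goal in the last cell; all names and parameter values are assumptions.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # reward for reaching the goal
    return next_state, -0.01, False     # small penalty (negative reward)

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):           # safety cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)                          # explore
            else:
                action = max(range(2), key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# The policy: the best-valued action in each non-goal state.
policy = [max(range(2), key=lambda a: q_table[s][a])
          for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that moving right is correct; the policy comes only from the rewards and penalties it collects.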
Machine Learning Systems
Main Challenges of Machine Learning
• Bad data
  • Insufficient quantity of training data
  • Nonrepresentative training data
  • Poor-quality data
  • Irrelevant features
    • Feature engineering
      • Feature selection
      • Feature extraction
      • Creating new features by gathering new data
• Bad algorithms
  • Overfitting the training data
  • Underfitting the training data
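Overfitting and underfitting can be seen by comparing the error on the training data with the error on held-out data. The sketch below uses invented noisy data and three deliberately simple models: a constant (too simple, underfits), a memorizer that returns the label of the nearest training point (fits the noise, overfits), and a least-squares line. All names are hypothetical.

```python
def mse(model, xs, ys):
    """Mean squared error of `model` on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_mean_model(xs, ys):
    """Underfitting candidate: always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def make_memorizer(xs, ys):
    """Overfitting candidate: returns the label of the nearest training x."""
    def model(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]
    return model

def make_line_model(xs, ys):
    """Least-squares straight line."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Hypothetical data: roughly y = 2x, with noise in the training labels.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [2.3, 3.6, 6.4, 7.7, 10.3, 11.6]
val_x = [1.4, 2.4, 3.4, 4.4, 5.4]
val_y = [2.8, 4.8, 6.8, 8.8, 10.8]

models = {
    "mean": make_mean_model(train_x, train_y),
    "memorizer": make_memorizer(train_x, train_y),
    "line": make_line_model(train_x, train_y),
}
train_error = {name: mse(m, train_x, train_y) for name, m in models.items()}
val_error = {name: mse(m, val_x, val_y) for name, m in models.items()}
print(train_error)
print(val_error)
```

The memorizer has zero training error but a clearly worse validation error than the line, while the constant model is poor on both sets.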
Prepare the Data
• Acquiring the data
• Data cleaning
  • Fix or remove outliers
  • Fill in missing values (e.g., with zero, mean, median…) or drop their rows (or columns)
• Feature selection
  • Drop the attributes that provide no useful information for the task
• Feature engineering
  • Discretize continuous features
  • Decompose features
  • Aggregate features into promising new features
• Feature scaling
  • Standardize or normalize features
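Two of the steps above, filling in missing values and feature scaling, can be sketched with the Python standard library. The `ages` column is invented and the function names are hypothetical; a library pipeline would normally do this work in practice.

```python
import statistics

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def min_max_normalize(values):
    """Rescale linearly so the smallest value is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature column with two missing entries.
ages = [22.0, None, 35.0, 58.0, None, 41.0]

filled = fill_missing_with_median(ages)
standardized = standardize(filled)
normalized = min_max_normalize(filled)
print(filled)
```

The median, mean, standard deviation, minimum and maximum should be computed on the training set only and then reused for validation and test data, so that no information leaks from the held-out sets.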
Training data / Validation data / Test data
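The training set is used to fit models, the validation set to compare models and tune their settings, and the test set to estimate performance on unseen instances once at the end. A minimal sketch of such a split follows; the 60/20/20 ratios, the seed and the function name are assumptions, not values prescribed by the course.

```python
import random

def train_val_test_split(instances, val_ratio=0.2, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and cut it into three disjoint parts."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # fixed seed: reproducible split
    n_test = int(len(shuffled) * test_ratio)
    n_val = int(len(shuffled) * val_ratio)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical dataset: 100 instances identified by their index.
data = list(range(100))
train_set, val_set, test_set = train_val_test_split(data)
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before cutting matters when the data is stored in some order (by date or by class, for example); otherwise the three sets may not be representative of each other.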
Machine Learning projects
• Frame the problem and look at the big picture
• Get the data
• Explore the data
• Prepare the data to better expose the underlying data patterns to Machine Learning
algorithms
• Explore many different models and shortlist the best ones
• Fine-tune your models and combine them into a great solution
• Present your solution
• Launch, monitor, and maintain your system