Machine Learning for Beginners

ml notes

Uploaded by Naveen K

Machine Learning

Types of AI

Machine Learning
Types of ML

Supervised learning: The computer is provided with labeled training data and
learns to map inputs to outputs.

Unsupervised learning: The computer is provided with unlabeled data and
learns to find underlying structures or patterns in the data.

Reinforcement learning: The computer learns to make decisions in an
environment by receiving rewards or punishments for its actions.

Deep learning: A type of machine learning that involves training artificial
neural networks with multiple layers to learn complex patterns in data.

Supervised ML algorithms:
Linear regression

Logistic regression

Decision tree

Support vector machine (SVM)

Naive Bayes

Linear discriminant analysis

K-nearest neighbors (KNN)

Neural networks

Random forest

Gradient boosting

XGBoost

Stochastic gradient descent

Adaptive boosting (AdaBoost)

Bagging

Classification and regression trees (CART)

Conditional random fields (CRF)

Gaussian processes (GP)

Hidden Markov models (HMM)

Kalman filter

Maximum entropy (MaxEnt)
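As a quick illustration of the supervised setting, here is a minimal sketch using one algorithm from the list above (a decision tree). The iris dataset, split ratio, and random seeds are assumptions for demonstration only.

```python
# A minimal supervised-learning sketch: a decision tree classifier
# trained on scikit-learn's built-in iris dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)          # learn the input -> label mapping
print(clf.score(X_test, y_test))   # accuracy on unseen data
```

Any of the listed algorithms (SVM, KNN, random forest, etc.) could be swapped in with the same `fit`/`predict` interface.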

Unsupervised Learning:
K-means clustering

Hierarchical clustering

DBSCAN

GMM - Gaussian Mixture Models

PCA - Principal Component Analysis

t-SNE

Association Rule Learning (Apriori)

Autoencoders

Self-organizing maps (SOM)
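A minimal sketch of one algorithm from the list, K-means, on synthetic data (the `make_blobs` dataset and k=3 are assumptions for illustration):

```python
# Unsupervised learning sketch: K-means discovers cluster structure
# in unlabeled data (the labels from make_blobs are discarded).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print(km.cluster_centers_)   # one centroid per discovered cluster
```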

Reinforcement Learning:
Q-learning

Deep Q-Network (DQN)

Policy gradient methods

Actor-critic methods (e.g., A2C)

Proximal Policy optimization (PPO)
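A toy sketch of tabular Q-learning, the first method in the list. The environment (a 5-state corridor where the agent earns a reward of 1 for reaching the last state), the hyperparameters, and the random-start trick are all illustrative assumptions:

```python
# Tabular Q-learning on a 1-D corridor: states 0..4, reward at state 4.
# Actions: 0 = left, 1 = right. Episodes start at a random non-terminal
# state (exploring starts) to speed up learning.
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.3
rng = np.random.default_rng(0)

for episode in range(500):
    s = int(rng.integers(n_states - 1))   # random start state
    for step in range(100):               # cap episode length
        # epsilon-greedy action selection
        if rng.random() < epsilon:
            a = int(rng.integers(n_actions))
        else:
            a = int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == n_states - 1:
            break

print(Q.argmax(axis=1))   # learned greedy policy: action per state
```

After training, the greedy policy should choose "right" in every non-terminal state, since the reward lies to the right.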

Transfer Learning:

Pre-trained models, fine-tuning, domain adaptation, multi-task learning,
model ensembles, one-shot learning

Deep Learning:

CNN, RNN, GANs, Auto-encoders, transformers, DBNs

Ensemble Learning:

Bagging, Boosting, Stacking, Voting
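A hedged sketch of the voting approach: combine three different classifiers by majority vote. The model mix and the iris dataset are assumptions for illustration.

```python
# Ensemble learning sketch: hard voting over three heterogeneous models.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
voter = VotingClassifier(estimators=[
    ('lr', LogisticRegression(max_iter=1000)),
    ('knn', KNeighborsClassifier()),
    ('dt', DecisionTreeClassifier(random_state=0)),
], voting='hard')
voter.fit(X, y)
print(voter.score(X, y))   # training accuracy of the combined vote
```

Bagging (`BaggingClassifier`), boosting (`AdaBoostClassifier`, `GradientBoostingClassifier`), and stacking (`StackingClassifier`) follow the same scikit-learn interface.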

Terminologies:
Overfitting: The model memorizes the exact patterns (and noise) of the training data, so it performs poorly on test data.

Underfitting: The model fails to capture the underlying pattern, so it predicts poorly even on the training data, let alone the test data.
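A quick numeric sketch of overfitting: a high-degree polynomial fits noisy training points better than a straight line, but it is chasing noise. The synthetic linear data and the choice of degrees are assumptions for illustration.

```python
import numpy as np

# Synthetic data (assumed): the true relationship is linear, y = 2x + noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 30)
y = 2 * x + rng.normal(scale=0.3, size=30)
x_test = rng.uniform(-1, 1, 30)
y_test = 2 * x_test + rng.normal(scale=0.3, size=30)

errors = {}
for degree in (1, 12):
    coeffs = np.polyfit(x, y, degree)                        # least-squares fit
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    errors[degree] = (train_mse, test_mse)

# Degree 12 fits the training data at least as well as degree 1,
# but typically generalizes worse on the test data.
print(errors)
```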

Batch/offline ML: The entire dataset is used to train the model on a local machine, and the trained model is then deployed to a server.

Online ML: Data is fed into the model dynamically, and the model keeps learning as new data arrives.

Model-based ML: The training data is summarized into a model (e.g., a best-fit line), and predictions come from that model.

Instance-based ML: All the training data is kept as the "model"; at prediction time, distances between the test data and the stored training data are computed - a lazy learner.
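The instance-based idea can be sketched with K-nearest neighbors, the classic lazy learner (the iris dataset and k=3 are assumptions for illustration):

```python
# Instance-based ("lazy") learning sketch: KNN stores the training data
# as its model and computes distances only at prediction time.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)              # "training" mostly just stores the data
print(knn.predict(X[:5]))  # neighbors are looked up here, at query time
```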

MLDLC - ML Development Life Cycle


1. Frame the problem

2. Gather the data

3. Data Preprocessing

4. Exploratory Data Analysis (EDA)

5. Feature Engineering and Feature Selection

6. Model Training, Evaluation and selection

7. Model Deployment

8. Testing

9. Optimize

1. Frame the problem:

2. Gather the data:

Loading a CSV file

# Import necessary libraries
import pandas as pd

# Load data from a CSV file
data = pd.read_csv('filename.csv')

# Print the first 5 rows of the dataframe
print(data.head())

Collection of data from an API

# Import necessary libraries
import requests
import json

# Define the API endpoint
url = 'https://api.example.com/data'

# Send a GET request to the API
response = requests.get(url)

# Convert the response to JSON format
data = response.json()

# Print the data
print(json.dumps(data, indent=4))

https://youtu.be/roTZJaxjnJc?feature=shared

Web Scraping:

# Import necessary libraries
from bs4 import BeautifulSoup
import requests

# Specify the URL
url = 'https://www.example.com'

# Send a GET request to the website
response = requests.get(url)

# Parse the HTML content
soup = BeautifulSoup(response.content, 'html.parser')

# Print out the parsed HTML
print(soup.prettify())

https://youtu.be/8NOdgjC1988?feature=shared

From JSON/SQL

https://youtu.be/fFwRC-fapIU?feature=shared

3. Data Preprocessing

Structural Issues

Data from different sources - not compatible formats

Remove Duplicates

Handle Missing Values

Outliers

Scale - Standardization or Normalization

Few General Operations:

df.shape        # (rows, columns) - an attribute, not a method

df.head()
df.tail()
df.sample()

df.isnull().sum()
df.duplicated().sum()

df.describe()   # High-level summary statistics
df.info()       # Column details

df.corr()
df.corr()['Age']

Here are some of the operations we perform during data preprocessing, along
with their respective Python codes:

1. Removing Duplicates:

import pandas as pd

# Assuming df is your DataFrame
df = pd.read_csv('filename.csv')

# Removing duplicates
df = df.drop_duplicates()

2. Handling Missing Values:

# Fill missing values with a constant, or with the column's mean or median
df = df.fillna(value)   # 'value' is a placeholder

# Or drop rows with missing values
df = df.dropna()

3. Handling Outliers:

# Assuming 'column' is a column in df with outliers
Q1 = df['column'].quantile(0.25)
Q3 = df['column'].quantile(0.75)
IQR = Q3 - Q1

# Removing outliers outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
df = df[~((df['column'] < (Q1 - 1.5 * IQR)) | (df['column'] > (Q3 + 1.5 * IQR)))]

4. Feature Scaling (Standardization):

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

# Assuming X is your features DataFrame
X = pd.DataFrame(scaler.fit_transform(X), columns=X.columns)

5. Feature Scaling (Normalization):

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

# Assuming X is your features DataFrame
X = pd.DataFrame(scaler.fit_transform(X), columns=X.columns)

4. Exploratory Data Analysis (EDA):

"Study of the relationship between INPUT and OUTPUT features"

"Getting an idea about the data"

"Experiment and extract the relationships"

Visualization

Univariate Analysis/ Bivariate Analysis

Outlier Detection

Data Imbalance

1. Visualization
The first question: is the column numerical or categorical?

working on Categorical columns

1. Count plot

sns.countplot(x='survived', data=df)

df['survived'].value_counts().plot(kind='bar')

2. Pie chart

df['survived'].value_counts().plot(kind='pie', autopct='%.2f')

working on Numerical columns:

1. Histogram

import matplotlib.pyplot as plt

plt.hist(df['Age'], bins=10)   # bins control the granularity (like zooming in)

2. Distplot - PDF [Probability Density Function]

sns.histplot(df['Age'], kde=True)   # distplot is deprecated in newer seaborn

3. Box plot

sns.boxplot(df['Age'])

2. Bivariate and Multivariate Analysis

1. Scatter plot [Num vs Num]

bivariate

sns.scatterplot(x='total_bill', y='tip', data=tips)

multivariate

sns.scatterplot(x='total_bill', y='tip', data=tips, hue='sex')

sns.scatterplot(x='total_bill', y='tip', data=tips, hue='sex',
                style='smoker', size='size')

# hue - change in color
# style - change in shape
# size - change in size

5. Feature Engineering and Selection:

Selecting Features

Merging columns

Minimizing Columns - Time & Cost efficient
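The column-minimizing idea above can be sketched with scikit-learn's `SelectKBest`, which keeps only the k features most associated with the target (the iris dataset and k=2 are assumptions for illustration):

```python
# Feature selection sketch: keep the 2 features with the highest
# ANOVA F-score relative to the target, shrinking the column count.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)
print(X.shape, '->', X_reduced.shape)
```

Fewer columns means less training time and cost, at the price of discarding weaker signals.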

Gradient Descent

https://www.youtube.com/watch?v=qg4PchTECck&list=PLqwozWPBo-FvuHWx3_aYwG2WVdbb-wC6q&index=2

import numpy as np

# Toy data (assumed for illustration): y = 2x + 1 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 100)
y = 2 * X + 1 + rng.normal(scale=0.1, size=100)
n = len(X)

# Initialize parameters
learning_rate = 0.01
num_iterations = 1000
m, theta = 0.0, 0.0   # slope and intercept

# Gradient Descent
for i in range(num_iterations):
    prediction = m * X + theta
    error = prediction - y
    m = m - learning_rate * (1 / n) * np.dot(X, error)
    theta = theta - learning_rate * (1 / n) * error.sum()

print("Gradient Descent converged at m =", m, ", theta =", theta)

Linear Regression:
Use linear regression in machine learning when you have a continuous target
variable and want to model the linear relationship between input features and the
target, making it suitable for predicting numerical outcomes.

https://www.youtube.com/watch?v=CtsRRUddV2s

# Import necessary libraries
import numpy as np
from sklearn.linear_model import LinearRegression

# Load dataset (set the file name and n_features for your data)
dataset = np.loadtxt("[dataset_file_name]", delimiter=",")
X = dataset[:, 0:n_features]   # X: 2D array of feature columns
y = dataset[:, n_features]     # y: 1D array of target values

# Train the model
regr = LinearRegression()
regr.fit(X, y)

# Make predictions
predictions = regr.predict(X)

# Evaluate model performance (R^2 score)
score = regr.score(X, y)

Logistic regression:

https://www.youtube.com/watch?v=L_xBe7MbPwk
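A minimal logistic-regression sketch for binary classification. The breast-cancer dataset, the scaling step, and the split parameters are assumptions for illustration:

```python
# Logistic regression predicts class probabilities; scaling the features
# first helps the solver converge.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))   # accuracy on held-out data
```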

Unsupervised Learning

PCA - Principal Component Analysis

https://www.youtube.com/watch?v=FD4DeN81ODY
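A hedged PCA sketch: project the 4-feature iris data onto its two leading principal components (the dataset and `n_components=2` are assumptions for illustration):

```python
# PCA finds the directions of maximum variance and projects the data
# onto them, reducing dimensionality while keeping most of the signal.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape)                     # 2 columns instead of 4
print(pca.explained_variance_ratio_)  # variance captured per component
```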
