Scikit-Learn
Cheat Sheet
Simple tools for data mining, data
analysis, and machine learning
by
Numan
Scikit-Learn is a Python
library that provides
simple and efficient
tools for data mining,
data analysis, and
machine learning.
Data Preprocessing
sklearn.preprocessing.StandardScaler()
Standardizes features by removing the mean and
scaling to unit variance.
sklearn.preprocessing.MinMaxScaler()
Scales features to a given range, typically [0, 1].
sklearn.preprocessing.OneHotEncoder()
Converts categorical values into one-hot encoded
binary vectors.
sklearn.preprocessing.LabelEncoder()
Encodes labels with values between zero and the
number of classes minus one.
sklearn.impute.SimpleImputer()
Handles missing values by replacing them with specified
values (e.g., mean, median).
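A minimal usage sketch of these transformers (the toy array below is made up for illustration; fit on training data, then transform):
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, 6.0]])  # toy data with a missing value

X_imputed = SimpleImputer(strategy='mean').fit_transform(X)   # fill missing values with column means
X_scaled = StandardScaler().fit_transform(X_imputed)          # zero mean, unit variance per feature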
Train-Test Split
Splits arrays or matrices into random train and
test subsets.
sklearn.model_selection.train_test_split(
    data,
    test_size=0.2,
    shuffle=True,
    random_state=42,
)
Don’t forget to specify the random state, so that
the results are reproducible!
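In the common case of separate features and labels, the split returns four arrays. A short sketch (the iris dataset here is just a stand-in):
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # toy dataset for illustration
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42
)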
Model Training
sklearn.linear_model.LinearRegression()
Fits a linear model with coefficients to minimize the
residual sum of squares.
sklearn.linear_model.LogisticRegression()
Applies logistic regression for binary or multiclass
classification tasks.
sklearn.tree.DecisionTreeClassifier()
A decision tree classifier that uses a tree structure to
make predictions.
sklearn.ensemble.RandomForestClassifier()
A meta-estimator that fits a number of decision trees
on various sub-samples of the dataset.
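All of these estimators share the same fit/predict interface. A minimal sketch (iris is a stand-in dataset; the hyperparameter values are illustrative, not recommendations):
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)     # learn from the training data
y_pred = model.predict(X_test)  # predict labels for unseen data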
Model Evaluation (Classification)
sklearn.metrics.accuracy_score()
Calculates the accuracy classification score
(proportion of correct predictions).
sklearn.metrics.precision_score()
Measures precision; useful for binary classification to
assess the positive class.
sklearn.metrics.recall_score()
Measures recall, which is the ability of the classifier to
find all positive samples.
sklearn.metrics.f1_score()
Computes the F1 score, which balances precision and
recall.
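All four metrics take true and predicted labels. A quick sketch with made-up binary labels:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 1, 0, 1, 1]  # toy ground-truth labels
y_pred = [0, 1, 0, 0, 1, 1]  # toy predictions

print(accuracy_score(y_true, y_pred))   # share of correct predictions
print(precision_score(y_true, y_pred))  # of predicted positives, how many are correct
print(recall_score(y_true, y_pred))     # of actual positives, how many were found
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall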
Model Evaluation (Regression)
sklearn.metrics.mean_absolute_error()
Computes the mean absolute error for regression
tasks.
sklearn.metrics.mean_squared_error()
Calculates the MSE regression loss, measuring how
close a regression line is to a set of data points.
sklearn.metrics.r2_score()
Calculates R squared - a regression performance
measure based on variance explained.
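The regression metrics follow the same pattern. A sketch with made-up targets:
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, -0.5, 2.0, 7.0]  # toy targets
y_pred = [2.5, 0.0, 2.0, 8.0]   # toy predictions

print(mean_absolute_error(y_true, y_pred))  # average absolute error
print(mean_squared_error(y_true, y_pred))   # average squared error
print(r2_score(y_true, y_pred))             # 1.0 means a perfect fit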
Cross-Validation
Evaluates a score by cross-validation on different
subsets of the data.
sklearn.model_selection.cross_val_score(
    estimator=model,
    X=X_train,
    y=y_train,
    cv=5,  # splitting strategy
    scoring='accuracy',
)
Learning the parameters of a prediction function
and testing it on the same data is a methodological
mistake: the score will be overly optimistic and
won't reveal overfitting.
Hyperparameter Tuning
Hyperparameter tuning is the process of
selecting the optimal values for a machine learning
model’s hyperparameters.
sklearn.model_selection.GridSearchCV()
Performs exhaustive search over specified
parameter values for an estimator.
Basically, brute-force search.
sklearn.model_selection.RandomizedSearchCV()
Randomly samples parameter settings. Uses
fewer resources than GridSearchCV.
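A minimal GridSearchCV sketch (iris is a stand-in dataset and the grid values are arbitrary examples):
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

param_grid = {'n_estimators': [50, 100], 'max_depth': [None, 5]}  # candidate values to try

search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)  # evaluates every combination with cross-validation
print(search.best_params_, search.best_score_)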
Pipeline Creation
Use Pipeline to group multiple processing steps
together:
pipeline = sklearn.pipeline.Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier())
])
You can still use hyperparameter tuning on your
pipelines, just as if they were models (see the sketch below).
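A sketch that tunes the pipeline above (iris is a stand-in dataset, the grid is arbitrary); parameters of pipeline steps are addressed as <step name>__<parameter>:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(random_state=42)),
])

param_grid = {'classifier__n_estimators': [50, 100]}  # step name + '__' + parameter name

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)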
Kostya Numan