ML Lab Manual With Statistical Formulas

The document outlines a series of experiments aimed at installing Python and essential libraries for machine learning, performing mathematical operations with NumPy, and using Pandas for CSV file handling. It includes procedures for statistical calculations and visualizations using Matplotlib. Each experiment provides code examples and expected outputs to verify successful execution.

Uploaded by

19057cme009

Experiment 1: Installing Python and Required Packages

Aim: To install Python and essential machine learning libraries: NumPy, Pandas, Matplotlib, and
Scikit-learn.

Theory: Python is widely used for data science and machine learning because of its simple syntax
and vast ecosystem of libraries. Instead of coding everything from scratch, we use prebuilt libraries
like:

• NumPy: for numerical computations
• Pandas: for data handling and manipulation
• Matplotlib: for data visualization
• Scikit-learn: for implementing machine learning models

Software Requirements:

• Anaconda Distribution (Python 3.x)
• Internet Connection
• Jupyter Notebook / VS Code / Any Python IDE

Procedure:

1. Open a browser and visit the official Anaconda website:

https://www.anaconda.com/products/distribution

2. Download the latest version of Anaconda for your OS (Windows/macOS/Linux).

3. Run the downloaded installer and follow the steps:

- Accept license agreement

- Select 'Just Me'

- Choose installation location (default or custom)

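- Optionally add Anaconda to PATH and register it as the default Python interpreter (the installer itself marks the PATH option as not recommended; Anaconda Prompt works without it)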

- Click Install

4. Once installed, open Anaconda Navigator (from the Start menu on Windows).

5. Launch Jupyter Notebook.

6. Inside Jupyter, click on New > Python 3 to start a new notebook.

7. Install the required libraries (if not preinstalled) using Anaconda Prompt:

conda install numpy pandas matplotlib scikit-learn

OR using pip:

pip install numpy pandas matplotlib scikit-learn

8. In the notebook, type and run the following code to verify installation:
Code:

import numpy
import pandas
import matplotlib
import sklearn

print("All packages are installed and imported successfully.")

Expected Output:

All packages are installed and imported successfully.

Output: When you run the code, it should display the above message without any errors.

Result: Successfully installed Python and verified the working of essential ML libraries: NumPy,
Pandas, Matplotlib, and Scikit-learn.
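
The import check above confirms that the packages load, but not which versions are present. A minimal sketch of a version report follows, assuming the packages were installed under their usual distribution names; the helper name installed_versions is hypothetical and not part of the manual.

```python
from importlib import metadata

# Distribution names as used by "pip install" / "conda install".
PACKAGES = ["numpy", "pandas", "matplotlib", "scikit-learn"]


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version if version else 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can be added with the conda or pip command from step 7.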

Experiment 2: Mathematical Operations on Vectors and Matrices

Aim: To perform mathematical operations such as addition, subtraction, multiplication, and
transpose on vectors and matrices using NumPy.

Theory: Matrices and vectors form the backbone of machine learning models. NumPy is a powerful
library that simplifies numerical computations and allows efficient matrix manipulations.

Software Requirements:

• Python with NumPy installed
• Jupyter Notebook or Python IDE

Procedure:

1. Import the NumPy library.

2. Create arrays representing vectors and matrices.

3. Perform operations such as:
- Matrix addition
- Matrix multiplication
- Transpose
- Scalar multiplication

Code:

import numpy as np

# Define matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Operations
print("Matrix A:\n", A)
print("Matrix B:\n", B)
print("Addition:\n", A + B)
print("Dot Product:\n", np.dot(A, B))
print("Transpose of A:\n", A.T)

Expected Output:

Matrix A:
 [[1 2]
 [3 4]]
Matrix B:
 [[5 6]
 [7 8]]
Addition:
 [[ 6  8]
 [10 12]]
Dot Product:
 [[19 22]
 [43 50]]
Transpose of A:
 [[1 3]
 [2 4]]

Output: Displays the result of matrix operations performed using NumPy.

Result: Successfully executed vector and matrix operations using NumPy.
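
The aim and procedure also mention subtraction, scalar multiplication, and vectors, which the listing above does not exercise. The following sketch covers them with the same matrices A and B; the vectors v and w are hypothetical example values added here for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

difference = A - B       # element-wise subtraction
scaled = 2 * A           # scalar multiplication
elementwise = A * B      # element-wise product, NOT the matrix product
matrix_product = A @ B   # matrix product; same as np.dot(A, B) for 2-D arrays

# Hypothetical example vectors
v = np.array([1, 2])
w = np.array([3, 4])
vector_dot = np.dot(v, w)     # dot product of two vectors (a scalar)
matrix_times_vector = A @ v   # matrix-vector product (a vector)

print("A - B:\n", difference)
print("2 * A:\n", scaled)
print("A * B (element-wise):\n", elementwise)
print("A @ B (matrix product):\n", matrix_product)
print("v . w:", vector_dot)
print("A @ v:", matrix_times_vector)
```

Note that * multiplies matching entries, while @ (or np.dot) performs the row-by-column matrix product, so the two generally give different results.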

Experiment 3: Creating, Loading, and Saving CSV Files

Aim: To create, load, and save datasets using CSV (Comma-Separated Values) files with
Python Pandas.

Theory: CSV is a simple file format used to store tabular data. Pandas provides powerful
tools for reading from and writing to CSV files, which are commonly used in machine
learning workflows for storing datasets.

Software Requirements:

• Python
• Pandas

Procedure:

1. Import the Pandas library.

2. Create a DataFrame from scratch.
3. Save the DataFrame to a CSV file.
4. Load the CSV file back into a DataFrame.
5. Display and manipulate the data.

Code:
import pandas as pd

# Step 1: Create data
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Department': ['IT', 'HR', 'Finance']
}

# Step 2: Create DataFrame
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# Step 3: Save to CSV (index=False leaves out the row-index column)
df.to_csv('employees.csv', index=False)
print("\nData saved to employees.csv")

# Step 4: Load from CSV
loaded_df = pd.read_csv('employees.csv')
print("\nLoaded DataFrame:")
print(loaded_df)

Expected Output:
Printed original and loaded DataFrames

Output: Console output showing the original data and data reloaded from the CSV file.

Result: Successfully created, saved, and loaded a CSV file using Pandas.
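
Step 5 of the procedure (display and manipulate the data) is not shown in the listing. A minimal sketch is given below, assuming the same employee data; it writes to a temporary folder instead of the working directory, and the column name AgeNextYear is a hypothetical example.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Department": ["IT", "HR", "Finance"],
})

# Save and reload through a temporary folder so no file is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "employees.csv")
    df.to_csv(csv_path, index=False)
    loaded_df = pd.read_csv(csv_path)

# Check that the data survived the round trip unchanged.
round_trip_ok = loaded_df.equals(df)
print("Round trip preserved the data:", round_trip_ok)

# Simple manipulations: filter rows and add a derived column.
older = loaded_df[loaded_df["Age"] > 28]
loaded_df["AgeNextYear"] = loaded_df["Age"] + 1

print(older)
print(loaded_df)
```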

Experiment 4: Calculations of Mean, Median, Variance, Standard Deviation, Quartiles, and IQR

Aim: To perform basic statistical calculations such as mean, median, variance, standard deviation,
quartiles, and interquartile range using Python.

Theory: These statistical measures help in understanding the distribution and spread of data.

• Mean: Average value
• Median: Middle value in sorted data
• Variance: Measure of spread around the mean
• Standard Deviation: Square root of variance
• Quartiles: Divide data into four parts
• Interquartile Range (IQR): Difference between Q3 and Q1, shows spread of the middle 50%

Software Requirements:

• Python
• Pandas / NumPy

Procedure:

1. Import necessary libraries.

2. Create a sample dataset.

3. Calculate mean, median, variance, and standard deviation.

4. Calculate quartiles and IQR.

Code:

import pandas as pd

# Sample data
data = [12, 15, 20, 21, 22, 25, 27, 30, 33, 35]

# Convert to a Pandas Series
series = pd.Series(data)

# Central tendency
print("Mean:", series.mean())
print("Median:", series.median())

# Spread: Pandas defaults to the sample formulas (divisor n - 1)
print("Variance:", series.var())
print("Standard Deviation:", series.std())

# Quartiles and interquartile range
print("Q1 (25th percentile):", series.quantile(0.25))
print("Q2 (50th percentile):", series.quantile(0.50))
print("Q3 (75th percentile):", series.quantile(0.75))
print("Interquartile Range (IQR):", series.quantile(0.75) - series.quantile(0.25))

Expected Output:

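Mean: 24.0
Median: 23.5
Variance: approximately 55.78 (sample variance, divisor n - 1)
Standard Deviation: approximately 7.47
Q1 (25th percentile): 20.25
Q2 (50th percentile): 23.5
Q3 (75th percentile): 29.25
Interquartile Range (IQR): 9.0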

Output: Displays the calculated values for mean, median, variance, standard deviation, quartiles, and
IQR.

Result: Successfully calculated statistical measures using Python.
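
Pandas computes variance and standard deviation with the sample formula (divisor n - 1) by default, while the formulas in the Appendix use the population form (divisor n). The sketch below, using the same dataset, shows how the ddof argument switches between the two and cross-checks the quartiles with NumPy.

```python
import numpy as np
import pandas as pd

data = [12, 15, 20, 21, 22, 25, 27, 30, 33, 35]
series = pd.Series(data)

sample_var = series.var()            # ddof=1 by default: divides by n - 1
population_var = series.var(ddof=0)  # divides by n, as in the Appendix
numpy_var = np.var(data)             # NumPy defaults to ddof=0 (population)

# Quartiles with NumPy (linear interpolation, same default as Pandas)
q1, q2, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

print("Sample variance (n - 1):", sample_var)
print("Population variance (n):", population_var)
print("NumPy variance (default):", numpy_var)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
```

For this dataset the sum of squared deviations is 502, so the sample variance is 502 / 9 and the population variance is 502 / 10 = 50.2.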

Experiment 5: Basic Plots using Matplotlib for an Example Dataset

Aim: To visualize data using basic plots like line plot, bar chart, histogram, and scatter plot using
Matplotlib.

Theory: Matplotlib is a widely-used Python library for 2D plotting. Visualizing data helps in
understanding patterns, distributions, and relationships. Key types of plots include:

• Line Plot: Shows trends over time.
• Bar Chart: Compares categorical data.
• Histogram: Displays frequency distribution.
• Scatter Plot: Shows relationship between two variables.

Software Requirements:

• Python
• Matplotlib
• Pandas (optional for dataset handling)

Procedure:
1. Import required libraries.

2. Prepare or load example dataset.

3. Create basic visualizations using matplotlib.

Code:

import matplotlib.pyplot as plt
import numpy as np

# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 20, 15, 25, 30]

# Line plot
plt.figure(figsize=(6,4))
plt.plot(x, y, marker='o')
plt.title("Line Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()

# Bar chart
plt.bar(x, y)
plt.title("Bar Chart")
plt.xlabel("X")
plt.ylabel("Values")
plt.show()

# Histogram
data = [12, 15, 20, 20, 21, 22, 25, 27, 30, 33, 35, 35, 35]
plt.hist(data, bins=5, color='skyblue')
plt.title("Histogram")
plt.xlabel("Values")
plt.ylabel("Frequency")
plt.show()

# Scatter plot
x2 = np.random.rand(50)
y2 = np.random.rand(50)
plt.scatter(x2, y2, color='green')
plt.title("Scatter Plot")
plt.xlabel("X")
plt.ylabel("Y")
plt.show()

Expected Output:

• Multiple plots: line, bar, histogram, and scatter plot

Output: Visualizations appear in separate windows (when run as a script) or inline below the cell (in Jupyter).

Result: Successfully created various plots to visualize data using Matplotlib.
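
For a lab record it is often convenient to keep all four plots in one image file. The sketch below is one possible way to do this, assuming the script runs without a display (hence the non-interactive Agg backend); the file name basic_plots.png and the fixed random seed are hypothetical choices, not part of the manual.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files, no window
import matplotlib.pyplot as plt
import numpy as np

x = [1, 2, 3, 4, 5]
y = [10, 20, 15, 25, 30]
data = [12, 15, 20, 20, 21, 22, 25, 27, 30, 33, 35, 35, 35]

# A seeded generator makes the scatter plot reproducible between runs.
rng = np.random.default_rng(seed=0)
x2 = rng.random(50)
y2 = rng.random(50)

# One figure with a 2 x 2 grid of axes, one plot type per cell.
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].plot(x, y, marker="o")
axes[0, 0].set_title("Line Plot")

axes[0, 1].bar(x, y)
axes[0, 1].set_title("Bar Chart")

axes[1, 0].hist(data, bins=5, color="skyblue")
axes[1, 0].set_title("Histogram")

axes[1, 1].scatter(x2, y2, color="green")
axes[1, 1].set_title("Scatter Plot")

fig.tight_layout()

# Hypothetical output location: the system temporary folder.
output_path = os.path.join(tempfile.gettempdir(), "basic_plots.png")
fig.savefig(output_path, dpi=150)
plt.close(fig)
print("Saved figure to", output_path)
```

In Jupyter, the matplotlib.use("Agg") line can be dropped so that the figure is also displayed inline.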

Appendix: Basic Statistical Formulas (For Diploma Students)

1. Mean (µ):
$\mu = \frac{x_1 + x_2 + \dots + x_n}{n}$

2. Median:
If n is odd → Middle value of sorted data
If n is even → Median = (middle1 + middle2) / 2

3. Mode:
The value that appears most frequently in the dataset.

4. Variance (σ²):
$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$

5. Standard Deviation (σ):
$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}$

These formulas help you understand the spread and average behavior of data.
They are useful in preprocessing and model evaluation in Machine Learning.
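
The formulas above can be written directly in plain Python, which may help when checking library results by hand. This is a minimal sketch: the function names and the sample list are hypothetical examples, and the variance and standard deviation use the population form (divisor n) exactly as written in this Appendix.

```python
import math
from collections import Counter


def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode(values):
    """Most frequent value; if several tie, the one seen first is returned."""
    return Counter(values).most_common(1)[0][0]


def population_variance(values):
    """Average squared deviation from the mean (divisor n)."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical example data
print("Mean:", mean(sample))
print("Median:", median(sample))
print("Mode:", mode(sample))
print("Variance:", population_variance(sample))
print("Standard Deviation:", population_std(sample))
```

For this sample the values sum to 40 and the squared deviations sum to 32, so the mean is 5, the variance is 32 / 8 = 4, and the standard deviation is 2.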
