Let's address each of your questions with explanations and Python code where
applicable.
1. How to calculate important numbers based on data sets, how to use various Python
modules, and how to make functions that are able to predict the outcome based on
what we have learned (Small Dataset)
For a small dataset, you can calculate various descriptive statistics and
potentially build simple predictive models. Here's how:
Calculating Important Numbers & Using Python Modules:
Pandas: For data manipulation and analysis.
NumPy: For numerical computations.
Statistics: For basic statistical functions.
import pandas as pd
import numpy as np
import statistics
# Sample small dataset (replace with your actual data)
data = {'Feature1': [1, 2, 3, 4, 5],
        'Feature2': [2, 4, 1, 5, 3],
        'Target': [0, 1, 0, 1, 0]}
df = pd.DataFrame(data)
# Descriptive Statistics
print("Descriptive Statistics:")
print(df.describe())
# Specific calculations
print("\nMean of Feature1:", df['Feature1'].mean())
print("Median of Feature2:", df['Feature2'].median())
print("Standard Deviation of Target:", df['Target'].std())
# Using NumPy
print("\nNumPy Array of Feature1:", np.array(df['Feature1']))
# Using statistics module
print("\nVariance of Feature2 (statistics):", statistics.variance(df['Feature2']))
Making Predictive Functions (Simple Example - based on a rule):
def simple_predictor(feature1_value, feature2_value):
    """A very simple predictor based on arbitrary rules."""
    if feature1_value > 3 and feature2_value < 4:
        return 1
    else:
        return 0

# Example prediction
new_feature1 = 4
new_feature2 = 2
prediction = simple_predictor(new_feature1, new_feature2)
print(f"\nPrediction for Feature1={new_feature1}, Feature2={new_feature2}: {prediction}")
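For a predictor that actually learns from the data instead of hand-written rules, a scikit-learn model can be fitted to the same small DataFrame. The following is a minimal sketch assuming scikit-learn is installed; LogisticRegression is an illustrative choice, and with only five rows the fit is for demonstration rather than accuracy.

from sklearn.linear_model import LogisticRegression

# Illustrative sketch: fit a simple model on the small df above instead of
# hand-written rules (with only 5 rows this is a demonstration, not a
# reliable model).
X = df[['Feature1', 'Feature2']].values
y = df['Target'].values
model = LogisticRegression()
model.fit(X, y)

def learned_predictor(feature1_value, feature2_value):
    """Predict the target using the fitted model."""
    return int(model.predict([[feature1_value, feature2_value]])[0])

print("Learned prediction for Feature1=4, Feature2=2:", learned_predictor(4, 2))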
Program 2: How can we get Big Data Sets? Learn: Data Distribution, Normal Data
Distribution, Random Data Distribution, Scatter Plot.
Getting Big Data Sets:
Public Datasets:
Kaggle Datasets: A vast collection of datasets for various machine learning tasks.
UCI Machine Learning Repository: Classic datasets for machine learning research.
Google Dataset Search: A search engine for publicly available datasets.
Government Open Data Portals: Many governments provide open data (e.g., data.gov in
the US, data.gov.in in India).
Academic Research Datasets: Researchers often make their data public.
APIs: Many companies and services provide APIs to access their data (e.g., Twitter
API, financial APIs).
Web Scraping: If data is publicly available on websites, you can use libraries like
Beautiful Soup and Scrapy (be mindful of website terms of service).
Data Generation: For specific purposes, you can generate synthetic big data using
libraries like NumPy or specialized tools (see the sketch after this list).
Cloud Storage: Platforms like AWS S3, Google Cloud Storage, and Azure Blob Storage
often host large datasets.
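As a concrete illustration of the Data Generation option above, here is a minimal sketch that builds a synthetic dataset with one million rows using NumPy's random generators; the column names, sizes, and distributions are illustrative choices, not part of any particular dataset.

import numpy as np
import pandas as pd

# Minimal sketch: generate a synthetic "big" dataset with NumPy's random
# generators (column names and distributions are illustrative choices).
rng = np.random.default_rng(seed=0)
n_rows = 1_000_000

big_df = pd.DataFrame({
    'age': rng.integers(18, 80, size=n_rows),
    'income': rng.normal(loc=50_000, scale=15_000, size=n_rows),
    'purchased': rng.choice([0, 1], size=n_rows, p=[0.7, 0.3]),
})

print(big_df.shape)
print(big_df.head())
print(big_df.describe())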
Learning about Data Distribution:
Data Distribution: Describes how the values of a variable are spread out across its
range. Understanding the distribution is crucial for choosing appropriate
statistical methods and machine learning models.
Normal Data Distribution (Gaussian Distribution):
A bell-shaped, symmetrical distribution where most of the data points cluster
around the mean.
Characterized by its mean (μ) and standard deviation (σ).
Many natural phenomena tend to follow a normal distribution (e.g., height, weight).
In Python, you can visualize it using matplotlib.pyplot.hist() and generate normal
data using numpy.random.normal().
import numpy as np
import matplotlib.pyplot as plt
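# A minimal sketch continuing the imports above: generate normally distributed
# data with numpy.random.normal() and visualize it with a histogram; the mean,
# standard deviation, and sample sizes are illustrative values.
normal_data = np.random.normal(loc=5.0, scale=1.0, size=100000)
plt.hist(normal_data, bins=100)
plt.title("Normal (Gaussian) Data Distribution")
plt.show()

# Random (uniform) data distribution for comparison
uniform_data = np.random.uniform(low=0.0, high=10.0, size=100000)
plt.hist(uniform_data, bins=100)
plt.title("Random (Uniform) Data Distribution")
plt.show()

# Scatter plot of two related variables (y loosely depends on x)
x = np.random.normal(5.0, 1.0, 1000)
y = x + np.random.normal(0.0, 0.5, 1000)
plt.scatter(x, y)
plt.title("Scatter Plot")
plt.xlabel("x")
plt.ylabel("y")
plt.show()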
Program 3: Build an Artificial Neural Network by implementing the Backpropagation
algorithm and test the same using appropriate data sets.
Implementing backpropagation from scratch is a significant task. Here's a
simplified conceptual outline and a basic Python implementation for a two-layer
network:
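Below is a minimal from-scratch sketch of such a two-layer network, trained with backpropagation on the XOR problem as a small test case; the layer sizes, learning rate, and number of epochs are illustrative choices, and the same structure can be tested on other small datasets.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(a):
    # Derivative of the sigmoid, expressed in terms of its output a = sigmoid(x)
    return a * (1.0 - a)

# Training data: XOR (an illustrative small dataset)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

np.random.seed(42)
input_size, hidden_size, output_size = 2, 4, 1
learning_rate = 0.5  # illustrative value; may need tuning

# Weight and bias initialisation
W1 = np.random.randn(input_size, hidden_size)
b1 = np.zeros((1, hidden_size))
W2 = np.random.randn(hidden_size, output_size)
b2 = np.zeros((1, output_size))

for epoch in range(10000):
    # Forward pass
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: gradients of the mean squared error
    error = output - y
    d_output = error * sigmoid_derivative(output)
    d_hidden = (d_output @ W2.T) * sigmoid_derivative(hidden)

    # Gradient descent updates
    W2 -= learning_rate * hidden.T @ d_output
    b2 -= learning_rate * d_output.sum(axis=0, keepdims=True)
    W1 -= learning_rate * X.T @ d_hidden
    b1 -= learning_rate * d_hidden.sum(axis=0, keepdims=True)

    if epoch % 2000 == 0:
        print(f"Epoch {epoch}, MSE: {np.mean(error ** 2):.4f}")

print("\nPredictions after training:")
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 3))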
Program 4: The probability that it is Friday and that a student is absent is 3%.
Since there are 5 school days in a week, the probability that it is Friday is 20%.
What is the probability that a student is absent given that today is Friday?
Apply Bayes' rule in Python to get the result.
Let:
P(A∩F) be the probability that a student is absent AND it is Friday = 3% = 0.03
P(F) be the probability that it is Friday = 20% = 0.20
P(A∣F) be the probability that a student is absent GIVEN that it is Friday (what we
want to find)
Bayes' Rule states:
P(A∣F) = P(F∣A) · P(A) / P(F)
However, we are directly given P(A∩F), which is equal to P(F∣A) · P(A). So, we can use
a simplified form:
P(A∣F) = P(A∩F) / P(F)
Now, let's implement this in Python:
prob_absent_and_friday = 0.03
prob_friday = 0.20
prob_absent_given_friday = prob_absent_and_friday / prob_friday
print(f"The probability that a student is absent given that today is Friday is:
{prob_absent_given_friday:.2f}")
Output:
The probability that a student is absent given that today is Friday is: 0.15
So, the probability that a student is absent given that today is Friday is 15%.
Program 5: Write a program to implement the k-Nearest Neighbour algorithm to
classify the Iris data set.
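A compact sketch using scikit-learn's built-in Iris dataset and KNeighborsClassifier; the value of k, the train/test split, and the example sample are illustrative choices.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris dataset (150 samples, 4 features, 3 classes)
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42)

# Train a k-NN classifier with k = 3 (an illustrative choice)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = knn.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# Classify a new sample: sepal length, sepal width, petal length, petal width (cm)
new_sample = [[5.0, 3.5, 1.4, 0.2]]
print("Predicted class:", iris.target_names[knn.predict(new_sample)][0])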
Program 9: Write a program to demonstrate the working of the decision tree based
ID3 algorithm. Use an appropriate data set for building the decision tree and
apply this knowledge to classify a new sample.
import pandas as pd
import numpy as np
from collections import Counter
import math
class DecisionTreeID3:
    def __init__(self, min_samples_split=2, max_depth=None):
        self.min_samples_split = min_samples_split
        self.max_depth = max_depth
        self.root = None

    def _entropy(self, s):
        """Calculates the entropy of a dataset."""
        class_counts = Counter(s)
        entropy = 0
        for count in class_counts.values():
            probability = count / len(s)
            entropy -= probability * math.log2(probability)
        return entropy

    def _information_gain(self, dataset, feature, target):
        """Calculates the information gain of a feature."""
        total_entropy = self._entropy(dataset[target])
        weighted_entropy = 0
        for value in dataset[feature].unique():
            subset = dataset[dataset[feature] == value][target]
            probability = len(subset) / len(dataset)
            weighted_entropy += probability * self._entropy(subset)
        return total_entropy - weighted_entropy

    def _split_dataset(self, dataset, feature, value):
        """Splits the dataset based on a feature and its value."""
        left_subset = dataset[dataset[feature] == value]
        return left_subset

    def _choose_best_feature(self, dataset, features, target):
        """Chooses the best feature to split on based on information gain."""
        best_gain = -1
        best_feature = None
        for feature in features:
            gain = self._information_gain(dataset, feature, target)
            if gain > best_gain:
                best_gain = gain
                best_feature = feature
        return best_feature

    def _build_tree(self, dataset, features, target, depth=0):
        """Recursively builds the decision tree."""
        if len(np.unique(dataset[target])) == 1:
            return dataset[target].iloc[0]  # Pure node, return the class
        if len(dataset) < self.min_samples_split or not features or (self.max_depth is not None and depth >= self.max_depth):
            return Counter(dataset[target]).most_common(1)[0][0]  # Return majority class
        best_feature = self._choose_best_feature(dataset, features, target)
        if best_feature is None:  # No information gain
            return Counter(dataset[target]).most_common(1)[0][0]
        tree = {best_feature: {}}
        remaining_features = [f for f in features if f != best_feature]
        for value in dataset[best_feature].unique():
            subset = self._split_dataset(dataset, best_feature, value)
            tree[best_feature][value] = self._build_tree(subset, remaining_features, target, depth + 1)
        return tree

    def fit(self, dataset, target_column):
        """Fits the decision tree model to the training data."""
        self.target_column = target_column
        self.features = [col for col in dataset.columns if col != target_column]
        self.root = self._build_tree(dataset, self.features, self.target_column)

    def predict(self, sample):
        """Predicts the class label for a new sample."""
        def traverse_tree(tree, sample):
            feature = list(tree.keys())[0]
            if feature not in sample:
                # Handle missing feature in the sample (can return majority or raise an error)
                return None  # Or implement a more sophisticated handling
            value = sample[feature]
            if value not in tree[feature]:
                # Handle unseen value (can return majority of the branch or None)
                return None  # Or implement a more sophisticated handling
            subtree = tree[feature][value]
            if isinstance(subtree, dict):
                return traverse_tree(subtree, sample)
            else:
                return subtree
        return traverse_tree(self.root, sample)
# --- Example Usage with a Simple Play Tennis Dataset ---
# Create a sample dataset
data = {
'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy',
'Overcast', 'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy'],
'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool', 'Mild',
'Cool', 'Mild', 'Mild', 'Mild', 'Hot', 'Mild'],
'Humidity': ['High', 'High', 'High', 'High', 'Normal', 'Normal', 'Normal',
'High', 'Normal', 'Normal', 'Normal', 'High', 'Normal', 'High'],
'Windy': [False, True, False, False, False, True, True, False, False, False,
True, True, False, True],
'PlayTennis': ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes',
'Yes', 'Yes', 'Yes', 'Yes', 'No']
}
df = pd.DataFrame(data)
# Initialize and train the ID3 decision tree
id3_tree = DecisionTreeID3(min_samples_split=2, max_depth=None)
id3_tree.fit(df, 'PlayTennis')
# Print the learned decision tree (can be nested and hard to read for complex trees)
print("Learned Decision Tree:")
import pprint
pprint.pprint(id3_tree.root)
# Classify a new sample
new_sample = {'Outlook': 'Sunny', 'Temperature': 'Cool', 'Humidity': 'High',
'Windy': False}
prediction = id3_tree.predict(new_sample)
print(f"\nPrediction for {new_sample}: {prediction}")
new_sample_2 = {'Outlook': 'Rainy', 'Temperature': 'Mild', 'Humidity': 'Normal',
'Windy': True}
prediction_2 = id3_tree.predict(new_sample_2)
print(f"Prediction for {new_sample_2}: {prediction_2}")