IU2041230140                                                              DS-B2
Practical-01
 Aim:-Introduction to Jupyter Notebook.
 Installation
 You can use pip, a handy tool that comes with Python, to install
 Jupyter Notebook like this:
 $ pip install jupyter
 Another popular option is Anaconda, a Python distribution that ships with
 Jupyter Notebook preinstalled.
 Starting the Jupyter Notebook Server
 Open your terminal application, go to a folder of your choice, and run the
 following command:
 $ jupyter notebook
 This will start up Jupyter and your default browser should start (or open a
 new tab) to the following URL: http://localhost:8888/tree
 Your browser should now look something like this:
 Right now you are not actually running a Notebook; you are just running
 the Notebook server.
 Creating a Notebook
 Click on the New button (upper right) and choose Python 3.
 Your web page should now look like this:
 Naming
 You will notice that the word Untitled appears at the top of the page. Let’s
 change it!
 Let’s try writing some code in a cell and running it:
 print('Hello Jupyter!')
                                   Practical-02
Aim:-To Implement Python Basic Programs.
❖ Python program to print "Hello Python"
   print('Hello Python')
Output: Hello Python
❖ Python program to do arithmetical operations
   # Read both operands as text, then convert to float for the arithmetic
   num1 = input('Enter first number: ')
   num2 = input('Enter second number: ')
   # Descriptive names avoid shadowing the built-ins sum() and min()
   total = float(num1) + float(num2)
   difference = float(num1) - float(num2)
   product = float(num1) * float(num2)
   quotient = float(num1) / float(num2)
   print('The sum of {0} and {1} is {2}'.format(num1, num2, total))
   print('The subtraction of {0} and {1} is {2}'.format(num1, num2, difference))
   print('The multiplication of {0} and {1} is {2}'.format(num1, num2, product))
   print('The division of {0} and {1} is {2}'.format(num1, num2, quotient))
Output:
Enter first number: 10
Enter second number: 20
The sum of 10 and 20 is 30.0
The subtraction of 10 and 20 is -10.0
The multiplication of 10 and 20 is 200.0
The division of 10 and 20 is 0.5
❖ Python program to find the area of a triangle
   a = float(input('Enter first side: '))
   b = float(input('Enter second side: '))
   c = float(input('Enter third side: '))
   # Heron's formula: s is the semi-perimeter
   s = (a + b + c) / 2
   area = (s*(s-a)*(s-b)*(s-c)) ** 0.5
   print('The area of the triangle is %0.2f' % area)
Output:
❖ Python program to solve quadratic equation
   import cmath
   a = float(input('Enter a: '))
   b = float(input('Enter b: '))
   c = float(input('Enter c: '))
   # Discriminant; cmath.sqrt also handles negative values
   d = (b**2) - (4*a*c)
   sol1 = (-b-cmath.sqrt(d))/(2*a)
   sol2 = (-b+cmath.sqrt(d))/(2*a)
   print('The solution are {0} and {1}'.format(sol1,sol2))
Output:
Enter a: 8
Enter b: 5
Enter c: 9
The solution are (-0.3125-1.0135796712641785j) and (-0.3125+1.01357967126
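One way to sanity-check the roots is to substitute them back into the equation. The sketch below wraps the same formula in a function and confirms that both roots leave a residual close to zero. The function name `solve_quadratic` and the guard against `a == 0` are assumptions added for illustration, not part of the practical.

```python
import cmath

def solve_quadratic(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (hypothetical helper)."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    d = b**2 - 4*a*c              # discriminant
    root = cmath.sqrt(d)          # complex square root, valid for d < 0 too
    return (-b - root) / (2*a), (-b + root) / (2*a)

sol1, sol2 = solve_quadratic(8, 5, 9)

# Substitute each root back into the equation; the result should be near zero
residuals = [abs(8*r**2 + 5*r + 9) for r in (sol1, sol2)]
print(residuals)
```

Because the discriminant is negative for these inputs, the two roots form a complex-conjugate pair.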
❖ Python program to swap two variables
   P = int(input("Please enter value for P: "))
   Q = int(input("Please enter value for Q: "))
   # Swap using a temporary variable
   temp_1 = P
   P = Q
   Q = temp_1
   print("The Value of P after swapping: ", P)
   print("The Value of Q after swapping: ", Q)
Output:
Please enter value for P: 13
Please enter value for Q: 43
The Value of P after swapping: 43
The Value of Q after swapping: 13
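Python can also swap two variables without a temporary variable by using tuple unpacking. The short sketch below uses the same values as the run above; the lowercase names `p` and `q` are chosen only for this illustration.

```python
p, q = 13, 43

# The right-hand side is evaluated first, then unpacked into p and q
p, q = q, p

print("The Value of p after swapping: ", p)
print("The Value of q after swapping: ", q)
```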
❖ Python program to generate a random number
   import random
   n = random.random()
   print(n)
Output:
0.7632870997556201
If we run the code again, we will get a different output, for example:
0.8053503984689108
Generating a Number within a Given Range
   import random
   n = random.randint(0,50)
   print(n)
Output:
40
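The output changes on every run because the generator starts from a different internal state each time. If repeatable numbers are needed, for example while debugging, the generator can be seeded. The sketch below is a minimal illustration; the seed value 42 is an arbitrary assumption.

```python
import random

# Seeding fixes the starting state, so the same sequence is produced each time
random.seed(42)
first = [random.randint(0, 50) for _ in range(3)]

random.seed(42)
second = [random.randint(0, 50) for _ in range(3)]

print(first == second)
```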
❖ Python program to display calendar
   import calendar
   yy = int(input("Enter year: "))
   mm = int(input("Enter month: "))
   print(calendar.month(yy,mm))
Output:
       Enter year: 2022
       Enter month: 6
            June 2022
       Mo Tu We Th Fr Sa Su
              1  2  3  4  5
        6  7  8  9 10 11 12
       13 14 15 16 17 18 19
       20 21 22 23 24 25 26
       27 28 29 30
                                   Practical-03
Aim:-Study of various Machine Learning libraries.
>Python libraries that are used in Machine Learning are:
1.Numpy: NumPy is a very popular Python library for large multi-dimensional array
and matrix processing, with the help of a large collection of high-level mathematical
functions. It is very useful for fundamental scientific computations in Machine
Learning. It is particularly useful for linear algebra, Fourier transform, and random
number capabilities. High-end libraries like TensorFlow use NumPy internally for
manipulation of Tensors.
import numpy as np
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])
w = np.array([11, 12])
print(np.dot(v, w), "\n")
print(np.dot(x, v), "\n")
print(np.dot(x, y))
Output:
219
[29 67]
[[19 22]
[43 50]]
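The linear algebra, Fourier transform, and random number capabilities mentioned above can be sketched briefly as follows. The input values, the seed, and the variable names are assumptions chosen only for illustration.

```python
import numpy as np

# Linear algebra: solve the system A @ sol = b
A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([9.0, 10.0])
sol = np.linalg.solve(A, b)
print(np.allclose(A @ sol, b))

# Fourier transform: discrete FFT of a short signal
signal = np.array([1.0, 0.0, -1.0, 0.0])
spectrum = np.fft.fft(signal)
print(spectrum)

# Random numbers: a seeded generator producing integers in [0, 10)
rng = np.random.default_rng(seed=0)
samples = rng.integers(0, 10, size=5)
print(samples)
```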
2.Pandas: Pandas is a popular Python library for data analysis. It is not directly related
to Machine Learning, but a dataset must be prepared before training, and this is where
Pandas comes in handy: it was developed specifically for data extraction and
analysis. It provides many inbuilt methods for grouping, combining and filtering data.
import pandas as pd
data = {"country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    "population": [200.4, 143.5, 1252, 1357, 52.98] }
data_table = pd.DataFrame(data)
print(data_table)
Output:
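The grouping, combining and filtering methods mentioned in the description can be sketched on the same country data. The split into two tables, the `size_class` column, and the area threshold of 9 are hypothetical choices made only to have something to merge and group by.

```python
import pandas as pd

data_table = pd.DataFrame({
    "country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    "population": [200.4, 143.5, 1252, 1357, 52.98],
})
capitals = pd.DataFrame({
    "country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
})

# Filtering: keep only rows whose population exceeds 1000
populous = data_table[data_table["population"] > 1000]

# Combining: join the two tables on the shared "country" column
combined = data_table.merge(capitals, on="country")

# Grouping: total population per (hypothetical) size class
data_table["size_class"] = data_table["area"].gt(9).map(
    {True: "large", False: "small"})
totals = data_table.groupby("size_class")["population"].sum()

print(populous)
print(combined)
print(totals)
```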
3.Matplotlib: Matplotlib is a very popular Python library for data visualization. Like
Pandas, it is not directly related to Machine Learning. It particularly comes in handy
when a programmer wants to visualize the patterns in the data. It is a 2D plotting library
used for creating 2D graphs and plots. A module named pyplot makes tasks such as
formatting axes easy. It provides various kinds of graphs and plots for data visualization,
viz., histograms, error charts, bar charts, etc.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
plt.plot(x, x, label ='linear')
plt.legend()
plt.show()
Output:
4.TensorFlow: TensorFlow is a very popular open-source library for high performance
numerical computation developed by the Google Brain team in Google. As the name
suggests, Tensorflow is a framework that involves defining and running computations
involving tensors. It can train and run deep neural networks that can be used to develop
several AI applications. TensorFlow is widely used in the field of deep learning
research and application.
import tensorflow as tf
x1 = tf.constant([1, 2, 3, 4])
x2 = tf.constant([5, 6, 7, 8])
result = tf.multiply(x1, x2)
# TensorFlow 2.x executes eagerly, so no Session is needed
print(result.numpy())
Output:
[ 5 12 21 32]
5.Keras: Keras is a very popular Machine Learning library for Python. It is a high-level
neural networks API capable of running on top of TensorFlow, CNTK, or Theano. It can run
seamlessly on both CPU and GPU. Keras makes it really easy for ML beginners to build and
design a Neural Network. One of the best things about Keras is that it allows for easy
and fast prototyping.
6.PyTorch: PyTorch is a popular open-source Machine Learning library for Python
based on Torch, which is an open-source Machine Learning library that is implemented
in C with a wrapper in Lua. It has an extensive choice of tools and libraries that support
Computer Vision, Natural Language Processing (NLP), and many more ML programs.
It allows developers to perform computations on Tensors with GPU acceleration and
also helps in creating computational graphs.
import torch
dtype = torch.float
device = torch.device("cpu")
# N: batch size, D_in: input size, H: hidden size, D_out: output size
N, D_in, H, D_out = 64, 1000, 100, 10
# Random input and target data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# Randomly initialised weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute the predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    # Sum-of-squares loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)
    # Backward pass: gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)
    # Gradient descent update
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
Output:
0 47168344.0
1 46385584.0
2 43153576.0
...
...
...
497 3.987660602433607e-05
  498 3.945609932998195e-05
  499 3.897604619851336e-05
  7.SciPy: SciPy is a very popular library among Machine Learning enthusiasts as it
  contains different modules for optimization, linear algebra, integration and statistics.
  There is a difference between the SciPy library and the SciPy stack. The SciPy library
  is one of the core packages that make up the SciPy stack. SciPy is also very useful for image
  manipulation.
  from scipy.misc import imread, imsave, imresize
  img = imread('D:/Programs/cat.jpg') # path of the image
  print(img.dtype, img.shape)
  # Scale the red, green and blue channels to tint the image
  img_tint = img * [1, 0.45, 0.3]
  imsave('D:/Programs/cat_tinted.jpg', img_tint)
  img_tint_resize = imresize(img_tint, (300, 300))
  imsave('D:/Programs/cat_tinted_resized.jpg', img_tint_resize)
If "from scipy.misc import imread, imsave, imresize" does not work (these functions
were removed from recent SciPy releases), try the code below instead. Note that
imageio provides imread and imsave but not imresize.
  !pip install imageio
  import imageio
  from imageio import imread, imsave
  Original image:
  Tinted image:
Resized tinted image:
8.Scikit-learn: Scikit-learn is one of the most popular ML libraries for classical ML
algorithms. It is built on top of two basic Python libraries, viz., NumPy and SciPy.
Scikit-learn supports most of the supervised and unsupervised learning algorithms.
Scikit-learn can also be used for data mining and data analysis, which makes it a great
tool for those who are starting out with ML.
from sklearn import datasets
from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier
dataset = datasets.load_iris()
model = DecisionTreeClassifier()
model.fit(dataset.data, dataset.target)
print(model)
expected = dataset.target
predicted = model.predict(dataset.data)
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
Output:
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=None, splitter='best')
              precision    recall  f1-score   support
           0       1.00      1.00      1.00        50
           1       1.00      1.00      1.00        50
           2       1.00      1.00      1.00        50
   micro avg       1.00      1.00      1.00       150
   macro avg       1.00      1.00      1.00       150
weighted avg       1.00      1.00      1.00       150
[[50  0  0]
 [ 0 50  0]
 [ 0  0 50]]
9.Theano: Machine Learning is built largely on mathematics and statistics.
Theano is a popular Python library that is used to define, evaluate and optimize
mathematical expressions involving multi-dimensional arrays in an efficient manner. It
achieves this by optimizing the utilization of CPU and GPU. It also makes extensive use of
unit-testing and self-verification to detect and diagnose different types of errors. Theano
is a very powerful library that has been used in large-scale computationally intensive
scientific projects for a long time but is simple and approachable enough to be used by
individuals for their own projects.
import theano
import theano.tensor as T
x = T.dmatrix('x')
s = 1 / (1 + T.exp(-x))
logistic = theano.function([x], s)
logistic([[0, 1], [-1, -2]])
Output:
array([[0.5       , 0.73105858],
       [0.26894142, 0.11920292]])
                               Practical-04
Aim:-Introduction to GitHub Repository.
What is Git about?
Git is a free and open-source distributed version control system designed to
handle everything from small to very large projects with speed and efficiency.
Git supports distributed software development, in which more than one
developer may have access to the source code of a specific application and
can make changes to it that other developers can see.
Git was initially designed and developed by Linus Torvalds in 2005 for Linux
kernel development.
Every git working directory is a full-fledged repository with complete history
and full version tracking capabilities, independent of network access or a central
server.
Git allows a team of people to work together, all using the same files. It
helps the team cope with the confusion that tends to happen when multiple
people are editing the same files.
How does Git work?
A Git repository is a key-value object store where all objects are indexed by their
SHA-1 hash value.
All commits, files, tags, and filesystem tree nodes are different types of objects
living in this repository.
A Git repository is a large hash table with no provision made for hash collisions.
Git specifically works by taking “snapshots” of files.
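To make the key-value idea concrete, the sketch below computes the SHA-1 key that Git assigns to a file's contents (a "blob" object): the hash is taken over a short header, `blob <size>` followed by a NUL byte, and then the file bytes. The helper name `git_blob_id` is an assumption for illustration, and the sketch assumes a repository using the default SHA-1 object format.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id Git would assign to a blob (hypothetical helper)."""
    # Git hashes a header ("blob", the byte length, a NUL byte) plus the content
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b"hello world\n"))
```

Two files with identical contents therefore map to the same key, so the repository stores that content only once.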
   ● Let us see how to host a local repository on GitHub, from the very
     beginning (creating a GitHub account).
A. Creating a GitHub Account
Step 1: Go to github.com, enter the required user credentials, and then click
on the Sign up for GitHub button.
Step 2: Choose the plan that best suits you. The available plans are shown
below:
Step 3: Then click on Finish Sign Up.
The account has been created, and you are automatically redirected to your
Dashboard.
B. Creating a new Repository
   ● Log in to your GitHub account.
   ● On the dashboard, click on the green button labeled New repository.
   ● Make sure to verify the GitHub account by opening the email sent to the
      address provided when creating the account.
   ● Once verification has been done, the following screen appears:
C. Start by giving a repository name and description (optional), and select the
visibility and accessibility mode for the repository.
D. Click on Create repository.
E. The repository (in this case ITE-304 is the repository) is now created. The
created repository looks like this:
And here you go…
                             Practical-05
Aim:-Download the data set and perform the analysis.
CODE:-
from google.colab import files
file = files.upload()
import pandas as pd
df = pd.read_csv('StudentsPerformance.csv')
df.head()
# Show last 5 rows in a DataFrame
df.tail()
# Show last n rows in a DataFrame
n = 10
df.tail(n)
# Getting access to the shape attribute
df.shape
(1000, 8)
# Getting access to the index attribute
df.index
RangeIndex(start=0, stop=1000, step=1)
# Getting access to the column attribute
df.loc[:,"gender"]
# df.iloc[:, 5]
# Data types of each column
df.dtypes
df.info()
print(f"Count : {df.count()}")
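# numeric_only=True skips the text columns, which have no mean or SD
print(f"Mean : {df.mean(numeric_only=True)}")
print(f"SD : {df.std(numeric_only=True)}")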
print(f"Max : {df.max()}")
print(f"Min : {df.min()}")
df.count()
df['math score'].idxmax()
149
df['math score'].idxmin()
59
df.round()
df['math score']
df.loc[: , ["gender","math score"]]
df.loc[: , ["gender","math score"]].dtypes
import numpy as np
df['Language Score'] = np.random.randint(100,size = (1000))
df
# Per-student average of the three subject scores
df["Average Score"] = (df["math score"] + df["reading score"] + df["writing score"]) / 3
df.head()
df['math score'].sort_values(ascending = True)
# Sort the math score in descending order
df['math score'].sort_values(ascending = False)
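A per-student average can also be computed with a row-wise mean, which avoids writing out each column by hand. The sketch below uses a tiny stand-in table so that it runs without the CSV file; the three rows of scores are made-up values for illustration, not taken from StudentsPerformance.csv.

```python
import pandas as pd

# Made-up rows with the same score columns as the dataset
scores = pd.DataFrame({
    "math score": [72, 69, 90],
    "reading score": [72, 90, 95],
    "writing score": [74, 88, 93],
})

score_cols = ["math score", "reading score", "writing score"]

# axis=1 averages across the columns of each row (one value per student)
scores["Average Score"] = scores[score_cols].mean(axis=1)

# Rank students from highest to lowest average
ranked = scores.sort_values("Average Score", ascending=False)
print(ranked)
```

By contrast, calling `.mean()` on a single column returns one number for the whole column, so an expression built from column means would assign the same value to every row.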
                               Practical-06
Aim:-Write a program to implement Linear Regression.
CODE:-
import numpy as np
import matplotlib.pyplot as plt
def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)
def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color = "m",
                marker = "o", s = 30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, y_pred, color = "g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()
def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)
if __name__ == "__main__":
    main()
OUTPUT:-
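As a cross-check on the closed-form estimate used in the program, the sketch below recomputes the coefficients from the centred sums for the same data and compares them with NumPy's least-squares fit. Using `np.polyfit` here is an assumption made for verification only; it is not part of the practical's method.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

# Closed-form least-squares estimates, written with centred sums
m_x, m_y = x.mean(), y.mean()
b_1 = np.sum((x - m_x) * (y - m_y)) / np.sum((x - m_x) ** 2)
b_0 = m_y - b_1 * m_x

# np.polyfit returns coefficients from the highest power down: slope, intercept
slope, intercept = np.polyfit(x, y, deg=1)

print("b_0 =", b_0, "b_1 =", b_1)
print(np.allclose([b_0, b_1], [intercept, slope]))
```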
                               Practical-07
Aim:-Write a program to implement K-Nearest Neighbors.
CODE:
# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import numpy as np
import matplotlib.pyplot as plt
irisData = load_iris()
# Create feature and target arrays
X = irisData.data
y = irisData.target
# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
   X, y, test_size = 0.2, random_state=42)
neighbors = np.arange(1, 9)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))
# Loop over K values
for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # Compute training and test data accuracy
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)
# Generate plot
plt.plot(neighbors, test_accuracy, label = 'Testing dataset Accuracy')
plt.plot(neighbors, train_accuracy, label = 'Training dataset Accuracy')
plt.legend()
plt.xlabel('n_neighbors')
plt.ylabel('Accuracy')
plt.show()
OUTPUT:
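Besides reading the best value of K off the plot, it can be picked programmatically. The sketch below repeats the same split and loop and reports the K with the highest test accuracy. The name `best_k` is an assumption for illustration, and choosing K on the test set is only a rough guide; cross-validation would give a less optimistic estimate.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

neighbors = np.arange(1, 9)

# Test accuracy for each candidate K
test_accuracy = np.array([
    KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train).score(X_test, y_test)
    for k in neighbors
])

# np.argmax returns the first K that reaches the highest accuracy
best_k = int(neighbors[np.argmax(test_accuracy)])
print("Best K:", best_k, "with test accuracy", test_accuracy.max())
```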
                                 Practical-08
Aim:-Write a program for Automatic grouping of similar objects
into sets.
CODE:-
from itertools import groupby
test_list = ['ADITYA', 'coder_2', 'KIRTAN', 'coder_3', 'pro_3']
test_list.sort()
print ("The original list is : " + str(test_list))
res = [list(i) for j, i in groupby(test_list,
     lambda a: a.split('_')[0])]
print ("The grouped list is : " + str(res))
from itertools import groupby
test_list = [' ADITYA ', 'coder_2', ' KIRTAN ', 'coder_3', 'pro_3']
test_list.sort()
print ("The original list is : " + str(test_list))
res = [list(i) for j, i in groupby(test_list,
    lambda a: a.partition('_')[0])]
print ("The grouped list is : " + str(res))
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']
print("The original List is : "+ str(test_list))
x=[]
for i in test_list:
 x.append(i[:i.index("_")])
x=list(set(x))
res=[]
for i in x:
 a=[]
 for j in test_list:
  if(j.find(i)!=-1):
    a.append(j)
 res.append(a)
# printing result
print ("The grouped list is : " + str(res))
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']
print("The original list is : " + str(test_list))
res = [[item for item in test_list if item.startswith(prefix)] for prefix in
set([item[:item.index("_")] for item in test_list])]
print("The grouped list is : " + str(res))
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']
grouped = {}
for s in test_list:
 prefix = s.split('_')[0]
 if prefix not in grouped:
  grouped[prefix] = []
 grouped[prefix].append(s)
res = list(grouped.values())
print(res)
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']
d = {}
for s in test_list:
 key = s.split('_')[0]
 if key in d:
  d[key].append(s)
 else:
  d[key] = [s]
res = list(d.values())
print("The original list is : " + str(test_list))
print("The grouped list is : " + str(res))
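The dictionary-based grouping above can be written a little more compactly with `collections.defaultdict`, which creates the empty list for a new prefix automatically. This is a sketch of an alternative, not a replacement for the programs above; the variable name `groups` is chosen only for this illustration.

```python
from collections import defaultdict

test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']

# Each missing key starts out as an empty list
groups = defaultdict(list)
for item in test_list:
    prefix = item.partition('_')[0]   # text before the first underscore
    groups[prefix].append(item)

res = list(groups.values())
print("The original list is : " + str(test_list))
print("The grouped list is : " + str(res))
```

Unlike `itertools.groupby`, this approach does not need the list to be sorted first, and the groups come out in the order in which each prefix first appears.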