Python Programming
1. Installation of Python -:
   a. Install Python IDLE.
   b. Add to Path variable.
   c. Open the Command Prompt (CMD).
   d. Check the Python version: python --version.
   e. Check the pip version: pip --version.
   f. pip install numpy.
   g. Check that it is installed: pip show numpy.
   h. pip install pandas.
   i. pip show pandas.
   Output -:
   Enter a number: 54
   54 is not a prime number.
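The source for the output above is not included in this record. A minimal sketch that would produce output of this form is given below. It assumes trial division up to the square root; the function name is_prime is hypothetical, and the value is hard-coded where the original presumably used input().

```python
import math

def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True

number = 54  # the record reads this value from the prompt "Enter a number: "
if is_prime(number):
    print(f"{number} is a prime number.")
else:
    print(f"{number} is not a prime number.")
```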
   Output -:
   Enter the first string: Hello
   Enter the second string: World
   Concatenated String: Hello World
                            ASSIGNMENT - 02
1. Write a program to add two matrices manually.
   Source Code -:
   def get_matrix(size):
      matrix = []
      print("Enter the elements row-wise:")
      for i in range(size):
         row = []
         for j in range(size):
             row.append(int(input(f"Element [{i}][{j}]: ")))
         matrix.append(row)
      return matrix
   def add_matrices(matrix1, matrix2):
      size = len(matrix1)
      result = [[0 for _ in range(size)] for _ in range(size)]
      for i in range(size):
         for j in range(size):
             result[i][j] = matrix1[i][j] + matrix2[i][j]
      return result
   def print_matrix(matrix):
      for row in matrix:
         print(row)
   size = int(input("Enter the size of the square matrices: "))
   print("Enter elements for the first matrix:")
   matrix1 = get_matrix(size)
   print("Enter elements for the second matrix:")
   matrix2 = get_matrix(size)
   result = add_matrices(matrix1, matrix2)
   print("The sum of the matrices is:")
   print_matrix(result)
   Output -:
   Enter the size of the square matrices: 2
   Enter elements for the first matrix:
   Enter the elements row-wise:
   Element [0][0]: 1
   Element [0][1]: 2
   Element [1][0]: 3
   Element [1][1]: 4
   Enter elements for the second matrix:
   Enter the elements row-wise:
   Element [0][0]: 5
   Element [0][1]: 6
   Element [1][0]: 7
   Element [1][1]: 8
   The sum of the matrices is:
   [6, 8]
   [10, 12]
   Output -:
   Enter the number of rows for the first matrix: 2
   Enter the number of columns for the first matrix: 2
   Enter the elements of a 2x2 matrix row by row:
   1 2
   3 4
   Enter the number of rows for the second matrix: 2
   Enter the number of columns for the second matrix: 2
   Enter the elements of a 2x2 matrix row by row:
   4 5
   6 7
   Resultant Matrix:
   [16, 19]
   [36, 43]
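The source for this output is missing, but the resultant matrix is consistent with a manual matrix multiplication. A minimal sketch under that assumption follows; multiply_matrices is a hypothetical name, and the two input matrices are the values implied by the result shown above.

```python
def multiply_matrices(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first matrix must equal rows of the second")
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Dot product of row i of a with column j of b.
            for k in range(inner):
                result[i][j] += a[i][k] * b[k][j]
    return result

first = [[1, 2], [3, 4]]
second = [[4, 5], [6, 7]]
print("Resultant Matrix:")
for row in multiply_matrices(first, second):
    print(row)
```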
   Output -:
   [9, 16, 21]
   [24, 25, 24]
   [21, 16, 9]
                          ASSIGNMENT - 03
1. Program -: Checking the installed versions of Python libraries -:
   a. sys b. scipy c. numpy d. matplotlib e. pandas f. sklearn
   a. sys
      Source code -:
      import sys
      print('Python:{}'.format(sys.version))
      Output -:
      Python:3.12.6
   b. scipy
      Source code -:
      import scipy
      print('Scipy:{}'.format(scipy.__version__))
      Output -:
      Scipy:1.13.1
   c. numpy
      Source code -:
      import numpy
      print('Numpy:{}'.format(numpy.__version__))
      Output -:
      Numpy:1.26.4
   d. matplotlib
      Source code -:
      import matplotlib
      print('Matplotlib:{}'.format(matplotlib.__version__))
      Output -:
      Matplotlib:3.9.1
   e. pandas
      Source Code -:
      import pandas
      print('Pandas:{}'.format(pandas.__version__))
      Output -:
      Pandas:2.2.2
   f. sklearn
      Source Code -:
      import sklearn
      print('Sklearn:{}'.format(sklearn.__version__))
      Output -:
      Sklearn:1.5.1
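The six checks above repeat the same pattern, so they can be folded into one loop. The sketch below is one possible way to do that; library_version is a hypothetical helper, and a library that is not installed is reported instead of raising an error.

```python
import importlib
import sys

def library_version(name):
    """Return the version string of an importable module, or None if it is missing."""
    if name == "sys":
        # sys has no __version__; the interpreter version is in sys.version.
        return sys.version.split()[0]
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")

for name in ["sys", "scipy", "numpy", "matplotlib", "pandas", "sklearn"]:
    version = library_version(name)
    print("{}: {}".format(name, version if version is not None else "not installed"))
```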
   Output -:
   2D Array:
   [[1 2 3]
    [4 5 6]
    [7 8 9]]
   Output -:
   Matrix 1:
   [[1 2 3]
    [4 5 6]
    [7 8 9]]
   Matrix 2:
   [[9 8 7]
    [6 5 4]
    [3 2 1]]
   Output -:
   Matrix A:
   [[1 2]
    [3 4]]
   Matrix B:
   [[5 6]
    [7 8]]
   Matrix A * Matrix B:
   [[19 22]
    [43 50]]
   Output -:
   Numpy array:
   [[1. 0. 0. 0.]
    [0. 1. 0. 0.]
    [0. 0. 1. 0.]
    [0. 0. 0. 1.]]
   COO representation:
    (0, 0)   1.0
    (1, 1)   1.0
    (2, 2)   1.0
    (3, 3)   1.0
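The source for this output is not included; it was presumably produced with a SciPy sparse matrix. The idea behind the COO (coordinate) representation is that only the non-zero entries are stored, each as a row index, a column index and a value. The sketch below shows that idea in plain Python and is not SciPy's implementation; to_coo is a hypothetical name.

```python
def to_coo(dense):
    """Return (row, col, value) triples for the non-zero entries of a dense matrix."""
    return [
        (i, j, value)
        for i, row in enumerate(dense)
        for j, value in enumerate(row)
        if value != 0
    ]

# 4x4 identity matrix, as in the output above.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
for i, j, value in to_coo(identity):
    print(f"({i}, {j})\t{value}")
```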
                           ASSIGNMENT - 04
1. Write a program to implement KNN model and to plot first feature and second feature of
   iris dataset.
   Source Code -:
   import numpy as np
   import pandas as pd
   import mglearn
   from sklearn.model_selection import train_test_split
   from sklearn.neighbors import KNeighborsClassifier
   from sklearn.datasets import load_iris
   import matplotlib.pyplot as plt
   #generate dataset
   X,y=mglearn.datasets.make_forge()
   #plot dataset
   mglearn.discrete_scatter(X[:,0], X[:, 1], y)
   plt.legend(["class 0", "class 1"], loc=4)
   plt.xlabel("first feature:")
   plt.ylabel("2nd feature:")
   print(X,y)
   print("X.shape:{}".format(X.shape))
   plt.show()
   Output -:
[[ 9.96346605 4.59676542]
 [11.0329545 -0.16816717]
 [11.54155807 5.21116083]
 [ 8.69289001 1.54322016]
 [ 8.1062269 4.28695977]
 [ 8.30988863 4.80623966]
 [11.93027136 4.64866327]
 [ 9.67284681 -0.20283165]
 [ 8.34810316 5.13415623]
 [ 8.67494727 4.47573059]
 [ 9.17748385 5.09283177]
 [10.24028948 2.45544401]
 [ 8.68937095 1.48709629]
 [ 8.92229526 -0.63993225]
 [ 9.49123469 4.33224792]
 [ 9.25694192 5.13284858]
 [ 7.99815287 4.8525051 ]
 [ 8.18378052 1.29564214]
 [ 8.7337095 2.49162431]
 [ 9.32298256 5.09840649]
 [10.06393839 0.99078055]
 [ 9.50048972 -0.26430318]
 [ 8.34468785 1.63824349]
 [ 9.50169345 1.93824624]
 [ 9.15072323 5.49832246]
 [11.563957 1.3389402 ]] [1 0 1 0 0 1 1 0 1 1 1 1 0 0 1 1 1 0 0 1 0 0 0 0 1 0]
X.shape:(26, 2)
2. Write a program to characterise a dataset, including its keys, shape, classes and
   features.
   Source Code -:
   import mglearn
   import matplotlib.pyplot as plt
   X,y=mglearn.datasets.make_wave(n_samples=40)
   plt.plot(X, y, 'o')
   plt.ylim(-3, 3)
   plt.xlabel("Feature")
   plt.ylabel("Target")
   plt.show()
   Output -:
3. Write a program to characterise the breast cancer dataset, including its keys, shape,
   classes and features.
   Source Code -:
   import numpy as np
   from sklearn.datasets import load_breast_cancer
   cancer = load_breast_cancer()
   print("cancer.keys(): \n{}".format(cancer.keys()))
   print("Shape of cancer data: {}".format(cancer.data.shape))
   # Count how many samples fall into each target class.
   counts = {n: v for n, v in zip(cancer.target_names, np.bincount(cancer.target))}
   print("Sample count per class: \n{}".format(counts))
   print("Feature name: \n{}".format(cancer.feature_names))
   Output -:
   cancer.keys():
   dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename',
   'data_module'])
   Shape of cancer data: (569, 30)
   Sample count per class:
   {'malignant': 212, 'benign': 357}
   Feature name:
   ['mean radius' 'mean texture' 'mean perimeter' 'mean area'
    'mean smoothness' 'mean compactness' 'mean concavity'
    'mean concave points' 'mean symmetry' 'mean fractal dimension'
    'radius error' 'texture error' 'perimeter error' 'area error'
    'smoothness error' 'compactness error' 'concavity error'
    'concave points error' 'symmetry error' 'fractal dimension error'
    'worst radius' 'worst texture' 'worst perimeter' 'worst area'
    'worst smoothness' 'worst compactness' 'worst concavity'
    'worst concave points' 'worst symmetry' 'worst fractal dimension']
4. Write a program to show how KNN classification works with different numbers of
   neighbours.
   Source Code -:
   import mglearn
   from sklearn.datasets import fetch_california_housing
   import matplotlib.pyplot as plt
   housing = fetch_california_housing()
   print("Data shape: {}".format(housing.data.shape))
   X,y = mglearn.datasets.load_extended_boston()
   print("X.shape:{}".format(X.shape))
   #mglearn.plots.plot_knn_classification(n_neighbors=1)
   mglearn.plots.plot_knn_classification(n_neighbors=3)
   plt.show()
   Output -:
   Data shape: (20640, 8)
   X.shape:(506, 104)
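The mglearn plot illustrates the rule that KNN classification applies: find the n_neighbors training points closest to the query and take a majority vote over their classes. The sketch below writes that rule out in plain Python. The toy data and the name knn_predict are hypothetical and are not taken from the datasets used above.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, n_neighbors):
    """Classify query by majority vote among its n_neighbors nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:n_neighbors]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two features per point, two classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (3.0, 3.0)]
labels = [0, 0, 0, 1, 1, 1]
query = (2.6, 2.6)

for k in (1, 3):
    print(f"n_neighbors={k}: predicted class {knn_predict(points, labels, query, k)}")
```

With this toy data the single nearest neighbour belongs to class 1, while two of the three nearest neighbours belong to class 0, so the prediction changes with n_neighbors.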
5. Write a program to plot the training accuracy and test accuracy of KNN on the breast
   cancer dataset, and to plot KNN regression.
   Source Code -:
   from sklearn.datasets import load_breast_cancer
   from sklearn.model_selection import train_test_split
   from sklearn.neighbors import KNeighborsClassifier
   import matplotlib.pyplot as plt
   import mglearn
   cancer = load_breast_cancer()
   X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target,
   stratify=cancer.target, random_state=66)
   training_accuracy = []
   test_accuracy = []
   neighbors_settings = range(1, 11)
   for n_neighbors in neighbors_settings:
      clf = KNeighborsClassifier(n_neighbors=n_neighbors)
      clf.fit(X_train, y_train)
      training_accuracy.append(clf.score(X_train, y_train))
      test_accuracy.append(clf.score(X_test, y_test))
   plt.plot(neighbors_settings, training_accuracy, label="Training accuracy")
   plt.plot(neighbors_settings, test_accuracy, label="Test accuracy")
   plt.ylabel("Accuracy")
   plt.xlabel("n_neighbors")
   plt.legend()
   mglearn.plots.plot_knn_regression(n_neighbors=1)
   mglearn.plots.plot_knn_regression(n_neighbors=3)
   plt.show()
   Output -:
6. Write a program to implement KNN Regressor and also plot it.
   Source Code -:
   from sklearn.neighbors import KNeighborsRegressor
   from sklearn.model_selection import train_test_split
   import matplotlib.pyplot as plt
   import numpy as np
   import mglearn
   X, y = mglearn.datasets.make_wave(n_samples=40)
   X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
   plt.plot(X_train, y_train, 'o')
   plt.plot(X_test, y_test, '+')
   plt.ylim(-3, 3)
   plt.xlabel("Feature")
   plt.ylabel("Target")
   reg = KNeighborsRegressor(n_neighbors=3)
   reg.fit(X_train, y_train)
   print("Test Set Predictions:\n", reg.predict(X_test))
   print("Test Set R^2: {:.2f}".format(reg.score(X_test, y_test)))
   plt.show()
   Output -:
   Test Set Predictions:
    [-0.05396539 0.35686046 1.13671923 -1.89415682 -1.13881398 -1.63113382
     0.35686046 0.91241374 -0.44680446 -1.13881398]
   Test Set R^2: 0.83
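For reference, the two quantities printed above can be written out by hand. A KNN regressor predicts the mean target of the nearest training points, and R^2 is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses hypothetical one-feature toy data and hypothetical function names; it does not reproduce the wave dataset results.

```python
def knn_regress(train_x, train_y, query, n_neighbors):
    """Predict the mean target of the n_neighbors training points closest to query."""
    by_distance = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))
    nearest = by_distance[:n_neighbors]
    return sum(target for _, target in nearest) / len(nearest)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical toy data (targets follow y = x squared).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]
test_x = [0.9, 3.1]
test_y = [0.81, 9.61]

predictions = [knn_regress(train_x, train_y, x, n_neighbors=3) for x in test_x]
print("Test Set Predictions:", predictions)
print("Test Set R^2: {:.2f}".format(r_squared(test_y, predictions)))
```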
7. Write a program to implement a KNN Regressor, plot it, and report its train and test
   scores.
   Source Code -:
   from sklearn.neighbors import KNeighborsRegressor
   from sklearn.model_selection import train_test_split
   import matplotlib.pyplot as plt
   import numpy as np
   import mglearn
   X, y = mglearn.datasets.make_wave(n_samples=40)
   X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
   plt.plot(X_train, y_train, 'o')
   plt.plot(X_test, y_test, '+')
   plt.ylim(-3, 3)
   plt.xlabel("Feature")
   plt.ylabel("Target")
   reg = KNeighborsRegressor(n_neighbors=3)
   reg.fit(X_train, y_train)
   print("Test Set Predictions:\n", reg.predict(X_test))
   print("Test Set R^2: {:.2f}".format(reg.score(X_test, y_test)))
   fig, axes = plt.subplots(1, 3, figsize=(15, 4))
   line = np.linspace(-3, 3, 1000).reshape(-1, 1)
   for n_neighbors, ax in zip([1, 3, 9], axes):
      reg = KNeighborsRegressor(n_neighbors=n_neighbors)
      reg.fit(X_train, y_train)
      ax.plot(line, reg.predict(line))
      ax.plot(X_train, y_train, 'o',c=mglearn.cm2(0), markersize=8)
      ax.plot(X_test, y_test, '+',c=mglearn.cm2(1), markersize=8)
      ax.set_title("n_neighbors = {}\nTrain score: {:.2f}\nTest score: {:.2f}".format(
          n_neighbors, reg.score(X_train, y_train), reg.score(X_test, y_test)))
      ax.set_ylabel("Target")
   axes[0].legend(["Model prediction","Training data/target","Test data/target"],loc="best")
   plt.show()
   Output -:
   Test Set Predictions:
    [-0.05396539 0.35686046 1.13671923 -1.89415682 -1.13881398 -1.63113382
     0.35686046 0.91241374 -0.44680446 -1.13881398]
   Test Set R^2: 0.83
                           ASSIGNMENT - 05
1. Write a program using Gaussian NB to find the accuracy score, confusion matrix, actual
   value, predicted value and F1 score.
   Source Code -:
   from sklearn.datasets import make_classification
   from sklearn.model_selection import train_test_split
   from sklearn.naive_bayes import GaussianNB
   from sklearn.metrics import (accuracy_score, confusion_matrix, ConfusionMatrixDisplay,
                                f1_score)
   import matplotlib.pyplot as plt
   X, y = make_classification(
      n_features=6,
      n_classes=3,
      n_samples=800,
      n_informative=2,
      random_state=1,
      n_clusters_per_class=1
   )
   plt.scatter(X[:, 0], X[:, 1], c=y, marker="*")
   X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.33, random_state=125
   )
   model = GaussianNB()
   model.fit(X_train, y_train)
   predicted = model.predict([X_test[6]])
   print("Actual Value:", y_test[6])
   print("Predicted Value:", predicted[0])
   y_pred = model.predict(X_test)
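   # scikit-learn metrics take (y_true, y_pred). Accuracy is symmetric in its arguments,
   # but weighted F1 is not, so swapping the order below may change the F1 value.
   accuracy = accuracy_score(y_pred, y_test)
   f1 = f1_score(y_pred, y_test, average="weighted")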
   print("Accuracy:", accuracy)
   print("F1 Score:", f1)
   labels = [0, 1, 2]
   cm = confusion_matrix(y_test, y_pred, labels=labels)
   disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
   disp.plot()
   plt.show()
   Output -:
   Actual Value: 0
   Predicted Value: 0
   Accuracy: 0.8484848484848485
   F1 Score: 0.8491119695890328
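To make the reported metrics concrete, the sketch below computes a confusion matrix, accuracy and support-weighted F1 by hand. The function names and the small label lists are hypothetical; they are not the values produced by the model above.

```python
def confusion_counts(y_true, y_pred, labels):
    """Entry [i][j] counts samples whose true class is labels[i] and predicted class is labels[j]."""
    index = {label: position for position, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true, pred in zip(y_true, y_pred):
        matrix[index[true]][index[pred]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def weighted_f1(matrix):
    """Per-class F1, averaged with weights equal to each class's true-sample count."""
    total = sum(sum(row) for row in matrix)
    score = 0.0
    for i, row in enumerate(matrix):
        true_positive = row[i]
        support = sum(row)                     # samples whose true class is i
        predicted = sum(r[i] for r in matrix)  # samples predicted as class i
        denominator = support + predicted      # equals 2*TP + FP + FN
        f1 = 2 * true_positive / denominator if denominator else 0.0
        score += f1 * support / total
    return score

# Hypothetical labels for illustration.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_counts(y_true, y_pred, labels=[0, 1, 2])
print("Confusion matrix:", cm)
print("Accuracy:", accuracy(cm))
print("Weighted F1:", weighted_f1(cm))
```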
                           ASSIGNMENT - 06
1. Write a program to implement SVM on a dataset and also plot it.
   Source Code -:
   from sklearn.datasets import make_blobs
   import mglearn
   import matplotlib.pyplot as plt
   from sklearn.svm import LinearSVC
   import numpy as np
   X, y = make_blobs(centers=4, random_state=8)
   y = y % 2
   mglearn.discrete_scatter(X[:, 0], X[:, 1], y)
   plt.xlabel("Feature 0")
   plt.ylabel("Feature 1")
   linear_svm = LinearSVC().fit(X, y)
   mglearn.plots.plot_2d_separator(linear_svm, X)
   X_new = np.hstack([X, X[:, 1:]**2])
   figure = plt.figure()
   ax = figure.add_subplot(projection='3d', elev=-152, azim=-26)
   mask = y == 0
   ax.scatter(X_new[mask, 0], X_new[mask, 1], X_new[mask, 2], c='b', label='Class 0', s=60)
   ax.scatter(X_new[~mask, 0], X_new[~mask, 1], X_new[~mask, 2], c='r',
              marker='^', label='Class 1', s=60)
   ax.set_xlabel("Feature 0")
   ax.set_ylabel("Feature 1")
   ax.set_zlabel("Feature 1 **2")
   ax.legend()
   plt.show()
   Output -:
                           ASSIGNMENT - 07
1. Write a program to implement KNN Classifier by taking any dataset.
   Source Code -:
   from sklearn.neighbors import KNeighborsClassifier
   from sklearn.model_selection import train_test_split
   import numpy as np
   import pandas as pd
   url = "https://raw.githubusercontent.com/plotly/datasets/master/timeseries.csv"
   dataset = pd.read_csv(url)
   print(dataset.head())
   X = dataset.drop(columns=['Date', 'G'])
   Y = dataset['G']
   Y_binned = pd.cut(Y, bins=3, labels=[0, 1, 2])
   X_train, X_test, Y_train, Y_test = train_test_split(X, Y_binned, random_state=0)
   knn = KNeighborsClassifier(n_neighbors=1)
   knn.fit(X_train, Y_train)
   X_new = pd.DataFrame([[5, 2, 3, 1, 4, 6]], columns=X.columns)
   prediction = knn.predict(X_new)
   print(f"Prediction: {prediction}")
   y_pred = knn.predict(X_test)
   print(f"Test set predictions: \n{y_pred}")
   print(f"Test set score: {knn.score(X_test, Y_test):.2f}")
   Output -:
            Date      A       B       C      D      E      F      G
   0  2008-03-18  24.68  164.93  114.73  26.27  19.21  28.87  63.44
   1  2008-03-19  24.18  164.89  114.75  26.22  19.07  27.76  59.98
   2  2008-03-20  23.99  164.63  115.04  25.78  19.01  27.04  59.61
   3  2008-03-25  24.14  163.92  114.85  27.41  19.61  27.84  59.41
   4  2008-03-26  24.44  163.45  114.84  26.86  19.53  28.02  60.09
   Prediction: [1]
   Test set predictions:
   [1 0 1]
   Test set score: 1.00
                            ASSIGNMENT - 08
1. Write a program to implement SVM on a dataset and also plot it in 3D with the hyperplane.
   Source Code -:
   from sklearn.datasets import make_blobs
   import mglearn
   import matplotlib.pyplot as plt
   from sklearn.svm import LinearSVC
   import numpy as np
   X, y = make_blobs(centers=4, random_state=8)
   y=y%2
   mglearn.discrete_scatter(X[:, 0], X[:, 1], y)
   plt.xlabel("Feature 0")
   plt.ylabel("Feature 1")
   linear_svm = LinearSVC().fit(X, y)
   mglearn.plots.plot_2d_separator(linear_svm, X)
   X_new = np.hstack([X, X[:,1:]**2])
   figure = plt.figure()
   ax = figure.add_subplot(projection='3d', elev=-152, azim=-26)
   mask = y ==0
   ax.scatter(X_new[mask, 0], X_new[mask, 1], X_new[mask, 2], c='b',label='Class 0' , s=60)
   ax.scatter(X_new[~mask, 0], X_new[~mask, 1], X_new[~mask, 2], c='r',
   marker='^',label='Class 1' , s=60)
   print(mask)
   print(~mask)
   ax.set_xlabel("Feature 0")
   ax.set_ylabel("Feature 1")
   ax.set_zlabel("Feature 1 **2")
   linear_svm_3d = LinearSVC().fit(X_new, y)
   coef, intercept = linear_svm_3d.coef_.ravel(), linear_svm_3d.intercept_
   figure = plt.figure()
   ax = figure.add_subplot(projection='3d', elev=-152, azim=-26)
   xx = np.linspace(X_new[:, 0].min()-2, X_new[:, 0].max() +2, 50)
   yy = np.linspace(X_new[:, 1].min()-2, X_new[:, 1].max() +2, 50)
   XX, YY = np.meshgrid(xx, yy)
   ZZ = (coef[0] * XX + coef[1] * YY + intercept) / -coef[2]
   ax.plot_surface(XX, YY, ZZ, rstride=8, cstride=8, alpha=0.3)
   ax.scatter(X_new[mask, 0], X_new[mask, 1], X_new[mask, 2], c='b',label='Class 0' , s=60)
   ax.scatter(X_new[~mask, 0], X_new[~mask, 1], X_new[~mask, 2], c='r',
   marker='^',label='Class 1' , s=60)
   ax.set_xlabel("Feature 0")
   ax.set_ylabel("Feature 1")
   ax.set_zlabel("Feature 1 **2")
   ax.legend()
   plt.show()
   Output -:
[False True False False False True True False False False True True
 False False False True True False False True True False True True
 False True True False True False False False False False True True
 False True False True False False True True True True False True
 False False False True True True False True True True True False
 False True False True True False False False True True False True
  True False True True False True False False True True False True
 False True True True True False False True False True True False
 False False False False]
[ True False True True True False False True True True False False
  True True True False False True True False False True False False
  True False False True False True True True True True False False
  True False True False True True False False False False True False
  True True True False False False True False False False False True
  True False True False False True True True False False True False
 False True False False True False True True False False True False
  True False False False False True True False True False False True
  True True True True]
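The line that computes ZZ in the program above comes from solving the plane equation coef[0]*x + coef[1]*y + coef[2]*z + intercept = 0 for z. The sketch below isolates that step; plane_height is a hypothetical name, and the weights are made-up values rather than the ones LinearSVC learns.

```python
def plane_height(coef, intercept, x, y):
    """Solve coef[0]*x + coef[1]*y + coef[2]*z + intercept = 0 for z."""
    if coef[2] == 0:
        raise ValueError("the plane is vertical, so z is not a function of x and y")
    return -(coef[0] * x + coef[1] * y + intercept) / coef[2]

# Hypothetical weights and bias, chosen only for illustration.
coef = [1.0, -2.0, 4.0]
intercept = 2.0
z = plane_height(coef, intercept, x=2.0, y=1.0)

# Substituting z back into the plane equation should give zero.
residual = coef[0] * 2.0 + coef[1] * 1.0 + coef[2] * z + intercept
print("z =", z, "residual =", residual)
```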
                            ASSIGNMENT - 09
1. Write a program to implement Decision Tree Classifier in python.
   Source Code -:
   from sklearn.model_selection import train_test_split
   import matplotlib.pyplot as plt
   from sklearn.datasets import load_breast_cancer
   from sklearn.tree import DecisionTreeClassifier, plot_tree
   cancer = load_breast_cancer()
   X_train, X_test, Y_train, Y_test = train_test_split(
       cancer.data, cancer.target, stratify=cancer.target, random_state=42)
   # Unpruned tree, kept for comparison (disabled):
   '''
   tree=DecisionTreeClassifier(random_state=0)
   tree.fit(X_train,Y_train)
   print("Accuracy on training set: {:.3f}".format(tree.score(X_train,Y_train)))
   print("Accuracy on test set: {:.3f}".format(tree.score(X_test,Y_test)))
   '''
   # Pre-pruned tree: limit the depth to 3.
   tree=DecisionTreeClassifier(max_depth=3,random_state=0)
   tree.fit(X_train,Y_train)
   print("Accuracy on training set: {:.3f}".format(tree.score(X_train,Y_train)))
   print("Accuracy on test set: {:.3f}".format(tree.score(X_test,Y_test)))
   from sklearn.tree import export_graphviz
   export_graphviz(tree, out_file="tree.dot", class_names=["malignant", "benign"],
                   feature_names=cancer.feature_names, impurity=False, filled=True)
   import graphviz
   with open("tree.dot") as f:
       dot_graph = f.read()
   print(dot_graph)
   plt.figure(figsize=(12, 8))
   plot_tree(tree,
           filled=True,
           feature_names=cancer.feature_names,
           class_names=["malignant", "benign"],
           rounded=True,
           fontsize=10)
   plt.title("Decision Tree for Breast Cancer Classification")
   plt.show()
   Output -:
   Accuracy on training set: 0.977
   Accuracy on test set: 0.944
   digraph Tree {
   node [shape=box, style="filled", color="black", fontname="helvetica"] ;
   edge [fontname="helvetica"] ;
   0 [label="worst radius <= 16.795\nsamples = 426\nvalue = [159, 267]\nclass = benign",
   fillcolor="#afd7f4"] ;
   1 [label="worst concave points <= 0.136\nsamples = 284\nvalue = [25, 259]\nclass =
   benign", fillcolor="#4ca6e8"] ;
   0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
   2 [label="radius error <= 1.048\nsamples = 252\nvalue = [4, 248]\nclass = benign",
   fillcolor="#3c9fe5"] ;
   1 -> 2 ;
   3 [label="samples = 251\nvalue = [3, 248]\nclass = benign", fillcolor="#3b9ee5"] ;
   2 -> 3 ;
   4 [label="samples = 1\nvalue = [1, 0]\nclass = malignant", fillcolor="#e58139"] ;
   2 -> 4 ;
   5 [label="worst texture <= 25.62\nsamples = 32\nvalue = [21, 11]\nclass = malignant",
   fillcolor="#f3c3a1"] ;
1 -> 5 ;
6 [label="samples = 12\nvalue = [3, 9]\nclass = benign", fillcolor="#7bbeee"] ;
5 -> 6 ;
7 [label="samples = 20\nvalue = [18, 2]\nclass = malignant", fillcolor="#e88f4f"] ;
5 -> 7 ;
8 [label="texture error <= 0.473\nsamples = 142\nvalue = [134, 8]\nclass = malignant",
fillcolor="#e78945"] ;
0 -> 8 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
9 [label="samples = 5\nvalue = [0, 5]\nclass = benign", fillcolor="#399de5"] ;
8 -> 9 ;
10 [label="worst concavity <= 0.191\nsamples = 137\nvalue = [134, 3]\nclass =
malignant", fillcolor="#e6843d"] ;
8 -> 10 ;
11 [label="samples = 5\nvalue = [2, 3]\nclass = benign", fillcolor="#bddef6"] ;
10 -> 11 ;
12 [label="samples = 132\nvalue = [132, 0]\nclass = malignant", fillcolor="#e58139"] ;
10 -> 12 ; }
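The exported graph can be read as a set of nested if/else rules: at each internal node the "True" branch (condition satisfied) leads to the first child listed. The sketch below transcribes the tree printed above into such rules. The function name classify and the sample values are hypothetical, and the thresholds are the rounded values shown in the output, so results for samples very close to a threshold may differ from the fitted model.

```python
def classify(sample):
    """Follow the splits of the exported depth-3 tree and return the leaf's class."""
    if sample["worst radius"] <= 16.795:
        if sample["worst concave points"] <= 0.136:
            return "benign" if sample["radius error"] <= 1.048 else "malignant"
        return "benign" if sample["worst texture"] <= 25.62 else "malignant"
    if sample["texture error"] <= 0.473:
        return "benign"
    return "benign" if sample["worst concavity"] <= 0.191 else "malignant"

# Hypothetical samples; only the features used along each path are needed.
small_tumour = {"worst radius": 12.0, "worst concave points": 0.10, "radius error": 0.30}
large_tumour = {"worst radius": 20.0, "texture error": 1.0, "worst concavity": 0.50}
print(classify(small_tumour))
print(classify(large_tumour))
```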
                            ASSIGNMENT - 10
1. Write a program to implement K-Means clustering using Python. Given dataset is: A1(2,10);
   A2(2,5); A3(8,4); A4(5,8); A5(7,5); A6(6,4); A7(1,2); A8(4,9).
   Source code -:
   import numpy as np
   import matplotlib.pyplot as plt
   from sklearn.cluster import KMeans
   data = np.array([[2, 10], [2, 5], [8, 4], [5, 8], [7, 5], [6, 4], [1, 2], [4, 9]])
   k=3
   kmeans = KMeans(n_clusters=k, random_state=0)
   kmeans.fit(data)
   centroids = kmeans.cluster_centers_
   labels = kmeans.labels_
   plt.scatter(data[:, 0], data[:, 1], c=labels, cmap='rainbow', marker='o',
               label='Data points')
   plt.scatter(centroids[:, 0], centroids[:, 1], s=200, c='black', marker='X', label='Centroids')
   plt.title(f"K-means Clustering with k={k}")
   plt.xlabel('X Coordinate')
   plt.ylabel('Y Coordinate')
   plt.legend()
   plt.show()
   print("Cluster Centroids:\n", centroids)
   print("Cluster Labels:\n", labels)
   Output -:
   Cluster Centroids:
   [[7.      4.33333333]
   [3.66666667 9.       ]
   [1.5      3.5     ]]
   Cluster Labels:
   [1 2 0 1 0 0 2 1]
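For comparison with the scikit-learn result, the sketch below runs the basic K-Means loop (assign each point to its nearest centroid, then recompute the centroids) by hand on the same eight points. The choice of A1, A4 and A7 as starting centroids is an assumption made here and is not what KMeans uses by default; the function name k_means is hypothetical. Cluster numbers are arbitrary, so the labels may be numbered differently from the output above even when the grouping is the same.

```python
def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

data = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
centroids, labels = k_means(data, [data[0], data[3], data[6]])  # A1, A4, A7 (assumed start)
print("Cluster Centroids:", centroids)
print("Cluster Labels:", labels)
```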
                         ASSIGNMENT - 11
1. Write a program to implement Ward’s Algorithm without using linkage and dendrogram.
   Given dataset in (x,y): 1(4,4); 2(8,4); 3(15,8); 4(24,4); 5(24,12).
   Source Code -:
import numpy as np
import matplotlib.pyplot as plt
def get_data_points():
   # Read n points from the user, one "x y" pair per line.
   n = int(input("Enter the number of data points: "))
   data = []
   print("Enter the coordinates of each data point (x y):")
   for _ in range(n):
       x, y = map(float, input().split())
       data.append((x, y))
   return np.array(data)
data = get_data_points()
def euclidean_distance(a, b):
   return np.sqrt(np.sum((a - b) ** 2))
n = len(data)
distance_matrix = np.zeros((n, n))
for i in range(n):
   for j in range(i + 1, n):
       distance_matrix[i, j] = euclidean_distance(data[i], data[j])
       distance_matrix[j, i] = distance_matrix[i, j]
clusters = [[i] for i in range(n)]
positions = np.arange(n)
def ward_distance(c1, c2):
   combined_cluster = np.vstack((data[c1], data[c2]))
   mean_combined = np.mean(combined_cluster, axis=0)
   variance = np.sum((combined_cluster - mean_combined) ** 2)
   return variance
merge_history = []
heights = []
while len(clusters) > 1:
   min_distance = float('inf')
   clusters_to_merge = (None, None)
   for i in range(len(clusters)):
       for j in range(i + 1, len(clusters)):
           dist = ward_distance(clusters[i], clusters[j])
           if dist < min_distance:
               min_distance = dist
               clusters_to_merge = (i, j)
               points_i = [f"({data[p][0]}, {data[p][1]})" for p in clusters[i]]
               points_j = [f"({data[p][0]}, {data[p][1]})" for p in clusters[j]]
               print(f"Distance between clusters {points_i} and {points_j}: {dist}")
   i, j = clusters_to_merge
   new_cluster = clusters[i] + clusters[j]
   clusters = [clusters[k] for k in range(len(clusters)) if k not in (i, j)]
   clusters.append(new_cluster)
   new_position = (positions[i] + positions[j]) / 2
   positions = np.delete(positions, [i, j])
   positions = np.append(positions, new_position)
   merge_history.append((i, j))
   heights.append(min_distance)
def plot_dendrogram(merge_history, heights):
   plt.figure(figsize=(12, 6))
   current_positions = np.arange(n)
   colors = plt.cm.viridis(np.linspace(0, 1, len(merge_history)))
   for idx, (merge, height) in enumerate(zip(merge_history, heights)):
       i, j = merge
       plt.plot([current_positions[i], current_positions[i]], [0, height], color=colors[idx])
       plt.plot([current_positions[j], current_positions[j]], [0, height], color=colors[idx])
       plt.plot([current_positions[i], current_positions[j]], [height, height], color=colors[idx])
       new_position = (current_positions[i] + current_positions[j]) / 2
       current_positions = np.delete(current_positions, [i, j])
       current_positions = np.append(current_positions, new_position)
   for idx, pos in enumerate(np.arange(n)):
      plt.text(pos, -0.5, f'({data[idx][0]}, {data[idx][1]})',
             ha='center', va='top', fontsize=12, color='red')
   plt.title("Dendrogram ")
   plt.xlabel("Data Points")
   plt.ylabel("Distance")
   plt.grid(True, linestyle='--', alpha=0.7)
   plt.tight_layout()
   plt.show()
plot_dendrogram(merge_history, heights)
 Output -:
Enter the number of data points: 5
Enter the coordinates of each data point (x y):
4 4
8 4
15 8
24 4
24 12
Distance between clusters ['(4.0, 4.0)'] and ['(8.0, 4.0)']: 8.0
Distance between clusters ['(15.0, 8.0)'] and ['(24.0, 4.0)']: 48.5
Distance between clusters ['(24.0, 4.0)'] and ['(24.0, 12.0)']: 32.0
Distance between clusters ['(15.0, 8.0)'] and ['(4.0, 4.0)', '(8.0, 4.0)']: 72.66666666666666
Distance between clusters ['(24.0, 4.0)', '(24.0, 12.0)'] and ['(15.0, 8.0)', '(4.0, 4.0)', '(8.0,
4.0)']: 383.2
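The values printed above are the sum of squared distances to the centroid (SSE) of each merged cluster, which is what ward_distance returns. Ward's criterion is commonly stated as the increase in total SSE caused by a merge; for two single points the two quantities coincide, but for larger clusters they differ. The sketch below compares them on the first three data points; the function names are hypothetical and this is an illustration, not a replacement for the program above.

```python
def centroid(points):
    """Mean position of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    cx, cy = centroid(points)
    return sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in points)

def ward_increase(a, b):
    """Increase in total within-cluster SSE caused by merging clusters a and b."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return len(a) * len(b) / (len(a) + len(b)) * ((ax - bx) ** 2 + (ay - by) ** 2)

pair = [(4.0, 4.0), (8.0, 4.0)]
single = [(15.0, 8.0)]
merged_sse = sse(pair + single)          # the kind of value the program above prints
increase = ward_increase(pair, single)   # merged SSE minus the SSE of the two parts
print("Merged SSE:", merged_sse)
print("Increase in SSE:", increase)
```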