Machine Learning (CS-401T)
Assignment-1
                                               Submission Deadline: 17/10/2024
1.   Define Logistic Regression. Explain how it differs from linear regression.
     Consider the following dataset representing whether a student will pass or fail an exam based on their study hours:
     Study Hours         Pass (1) / Fail (0)
     1                   0
     2                   0
     3                   0
     4                   1
     5                   1
     6                   1
     7                   1
     8                   1
     Using the dataset, perform the following:
     i) Fit a logistic regression model using a suitable library (e.g., Scikit-learn) and obtain the coefficients.
     ii) Calculate the predicted probabilities of passing the exam for each study hour.
     iii) Determine the classification outcome (Pass/Fail) using a threshold of 0.5.
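     As a starting point for parts i) to iii), the following is a minimal sketch using scikit-learn's
     LogisticRegression. It assumes the library defaults (L2 regularization with C=1.0), so the fitted
     coefficients are shrunk relative to an unregularized fit. The variable names are illustrative and are
     not prescribed by the assignment.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Dataset from the question: study hours and pass (1) / fail (0) labels.
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float).reshape(-1, 1)
passed = np.array([0, 0, 0, 1, 1, 1, 1, 1])

# i) Fit the model. Assumption: default settings (L2 penalty, C=1.0).
model = LogisticRegression()
model.fit(hours, passed)
slope = model.coef_[0][0]
intercept = model.intercept_[0]

# ii) Predicted probability of passing (class 1) for each study hour.
probabilities = model.predict_proba(hours)[:, 1]

# iii) Classify as Pass (1) when the probability reaches the 0.5 threshold.
labels = (probabilities >= 0.5).astype(int)

print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
for h, p, label in zip(hours.ravel(), probabilities, labels):
    print(f"hours = {h:.0f}  P(pass) = {p:.4f}  predicted = {label}")
```

     Because this dataset is perfectly separable, an unregularized fit would not converge to finite
     coefficients, which is one reason to keep the default penalty and to state it in the answer.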
2.   Discuss the structure of an Artificial Neural Network (ANN), explaining the roles of the input layer, hidden
     layers, output layer, and activation functions in processing information.
     Consider a 2-layer feed-forward neural network with two input neurons, one hidden layer consisting of two
     neurons, and one output neuron. The network uses the sigmoid activation function at both layers.
     The inputs, weights, and biases are as follows:
     Input vector: X=[0.6,0.9]
     Weights for the hidden layer: Wh1=[0.3,0.5],Wh2=[0.4,0.7]
     Weights for the output layer: Wo=[0.6,0.8]
     Biases for the hidden neurons: bh1=0, bh2=0.
     Bias for the output neuron: bo=0.4
     a)   Perform the feed-forward pass, showing all calculations for the activations of the hidden layer neurons and
          the output neuron. Use the Sigmoid activation function.
     b)   Briefly explain how the back-propagation algorithm works to adjust the weights and biases of the network.
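     The feed-forward pass in part a) can be checked with a short sketch like the one below. It assumes that
     Wh1 and Wh2 are the incoming weight vectors of hidden neurons 1 and 2 (one weight per input), and that
     Wo holds one weight per hidden neuron. The function names are illustrative.

```python
import math

def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Return (hidden activations, output activation) for one input vector."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        net = sum(w * xi for w, xi in zip(weights, x)) + bias  # weighted sum
        hidden.append(sigmoid(net))
    net_out = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return hidden, sigmoid(net_out)

# Values from the question.
x = [0.6, 0.9]
hidden_weights = [[0.3, 0.5], [0.4, 0.7]]  # assumed: rows are Wh1 and Wh2
hidden_biases = [0.0, 0.0]
output_weights = [0.6, 0.8]
output_bias = 0.4

hidden, output = forward(x, hidden_weights, hidden_biases, output_weights, output_bias)
print("hidden activations:", [round(h, 4) for h in hidden])
print("output activation:", round(output, 4))
```

     With this reading of the weights, the net inputs to the hidden neurons are 0.63 and 0.87, which is a
     useful check against the hand calculation.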
3.   What is the Gradient Descent algorithm? Describe its purpose in machine learning. Consider a simple linear
     regression model where the goal is to fit a line to a set of data points. The model can be written as:
     y = w0 + w1 x
     Given training data:
          x        y
          1.0      2.0
          2.0      2.8
          3.0      3.6
          4.0      4.4
     The initial weights are w0 = 0.5 and w1 = 0.5, and the learning rate is α = 0.01. The loss function is the Mean
     Squared Error (MSE):
     MSE = (1/n) Σ (y_i − (w0 + w1 x_i))², summed over the n training points.
     a)   Compute the gradient of the loss function with respect to w0 and w1.
     b)   Using Gradient Descent, update the weights w0 and w1 after one iteration. Show all calculations.
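     The hand calculations for parts a) and b) can be verified with a sketch along these lines. It assumes the
     MSE is normalized by 1/n (not 1/(2n)), which gives the gradients (-2/n) Σ (y_i − ŷ_i) for w0 and
     (-2/n) Σ (y_i − ŷ_i) x_i for w1. The function names are illustrative.

```python
# Training data and hyperparameters from the question.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 2.8, 3.6, 4.4]

def mse_gradients(w0, w1, xs, ys):
    """Gradients of MSE = (1/n) * sum((y - (w0 + w1*x))**2) w.r.t. w0 and w1."""
    n = len(xs)
    residuals = [y - (w0 + w1 * x) for x, y in zip(xs, ys)]
    grad_w0 = -2.0 / n * sum(residuals)
    grad_w1 = -2.0 / n * sum(r * x for r, x in zip(residuals, xs))
    return grad_w0, grad_w1

def gradient_descent_step(w0, w1, xs, ys, learning_rate):
    """One update: move each weight against its gradient."""
    grad_w0, grad_w1 = mse_gradients(w0, w1, xs, ys)
    return w0 - learning_rate * grad_w0, w1 - learning_rate * grad_w1

grad_w0, grad_w1 = mse_gradients(0.5, 0.5, xs, ys)
new_w0, new_w1 = gradient_descent_step(0.5, 0.5, xs, ys, learning_rate=0.01)
print(f"gradients: dL/dw0 = {grad_w0:.4f}, dL/dw1 = {grad_w1:.4f}")
print(f"updated weights: w0 = {new_w0:.4f}, w1 = {new_w1:.4f}")
```

     If the course defines the loss with a 1/(2n) factor instead, both gradients are halved and the update
     changes accordingly.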
4.   Define Decision Trees. Discuss their structure and how they make decisions based on input features.
     Consider the following dataset used for classifying whether a customer will buy a product based on their age,
     income, and marital status. The training data is as follows:
          Age      Income      Marital Status        Bought Product (Target)
          25       50000       Single                No
          30       60000       Single                Yes
          35       80000       Married               Yes
          40       90000       Married               Yes
          45       100000      Single                No
     a)   Compute the Gini Impurity for the entire dataset. Show all calculations.
     b)   Based on the provided data, calculate the Gini Impurity for a potential split on Marital Status (Single vs.
          Married) and determine if this is a good split. Show all calculations.
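     The Gini calculations in parts a) and b) can be cross-checked with a small sketch such as the following.
     It uses the standard definition Gini = 1 − Σ p_k², and scores a split by the size-weighted average of the
     child impurities. The helper names are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(groups):
    """Impurity of a split: child impurities weighted by child size."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * gini(group) for group in groups)

# Rows from the question as (marital status, bought product).
rows = [
    ("Single", "No"),
    ("Single", "Yes"),
    ("Married", "Yes"),
    ("Married", "Yes"),
    ("Single", "No"),
]

parent = gini([target for _, target in rows])
single = [target for status, target in rows if status == "Single"]
married = [target for status, target in rows if status == "Married"]
split = weighted_gini([single, married])

print(f"Gini (whole dataset) = {parent:.4f}")
print(f"Gini (Single) = {gini(single):.4f}, Gini (Married) = {gini(married):.4f}")
print(f"Weighted Gini after split = {split:.4f}, reduction = {parent - split:.4f}")
```

     A split is usually judged useful when its weighted impurity is lower than the impurity of the parent
     node; the size of the reduction is what part b) asks you to interpret.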