Ganpat University
Machine Learning (2ICT703)
Assignment-2
Faculty: Prof. Urmila Patel
Date: 01/10/2024
Date of Submission: 10/10/2024
Q-1      A study was conducted at Virginia Tech to determine if certain static
         arm-strength measures have an influence on the “dynamic lift”
         characteristics of an individual. Twenty-five individuals were subjected to
         strength tests and then were asked to perform a weightlifting test in which
         weight was dynamically lifted overhead. The data are given here.
                Individual        Arm Strength, X    Dynamic Lift, Y
                1                 17.3               71.7
                2                 19.3               48.3
                3                 19.5               88.3
                4                 19.7               75.0
                5                 22.9               91.7
                6                 23.1               100.0
                7                 26.4               73.3
                8                 26.8               65.0
                9                 27.6               75.0
                10                28.1               88.3
                11                28.2               68.3
                12                28.7               96.7
                13                29.0               76.7
                14                29.6               78.3
                15                29.9               60.0
                16                29.9               71.7
                17                30.3               85.0
                18                31.3               85.0
                19                36.0               88.3
                20                39.5               100.0
                21                40.4               100.0
                22                44.3               100.0
                23                44.6               91.7
                24                50.4               100.0
                25                55.9               71.7
Where,
E(Y) = Estimated value of Y
β0 (or C) = Constant, the intercept on the Y axis
β1 (or M) = Slope
    (a) Estimate β0 (C) and β1 (M) for the linear regression E(Y) = β0 + β1*X, also written
        E(Y) = M*X + C.
    (b) Find the coefficient of determination, R².
    (c) Plot the data together with the fitted line E(Y).
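The least-squares estimates asked for in parts (a) and (b) can be sketched in plain Python. This is only a minimal illustration, assuming the usual closed-form formulas β1 = Sxy / Sxx, β0 = mean(Y) - β1 * mean(X) and R² = 1 - SSE / SST. The function name fit_line and the toy data are hypothetical and are not the assignment data.

```python
def fit_line(xs, ys):
    """Return (intercept, slope, r_squared) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sxx: spread of X about its mean; Sxy: joint variation of X and Y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # R^2 = 1 - SSE / SST (residual variation over total variation).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_mean) ** 2 for y in ys)
    r_squared = 1 - sse / sst
    return intercept, slope, r_squared


# Hypothetical toy data, chosen only to exercise the function.
toy_x = [1, 2, 3, 4]
toy_y = [2, 4, 5, 8]
b0, b1, r2 = fit_line(toy_x, toy_y)
print(f"slope = {b1:.3f}, R^2 = {r2:.3f}")
```

The same function can be called with the X and Y columns of any of the regression questions. The fitted line for a plot is then intercept + slope * x evaluated over the range of X.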
Q-2        The grades of a class of 9 students on a midterm report (x) and on the final
           examination (y) are as follows:
X        77        50       71         72      81        94         96      99       67
Y        82        66       78         34      47        85         99      99       68
Where,
E(Y) = Estimated value of Y
β0 (or C) = Constant, the intercept on the Y axis
β1 (or M) = Slope
    (a) Estimate β0 (C) and β1 (M) for the linear regression E(Y) = β0 + β1*X, also written
        E(Y) = M*X + C.
    (b) Find the coefficient of determination, R².
    (c) Plot the data together with the fitted line E(Y).
    (d) Estimate the final examination grade of a student who received a grade of 85 on the
        midterm report.
Q-3        A study was made by a retail merchant to determine the relation between
           weekly advertising expenditures and sales.
                                 Advertising Costs ($)    Sales ($)
                                        40                    385
                                        20                    400
                                        25                    395
                                        20                    365
                                        30                    475
                                        50                    440
                                        40                    490
                                        20                    420
                                        50                    560
                                        40                    525
                                        25                    480
                                        50                    510
Where,
E(Y) = Estimated value of Y
β0 (or C) = Constant, the intercept on the Y axis
β1 (or M) = Slope
   (a) Estimate β0 (C) and β1 (M) for the linear regression E(Y) = β0 + β1*X, also written
       E(Y) = M*X + C.
   (b) Find the coefficient of determination, R².
   (c) Plot the data together with the fitted line E(Y).
   (d) Estimate the weekly sales when advertising costs are $35.
Q-4        A study of the amount of rainfall and the quantity of air pollution removed
           produced the following data:
                         Daily Rainfall, X (0.01)    Particulate Removed, Y (μg/m³)
                                     4.3                             126
                                     4.5                             121
                                     5.9                             116
                                     5.6                             118
                                     6.1                             114
                                     5.2                             118
                                     3.8                             132
                                     2.1                             141
                                     7.5                             108
Where,
E(Y) = Estimated value of Y
β0 (or C) = Constant, the intercept on the Y axis
β1 (or M) = Slope
   (a) Estimate β0 (C) and β1 (M) for the linear regression E(Y) = β0 + β1*X, also written
       E(Y) = M*X + C.
   (b) Find the coefficient of determination, R².
   (c) Plot the data together with the fitted line E(Y).
Q-5        Consider the regression of mileage for certain automobiles, measured in
           miles per gallon (mpg) on their weight in pounds (wt.). The data are from
           Consumer Reports.
                              MODEL             WT         MPG
                              General Motor     4520       15
                              Tata              2065       29
                              Honda             2440       31
                              Hyundai           2290       28
                              Suzuki            3195       23
                              Isuzu             3480       21
                              Jeep              4090       15
                              Land-rover        4535       13
                              Lexus             3390       22
                              Ferrari           3930       18
Where,
E(Y) = Estimated value of Y
β0 (or C) = Constant, the intercept on the Y axis
β1 (or M) = Slope
    (a) Estimate β0 (C) and β1 (M) for the linear regression E(Y) = β0 + β1*X, also written
        E(Y) = M*X + C.
    (b) Find the coefficient of determination, R².
    (c) Plot the data together with the fitted line E(Y).
    (d) Estimate the mileage for a vehicle weighing 4000 pounds.
    (e) Suppose that Honda engineers claim that, on average, the Civic (or any other model
      weighing 2440 pounds) gets more than 30 mpg. Based on the results of the regression
      analysis, would you believe that claim? Why or why not?
Q-6        Consider the regression of chest size on weight for infants at birth. The data are as follows.
                                 Weight (kg)    Chest Size (cm)
                                    2.75           29.5
                                    2.15           26.3
                                    4.41           32.2
                                    5.52           36.5
                                    3.21           27.2
                                    4.32           27.7
                                    2.31           28.3
                                    4.3            30.3
                                    3.71           28.7
Where,
E(Y) = Estimated value of Y
β0 (or C) = Constant, the intercept on the Y axis
β1 (or M) = Slope
    (a) Estimate β0 (C) and β1 (M) for the linear regression E(Y) = β0 + β1*X, also written
        E(Y) = M*X + C.
    (b) Find the coefficient of determination, R².
    (c) Plot the data together with the fitted line E(Y).
    (d) What percentage of the variation in infant chest sizes is explained by the difference in
        weight?
Q-7 Apply a One-Hot Encoder to the column at the 2nd index, and then to the column at the
      3rd index, of the data set given below.
Day             Outlook         Wind            Play Tennis
1               Sunny           Weak            No
2               Sunny           Strong          No
3               Overcast        Weak            Yes
4               Rain            Weak            Yes
5               Rain            Weak            Yes
6               Rain            Strong          No
7               Overcast        Strong          Yes
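One-hot encoding replaces a categorical column with one 0/1 indicator column per distinct category. The following is a minimal pure-Python sketch. The function name one_hot_column and the toy table are hypothetical, the column index is assumed to be 0-based, and categories are ordered alphabetically as an illustrative choice.

```python
def one_hot_column(rows, col):
    """Replace column `col` (0-based) of every row with one 0/1 column per category."""
    categories = sorted({row[col] for row in rows})
    encoded = []
    for row in rows:
        indicators = [1 if row[col] == c else 0 for c in categories]
        # Keep the columns before and after `col`, splice the indicators in between.
        encoded.append(row[:col] + indicators + row[col + 1:])
    return categories, encoded


# Hypothetical toy table: id, colour, size.
toy_rows = [
    [1, "Red", "Small"],
    [2, "Green", "Large"],
    [3, "Red", "Large"],
]

# First pass: encode the colour column (index 1).
colour_categories, step1 = one_hot_column(toy_rows, 1)
# The colour column became two columns, so the size column moved from index 2 to index 3.
size_categories, step2 = one_hot_column(step1, 3)
print(step2)
```

When a second column is encoded after the first, its position shifts by the number of indicator columns added, minus the one column removed. Whether the assignment counts column indices from 0 or from 1 should be checked before applying this.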
Q-8 Apply a One-Hot Encoder to the column at the 2nd index, and then to the column at the
    5th index, of the data set given below.
Day     Outlook          Temperature       Humidity       Wind     Play Tennis
D1      Sunny            Hot               High           Weak     No
D2      Sunny            Hot               High           Strong   No
D3      Overcast         Hot               High           Weak     Yes
D4      Rain             Mild              High           Weak     Yes
D5      Rain             Cool              Normal         Weak     Yes
D6      Rain             Cool              Normal         Strong   No
Q-9 Apply a Label Encoder to the column at the 2nd index, and then apply it again to the
      column at the 2nd index, of the data set given below.
Day               Outlook         Wind              Play Tennis
1                 Sunny           Weak              No
2                 Sunny           Strong            No
3                 Overcast        Weak              Yes
4                 Rain            Weak              Yes
5                 Rain            Weak              Yes
6                 Rain            Strong            No
7                 Overcast        Strong            Yes
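Label encoding maps each distinct category to an integer code. A minimal sketch follows, assuming categories are numbered in alphabetical order (the same ordering scikit-learn's LabelEncoder uses for its classes). The function name label_encode and the toy column are hypothetical.

```python
def label_encode(values):
    """Return (classes, codes): the sorted distinct values and one integer code per value."""
    classes = sorted(set(values))
    code_of = {c: i for i, c in enumerate(classes)}
    return classes, [code_of[v] for v in values]


# Hypothetical toy column.
toy_column = ["Red", "Green", "Red", "Blue"]
classes, codes = label_encode(toy_column)
print(classes, codes)
```

Unlike one-hot encoding, the integer codes imply an ordering between categories, which may or may not be meaningful for the model being trained.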
Q-10 For the given 1-dimensional dataset X = {-5, 0, 23.0, 17.6, 9.23, 1.11}, normalize the
     dataset to the range [0, 1] using Min-Max Normalization.
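Min-Max Normalization rescales each value as (x - min) / (max - min), so the smallest value maps to 0 and the largest to 1. A minimal sketch, with a hypothetical function name and toy data rather than the dataset above:

```python
def min_max_normalize(values):
    """Rescale values linearly so that min(values) -> 0.0 and max(values) -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal, so the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy data.
toy = [2, 4, 6, 10]
scaled = min_max_normalize(toy)
print(scaled)
```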
Q-11 What is feature scaling? For the given 2-dimensional dataset, normalize the dataset
     using Min-Max Normalization [0, 1] and Standard Scaler (zero mean, unit variance).
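Standard scaling (z-score standardization) transforms each feature column as z = (x - mean) / std, which gives every column zero mean and unit variance. The sketch below assumes the population standard deviation, as scikit-learn's StandardScaler does. The function name standardize_columns and the toy rows are hypothetical.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Standardize each column of a 2-D list to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # a constant column would give std == 0
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy data: three samples, two features.
toy_rows = [[1, 10], [2, 20], [3, 30]]
z = standardize_columns(toy_rows)
print(z)
```

Standardized values are not confined to a fixed interval: in this toy example the extremes are about -1.22 and +1.22.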
Q-12 Explain Bias, Variance, Overfitting and Underfitting for machine learning models.
Q-13 Explain various classifier performance measures (accuracy, precision, recall,
     sensitivity, specificity).
Q-14 Describe the confusion matrix.
Q-15 Explain the following confusion matrix terms for a machine learning approach
     and find the value of each term:
    1. Accuracy 2. Precision 3. Recall 4. Specificity 5. F1 Score
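These five measures all follow from the four cells of a binary confusion matrix: true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). A minimal sketch, with a hypothetical function name and hypothetical counts that are not taken from this assignment:

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)      # of the predicted positives, how many are correct
    recall = tp / (tp + fn)         # also called sensitivity or true-positive rate
    specificity = tn / (tn + fp)    # true-negative rate
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

The sketch does not guard against zero denominators, for example when a classifier predicts no positives at all.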
           ----------------------------------------------------------------------------------------