Machine Learning Mid-Sem Exam Answers
Part-A
Q1. Fill in the blanks
In supervised learning, the algorithm learns from labeled data, while in unsupervised
learning, the algorithm learns from unlabeled data.
Q2. What is reinforcement learning?
Reinforcement Learning (RL) is a type of machine learning where an agent interacts with an
environment to achieve a goal. The agent takes actions, receives feedback in the form of
rewards or penalties, and learns to maximize its long-term rewards. Example: Training a
robot to walk by rewarding successful steps and penalizing falls.
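The action-feedback loop can be sketched in a few lines. This is a minimal illustration rather than a full RL algorithm: it assumes a single-state problem (a two-armed bandit), deterministic rewards, and hypothetical action names and reward values chosen to mirror the robot example.

```python
# Hypothetical actions and rewards mirroring the robot example:
# a careful step succeeds (+1), a lunge ends in a fall (-1).
REWARDS = {"lunge": -1.0, "small_step": 1.0}
ACTIONS = list(REWARDS)

def run_agent(steps=20, learning_rate=0.5, initial_estimate=5.0):
    """Greedy agent with optimistic initial estimates (an assumed setup)."""
    # Optimistic starting values make the agent try every action at least once.
    estimates = {action: initial_estimate for action in ACTIONS}
    total_reward = 0.0
    for _ in range(steps):
        # Act: pick the action with the highest estimated reward.
        action = max(ACTIONS, key=lambda a: estimates[a])
        # Receive feedback from the (here trivial) environment.
        reward = REWARDS[action]
        total_reward += reward
        # Learn: move the estimate toward the observed reward.
        estimates[action] += learning_rate * (reward - estimates[action])
    return estimates, total_reward

estimates, total_reward = run_agent()
best_action = max(estimates, key=estimates.get)
print(best_action, total_reward)
```

After a few penalized attempts the estimate for the falling action drops below the estimate for the successful one, so the agent settles on the rewarded behavior.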
Q3. How does supervised learning differ from unsupervised learning?
Feature      Supervised Learning           Unsupervised Learning
Data Type    Uses labeled data             Uses unlabeled data
Examples     Classification, Regression    Clustering, Dimensionality Reduction
Q4. What is overfitting in machine learning? Why does it happen?
Overfitting occurs when a model fits the noise and quirks of the training data rather than
the underlying pattern. It then scores well on the training data but performs poorly on new,
unseen data.
Reasons for Overfitting:
1. Too many features or parameters.
2. Insufficient training data.
3. The model is too complex for the dataset.
Solution: Apply regularization, simplify the model, or collect more data. Use
cross-validation to detect overfitting and to tune model complexity.
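A toy comparison makes the symptom concrete. The data, the two deliberately flipped training labels, and the hand-chosen threshold rule are all assumptions for illustration; the "memorizer" is a 1-nearest-neighbor lookup standing in for an overly flexible model.

```python
def accuracy(predict, xs, ys):
    """Fraction of points for which predict(x) equals the given label."""
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)

# True pattern: label is 1 when x >= 5. Labels at x=2 and x=7 are flipped (noise).
train_x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
train_y = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Unseen test points, labeled with the true (noise-free) pattern.
test_x = [0.4, 1.6, 2.2, 3.4, 4.4, 5.6, 6.6, 7.2, 8.4, 9.4]
test_y = [1 if x >= 5 else 0 for x in test_x]

def memorizer(x):
    """Overly flexible model: copy the label of the nearest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def threshold_rule(x):
    """Simple model: a single hand-chosen threshold (assumed, not learned)."""
    return 1 if x >= 5 else 0

memorizer_train = accuracy(memorizer, train_x, train_y)
memorizer_test = accuracy(memorizer, test_x, test_y)
simple_train = accuracy(threshold_rule, train_x, train_y)
simple_test = accuracy(threshold_rule, test_x, test_y)
print(memorizer_train, memorizer_test, simple_train, simple_test)
```

The memorizer scores perfectly on the training set because it reproduces the flipped labels, but those same labels mislead it on nearby test points. The simpler rule misses the noisy training points yet generalizes better.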
Part-B
Q5. Discuss how features impact the performance of machine learning models.
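Features are the input variables a model learns from, so their relevance and quality directly
limit the model's accuracy and efficiency. Irrelevant or redundant features add noise and
computation, while informative features make the underlying patterns easier to learn.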
Feature Engineering Techniques:
1. Feature Selection – Choose only the most relevant features.
2. Feature Extraction – Transform raw data into a meaningful format.
3. Feature Scaling – Bring features to a comparable range (e.g., by normalization) so that no
feature dominates because of its units.
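Feature scaling can be illustrated with two common schemes, min-max normalization and z-score standardization. This is a sketch under assumptions: the income values are made up, and both helpers assume the input is not constant (otherwise they would divide by zero).

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Made-up incomes: raw values in the tens of thousands.
incomes = [20000.0, 40000.0, 60000.0, 80000.0, 100000.0]

normalized = min_max_scale(incomes)
standardized = standardize(incomes)
print(normalized)
print(standardized)
```

Either transform puts a large-valued feature such as income on the same footing as a small-valued one such as age, which matters for distance-based and gradient-based learners.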
Q6. How does a regression problem differ from a classification problem?
Feature       Regression                    Classification
Definition    Predicts continuous values    Predicts discrete categories
Examples      House price prediction        Spam detection
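The difference in output type can be shown side by side. The sketch below uses made-up data and two simple stand-in methods (a least-squares line for regression, a nearest-class-mean rule for classification); the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (simple linear regression)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

def nearest_class_mean(examples, value):
    """Return the class whose mean feature value is closest to `value`."""
    means = {label: sum(vals) / len(vals) for label, vals in examples.items()}
    return min(means, key=lambda label: abs(means[label] - value))

# Regression: made-up house sizes (square metres) and prices (in thousands).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90.0 + intercept  # a continuous number

# Classification: made-up counts of suspicious words per email.
word_counts = {"spam": [5, 7, 9], "not spam": [0, 1, 2]}
predicted_class = nearest_class_mean(word_counts, 6)  # a discrete label

print(predicted_price, predicted_class)
```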
Q7. What are classification algorithms? Explain any one.
Classification algorithms are supervised learning methods that assign each input to one of a
set of predefined classes. Examples include Decision Trees, Naïve Bayes, and Neural Networks.
Example: Decision Tree
A Decision Tree is a tree-like structure in which each internal node tests a feature value,
each branch corresponds to an outcome of that test, and each leaf node assigns a class label.
Advantages: Easy to interpret, handles both numerical and categorical data.
Disadvantages: Prone to overfitting if too deep.
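How a decision tree makes a prediction can be sketched with a small hand-built tree. The tree below is not learned from data; its feature names, thresholds, and labels are hypothetical and chosen only to show the traversal from root to leaf.

```python
# Internal nodes test one feature against a threshold; leaves hold a class label.
tree = {
    "feature": "contains_link",
    "threshold": 0.5,
    "left": {"label": "not spam"},
    "right": {
        "feature": "num_exclamations",
        "threshold": 2.5,
        "left": {"label": "not spam"},
        "right": {"label": "spam"},
    },
}

def predict(node, sample):
    """Follow decision rules from the root until a leaf is reached."""
    while "label" not in node:
        # Go left when the feature value is at or below the threshold.
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

email_a = {"contains_link": 1, "num_exclamations": 4}
email_b = {"contains_link": 0, "num_exclamations": 4}
print(predict(tree, email_a), predict(tree, email_b))
```

Because each prediction is just a path of readable tests, the model is easy to interpret; adding many more levels lets the tree carve out rules for individual training points, which is where the overfitting risk comes from.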
Part-C
Q8. How does the k-Nearest Neighbors (k-NN) algorithm work? Describe its advantages and
disadvantages.
k-Nearest Neighbors (k-NN) is a simple, non-parametric algorithm used for classification
and regression.
Working of k-NN:
1. Choose the number of neighbors (k).
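2. Compute the distance from the new point to every training point using a distance measure
(e.g., Euclidean or Manhattan) and select the ‘k’ closest points.
3. For classification, assign the most common class among the ‘k’ neighbors; for regression,
use the average of their values.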
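The steps above can be sketched for the classification case as follows. This is a minimal illustration with Euclidean distance and made-up two-dimensional data; the function names are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 2: sort training points by distance to the query and keep the k closest.
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    # Step 3: majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (made up for illustration): two small clusters.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["A", "A", "A", "B", "B", "B"]

# Step 1: choose k, then classify two new points.
prediction_near_a = knn_predict(points, labels, (1.8, 1.6), k=3)
prediction_near_b = knn_predict(points, labels, (6.4, 6.4), k=3)
print(prediction_near_a, prediction_near_b)
```

Note that all the work happens inside `knn_predict`: nothing is fitted in advance, and every prediction scans the full training set.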
Advantages:
- Simple and effective.
- Works well with small datasets.
- No explicit training phase; the algorithm simply stores the training data (lazy learning).
Disadvantages:
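- Slow at prediction time for large datasets, since distances to all stored points must be
computed.
- Sensitive to irrelevant features and to differences in feature scale.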
- Requires proper selection of ‘k’ for best performance.
Q9. Explain supervised and unsupervised learning in detail with examples and applications.
Supervised Learning:
- Uses labeled data to predict outcomes.
- Example: Spam detection – training data contains emails labeled as spam or not spam.
- Applications: Speech recognition, medical diagnosis, fraud detection.
Unsupervised Learning:
- Works with unlabeled data and finds hidden patterns.
- Example: Customer segmentation – grouping customers based on purchasing behavior.
- Applications: Anomaly detection, recommendation systems, market analysis.
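Customer segmentation can be sketched with k-means clustering, one common unsupervised method (chosen here as an assumption; the answer above does not name a specific algorithm). The spending figures and starting centroids are made up, and the data is one-dimensional to keep the example short.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Basic k-means on one-dimensional data with given starting centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up monthly spending per customer; no labels are provided.
spending = [10.0, 12.0, 11.0, 95.0, 100.0, 98.0]

centroids, clusters = kmeans_1d(spending, centroids=[10.0, 100.0])
print(centroids)
print(clusters)
```

The algorithm is never told which customers are "low" or "high" spenders; the two groups emerge from the structure of the data alone.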
OR
Q9. What do you mean by data pre-processing? Write down its advantages and
disadvantages.
Data Preprocessing is the process of cleaning, transforming, and organizing raw data to
improve model accuracy and performance.
Steps in Data Preprocessing:
1. Data Cleaning – Handle missing or incorrect data.
2. Feature Scaling – Normalize or standardize data.
3. Feature Selection – Remove irrelevant features.
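The cleaning and selection steps can be sketched on a small table. The rows, the mean-imputation strategy, and the rule of dropping constant columns are assumptions chosen for illustration; the scaling step is left out of this sketch.

```python
import statistics

# Made-up rows: [age, income, constant flag]; None marks a missing value.
raw_rows = [
    [25.0, 50000.0, 1.0],
    [None, 62000.0, 1.0],
    [41.0, None, 1.0],
    [35.0, 58000.0, 1.0],
]

def impute_with_column_mean(rows):
    """Data cleaning: replace each missing value with its column's mean."""
    columns = list(zip(*rows))
    means = [
        statistics.mean([v for v in column if v is not None]) for column in columns
    ]
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]

def drop_constant_columns(rows):
    """Feature selection: remove columns whose value never changes."""
    columns = list(zip(*rows))
    keep = [j for j, column in enumerate(columns) if len(set(column)) > 1]
    return [[row[j] for j in keep] for row in rows]

cleaned = impute_with_column_mean(raw_rows)
selected = drop_constant_columns(cleaned)
for row in selected:
    print(row)
```

The constant third column carries no information for distinguishing rows, so removing it loses nothing; dropping a column that does vary is where the risk of losing useful information arises.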
Advantages:
- Improves model accuracy.
- Reduces training time.
- Enhances generalization.
Disadvantages:
- Time-consuming.
- May cause loss of useful information.
- Requires domain expertise for proper feature selection.