Decision Tree Questions

This document contains a series of questions on decision trees, information gain, error rates, and model performance metrics. It includes calculations for entropy, the Gini index, and performance drops in various scenarios, and explores the application of decision-tree algorithms such as ID3 and VFDT in different contexts.


Q1. Consider the following data, where the Y label indicates whether or not the child goes out to play.

Q2. The following table gives the decision-making factors for playing tennis outside over the previous 14 days (use the ID3 algorithm).
Q3. A dataset has the following class distributions before and after a split:

Before Split

• Class 1: 10 samples

• Class 2: 10 samples

After Split

Left Node:

• Class 1: 8, Class 2: 2

Right Node:

• Class 1: 2, Class 2: 8
Calculate the information gain if the entropy before the split is 1.
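
A minimal Python sketch of this calculation (the `entropy` helper and variable names are illustrative, not part of the question; the same helper also covers the entropy questions later on):

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Class counts from the question: 10/10 before, (8, 2) and (2, 8) after.
before = entropy([10, 10])                 # 1.0, as stated in the question
left, right = entropy([8, 2]), entropy([2, 8])
weighted = (10 / 20) * left + (10 / 20) * right
print(f"IG = {before - weighted:.4f}")     # ≈ 1 - 0.7219 = 0.2781
```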

Q4. A bank uses a decision tree with the following rules:

1. If credit score ≥ 700 → Approve

2. If income ≥ $50,000 & credit score < 700 → Approve

3. Otherwise → Reject

If a customer has:

• Credit Score: 680

• Income: $55,000

Will they be approved?
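
A minimal sketch encoding the three rules in order (the function name `bank_decision` is illustrative):

```python
def bank_decision(credit_score, income):
    """Apply the three rules from Q4 in order."""
    if credit_score >= 700:
        return "Approve"                      # rule 1
    if income >= 50_000 and credit_score < 700:
        return "Approve"                      # rule 2
    return "Reject"                           # rule 3

print(bank_decision(credit_score=680, income=55_000))  # "Approve", via rule 2
```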

Q5. A VFDT model starts with an error rate of 12%, but after training on 500,000 instances,
the error drops to 7%.
Compute the percentage reduction in error.
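
Reading "percentage reduction" as the relative drop from the initial error rate (an assumption; the other reading is an absolute drop of 5 percentage points), a quick check:

```python
old_err, new_err = 0.12, 0.07
print(f"{(old_err - new_err) / old_err:.1%} relative reduction")  # ≈ 41.7%
```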

Q6. A standard decision tree takes 3 seconds per 1,000 instances, while VFDT processes
10,000 instances per second.
How much faster is VFDT?
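
Converting both models to instances per second makes the comparison direct; a quick check:

```python
standard_rate = 1_000 / 3   # ≈ 333 instances per second
vfdt_rate = 10_000          # instances per second
print(f"VFDT is {vfdt_rate / standard_rate:.0f}x faster")  # 30x
```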

Q7. An exhaustive search model evaluates 2^50 feature sets, while a heuristic model evaluates only 1 million sets.
What percentage of the total feature space does the heuristic model check?
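
Assuming the search space is 2^50 feature sets, as reconstructed above from the original "250" (the superscript was likely lost in extraction), a quick check:

```python
total = 2 ** 50              # exhaustive feature-set space
checked = 1_000_000          # sets the heuristic evaluates
print(f"{100 * checked / total:.2e} % of the space")  # ≈ 8.88e-08 %
```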

Q8. Draw the decision tree for the given dataset.


Q9. A model has Class A accuracy: 90% and Class B accuracy: 80%.
After drift, accuracies drop to Class A: 78%, Class B: 72%.
Compute the overall drop in performance.
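
If "overall performance" is read as the macro-average of the two class accuracies (an assumption; the question gives no class proportions), a quick check:

```python
before = (0.90 + 0.80) / 2   # macro-average accuracy before drift = 0.85
after = (0.78 + 0.72) / 2    # macro-average accuracy after drift = 0.75
print(f"drop = {(before - after) * 100:.0f} percentage points")  # 10 points
```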

Q10. A model initially had an accuracy of 75%, which dropped to 60% due to concept drift.
After an adaptive method was applied, accuracy improved to 68%.
Calculate the percentage of the lost accuracy that was recovered.
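
A sketch of the recovery calculation, treating "lost accuracy recovered" as the regained fraction of the drift-induced drop:

```python
original, degraded, recovered = 0.75, 0.60, 0.68
lost = original - degraded        # 15 points lost to drift
regained = recovered - degraded   # 8 points regained
print(f"{regained / lost:.1%} of the lost accuracy recovered")  # ≈ 53.3%
```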

Q11. A real-time fraud detection model has the following accuracy over time:

• Week 1: 95%

• Week 2: 93%

• Week 3: 85%

• Week 4: 70%

Compute the overall percentage drop in accuracy.

Q12. A dataset has three classes with the following proportions:

• Class A: 40%

• Class B: 35%

• Class C: 25%

Compute the entropy of the dataset.
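
A minimal sketch of the entropy computation (a base-2 logarithm is assumed, matching the other questions):

```python
import math

p = [0.40, 0.35, 0.25]
H = -sum(pi * math.log2(pi) for pi in p)
print(f"H = {H:.4f} bits")  # ≈ 1.5589
```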

Q13. A dataset is split into two subsets:

• Subset 1: (Class A = 20, Class B = 10)

• Subset 2: (Class A = 5, Class B = 15)

Compute the Gini Index after the split.
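
A sketch of the weighted Gini computation (the `gini` helper is illustrative):

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

s1, s2 = [20, 10], [5, 15]
n1, n2 = sum(s1), sum(s2)
weighted = (n1 * gini(s1) + n2 * gini(s2)) / (n1 + n2)
print(f"Gini after split = {weighted:.4f}")  # (0.6)(0.4444) + (0.4)(0.375) ≈ 0.4167
```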

Q14. A batch decision tree takes O(n²) time for training, while VFDT takes O(n log n).
For n = 100,000, compute the ratio of their complexities.
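
A quick check of the ratio, assuming a base-2 logarithm (the base is not specified in the question); the ratio simplifies to n / log n:

```python
import math

n = 100_000
ratio = n**2 / (n * math.log2(n))   # = n / log2(n)
print(f"ratio ≈ {ratio:.0f}")       # ≈ 6021
```
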
Q15. A VFDT model has seen 100,000 instances, and two attributes have the following
observed information gains:

• IG1 = 0.05

• IG2 = 0.04

Given a confidence threshold of δ = 0.01, determine whether a split is made using the Hoeffding bound:

ϵ = √( ln(1/δ) / (2N) )
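
A minimal sketch of the check, using the simplified Hoeffding bound as written above (range R = 1 assumed); the same computation applies to Q18 with N = 250,000 and δ = 0.05:

```python
import math

N, delta = 100_000, 0.01
eps = math.sqrt(math.log(1 / delta) / (2 * N))  # Hoeffding bound with R = 1
delta_ig = 0.05 - 0.04                          # gap between the top two attributes
print(f"eps = {eps:.5f}")                       # ≈ 0.00480
print("split" if delta_ig > eps else "no split yet")  # 0.01 > 0.00480, so split
```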

Q16. A dataset has a binary attribute A with values A1 and A2. The probabilities of its values are:

P(A1) = 0.7, P(A2) = 0.3

Calculate the entropy of A.

Q17. A decision tree considers a split on Feature A and Feature B. The dataset entropy is
initially 0.94.
After splitting:

• Feature A Split: weighted entropy = 0.75

• Feature B Split: weighted entropy = 0.68

Compute the entropy reduction percentage for each feature and determine which split is preferred.
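
A quick check of both reductions:

```python
H0 = 0.94  # entropy before the split
for name, h in [("A", 0.75), ("B", 0.68)]:
    print(f"Feature {name}: {(H0 - h) / H0:.1%} entropy reduction")
# Feature A ≈ 20.2%, Feature B ≈ 27.7%, so the split on B is preferred.
```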

Q18. A VFDT model receives 250,000 instances, and the confidence threshold is δ = 0.05.
Compute the Hoeffding bound: ϵ = √( ln(1/δ) / (2N) )

Q19. A dataset contains 4 classes with the following distribution before a split:

• Class A: 50 instances

• Class B: 30 instances

• Class C: 20 instances
• Class D: 100 instances

Compute the entropy before split.

Q20. Construct a decision tree using the Gini index.
