
Weka Attribute Selection Guide

The document discusses attribute selection and ranking using information gain on a dataset with 5342 instances and 11 attributes. It shows that the top 5 ranked attributes by information gain are pgift, rfa_2a, rfa_2f=1, rfa_2f=4, and pepstrfl. Using best first search with a CFS subset evaluator, the selected attributes were Lastdate, pgift, rfa_2f=1, rfa_2f=4, and rfa_2a, totaling 5 attributes.

Uploaded by

Amelia Abera

Question 1: How many attributes do you now see in the attributes window? What possible
values do the new attributes take?

Question 2: Now, how many attributes do you see in the attributes window? What possible
values do the new attributes take?
Question 3: Select the attribute and look at the “Selected attribute” box. What “type” of
attribute do you now have? What is the label for the first category? What is the category with
the least number of observations?
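
Question 3 reads as if a numeric attribute has just been turned into a nominal one. The relation name recorded in the run information further down lists a Discretize filter with -B10 applied to attribute 2, so the following is a minimal sketch of equal-width binning into 10 bins, on that assumption. The function name and the sample values are hypothetical, and Weka's own category labels and boundary handling are not reproduced here.

```python
def equal_width_bins(values, bins=10):
    """Assign each numeric value to one of `bins` equal-width intervals.

    Returns a list of bin indices (0 = lowest interval). Intervals are
    treated as closed on the right, i.e. (a, b], which is an assumption
    made for this sketch.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical, so everything falls into one category.
        return [0] * len(values)
    # bins - 1 interior cut points split the range into `bins` intervals.
    cut_points = [lo + width * i for i in range(1, bins)]
    # A value's bin index is the number of cut points strictly below it.
    return [sum(1 for c in cut_points if x > c) for x in values]


# Hypothetical values, not taken from the lab dataset.
sample = [0, 10, 11, 100]
sample_bins = equal_width_bins(sample, bins=10)
print(sample_bins)
```

After discretization every value is replaced by the category of its interval, so the attribute becomes nominal, and counting the values per bin index shows which category has the fewest observations.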

Question 4: How many instances does Weka show in your dataset after sampling?
Question 5: What are the first three attributes ranked by information gain?

=== Run information ===

Evaluator: weka.attributeSelection.InfoGainAttributeEval
Search: weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N -1
Relation: learn-weka.filters.unsupervised.attribute.Remove-R10-
weka.filters.supervised.instance.Resample-B1.0-S1-Z10.0-
weka.filters.unsupervised.attribute.NumericTransform-R4-Cjava.lang.Math-Mlog-
weka.filters.unsupervised.attribute.NominalToBinary-N-R5-
weka.filters.unsupervised.attribute.Discretize-B10-M-1.0-R2-
weka.filters.unsupervised.instance.Resample-S1-Z70.0-
weka.filters.unsupervised.attribute.Remove-R12-
weka.filters.unsupervised.instance.Resample-S1-Z80.0
Instances: 5342
Attributes: 11
Income
Firstdate
Lastdate
pgift
rfa_2f=1
rfa_2f=2
rfa_2f=3
rfa_2f=4
rfa_2a
pepstrfl
target_b
Evaluation mode: evaluate on all training data
=== Attribute Selection on all input data ===

Search Method:
Attribute ranking.

Attribute Evaluator (supervised, Class (nominal): 11 target_b):


Information Gain Ranking Filter

Ranked attributes:
0.03570457 4 pgift
0.02043709 9 rfa_2a
0.01559746 5 rfa_2f=1
0.01508869 8 rfa_2f=4
0.00951277 10 pepstrfl
0.00679244 2 Firstdate
0.006746 3 Lastdate
0.00233461 1 Income
0.00215911 7 rfa_2f=3
0.00000465 6 rfa_2f=2

Selected attributes: 4,9,5,8,10,2,3,1,7,6 : 10
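
The scores in the ranking above are information gains with respect to the class target_b. In the Ranker options shown in the run information, -T is the discard threshold (set here to the most negative double) and -N -1 keeps every attribute, which appears to be why all 10 non-class attributes are listed. As a rough illustration of what the evaluator measures, here is a minimal sketch that computes H(class) - H(class | attribute) for nominal attributes and sorts by it. The toy data and function names are hypothetical, and numeric attributes would need to be discretized first, which this sketch does not do.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """H(class) - H(class | attribute) for one nominal attribute."""
    total = len(labels)
    # Group the class labels by the attribute value they occur with.
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, []).append(y)
    # Conditional entropy: weighted average of the entropy in each group.
    conditional = sum(len(ys) / total * entropy(ys)
                      for ys in by_value.values())
    return entropy(labels) - conditional


def rank_by_information_gain(columns, labels):
    """Return (attribute name, gain) pairs, highest gain first."""
    scores = {name: information_gain(vals, labels)
              for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data, not the lab dataset.
toy_class = ["1", "1", "0", "0", "1", "0", "0", "0"]
toy_columns = {
    "attr_a": ["hi", "hi", "lo", "lo", "hi", "lo", "lo", "lo"],
    "attr_b": ["x", "y", "x", "y", "x", "y", "x", "y"],
}
ranking = rank_by_information_gain(toy_columns, toy_class)
for name, gain in ranking:
    print(f"{gain:.8f} {name}")
```

In the toy data attr_a determines the class exactly, so its gain equals the class entropy, while attr_b carries only a little information and ranks second.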


Question 6: Which attributes were selected? How many are they?
=== Attribute Selection on all input data ===

Search Method:
Best first.
Start set: no attributes
Search direction: forward
Stale search after 5 node expansions
Total number of subsets evaluated: 60
Merit of best subset found: 0.031

Attribute Subset Evaluator (supervised, Class (nominal): 11 target_b):


CFS Subset Evaluator
Including locally predictive attributes

Selected attributes: 3,4,5,8,9 : 5


Lastdate
pgift
rfa_2f=1
rfa_2f=4
rfa_2a
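
The "Merit of best subset found" reported above is the score that the CFS evaluator assigns to a candidate subset. A minimal sketch of the merit formula as usually stated for CFS, merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), where k is the subset size, r_cf the mean attribute-class correlation and r_ff the mean attribute-attribute correlation, is given below. Weka estimates these correlations itself; the values passed in here are hypothetical, and the "locally predictive" post-processing step is not modeled.

```python
import math


def cfs_merit(class_correlations, mean_feature_correlation):
    """CFS merit of a subset.

    class_correlations: one attribute-class correlation per attribute
        in the subset (hypothetical inputs in this sketch).
    mean_feature_correlation: average pairwise correlation between the
        attributes in the subset.
    """
    k = len(class_correlations)
    if k == 0:
        return 0.0
    mean_class_correlation = sum(class_correlations) / k
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_correlation)
    return k * mean_class_correlation / denominator


# Hypothetical correlations: same relevance, different redundancy.
independent = cfs_merit([0.3, 0.3], mean_feature_correlation=0.0)
redundant = cfs_merit([0.3, 0.3], mean_feature_correlation=1.0)
print(independent, redundant)
```

The formula rewards attributes that correlate with the class and penalizes attributes that correlate with each other, which is one way a subset search can end up with a different set than the top of an information gain ranking (here Lastdate is selected while pepstrfl is not).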
