
Weka Tutorial

This document provides an overview of the Weka machine learning toolkit and outlines exercises for learning how to use Weka to preprocess data, build classifiers, visualize data and models, perform clustering, and generate association rules. The exercises guide users through loading various datasets into Weka and applying common machine learning algorithms like decision trees, Naive Bayes, neural networks, and association rule learning. Performance of the models is evaluated using measures like accuracy, confusion matrices, and rule support.
Copyright
© Attribution Non-Commercial (BY-NC)

1. Downloading and Installing Weka (version 3.6)
Website: http://www.cs.waikato.ac.nz/ml/weka/
You can also use the user manual and the documentation installed with Weka, or download them from the website.

2. Weka Textbook
The primary reference for the Weka tutorials is Witten and Frank's book Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition.

3. Downloadable Datasets
The largest and best-known library of machine learning datasets is the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/index.html).

These tutorial exercises introduce Weka and ask you to try out several machine learning, visualization, and preprocessing methods on a wide variety of datasets:
A. Learners: decision tree learner (J48), instance-based learner (IBk), Naive Bayes (NB), Naive Bayes Multinomial (NBM), support vector machine (SMO), association rule learner (Apriori)
B. Meta-learners: filtered classifier, attribute-selected classifier
C. Visualization: visualize datasets, decision trees, classification errors
D. Preprocessing: remove attributes and instances
E. Testing: on the training set, on a supplied test set, using cross-validation; confidence and support of association rules

Exercise 1:
Set up your environment and start the Explorer. Look at the Preprocess, Classify, and Visualize panels.
1. Load a dataset (weather nominal) and look at it. Apply a filter (to remove attributes and instances).
2. In Visualize:
   i. Load a dataset (iris) and visualize it.
   ii. Examine instance info.
3. In Classify:
   i. Consider the weather nominal dataset; you will find it in the data folder under the Weka directory (for example, c:\Program Files\Weka-3-6\data). Now build a decision tree.
      - Using different techniques
      - List five criteria for evaluating these classification methods
   ii. Examine the tree in the Classifier output panel.
   iii. Visualize the tree.
   iv. Interpret classification accuracy and the confusion matrix.
   v. Visualize classifier errors.

Exercise 2:

Import the dataset segment-challenge.arff; you will find it in the data folder under the Weka directory (for example, c:\Program Files\Weka-3-6\data). Apply the following tasks to it:
1. Apply the J48 decision tree (using 10-fold cross-validation and a percentage split), MultilayerPerceptron (using 10-fold cross-validation and a percentage split), and the Naive Bayes classifier (using the training data, 10-fold cross-validation, and a percentage split). Then:
   i. Visualize the curves.
   ii. Compare the confusion matrices.
   iii. Compare the accuracy.
   Try to analyze the accuracy estimates on (1) the training data, (2) cross-validation, and (3) a train/test split. Report major findings. Repeat the above questions using some other tree-, function-, or rule-based method.
2. Apply the SimpleKMeans clusterer, specifying 6 clusters (using the classes-to-clusters evaluation mode).
   i. Apply Visualize (clusters axis, Variable: Y: by changing color, etc.).
   ii. Repeat after changing the cluster mode.
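
The test options named in task 1 above (training set, 10-fold cross-validation, percentage split) differ only in which instances build the model and which are held out to score it. The sketch below is not Weka code. It is a minimal Python illustration, under stated assumptions, of how held-out indices could be chosen and how accuracy is read off a confusion matrix. The function names, the seed, and the six toy labels are hypothetical. The 66% default mirrors the Explorer's usual percentage-split setting. Weka also stratifies folds by class for nominal targets, which this sketch omits.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=1):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size.

    Each fold serves once as the test set; the other k-1 folds train the model.
    (Assumption: plain shuffling, no stratification by class.)
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def percentage_split(n, train_percent=66, seed=1):
    """Shuffle 0..n-1 and return (train indices, test indices)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_percent / 100)
    return idx[:cut], idx[cut:]


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Correct predictions lie on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for six test instances (not taken from any Weka dataset).
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

m = confusion_matrix(actual, predicted, ["yes", "no"])
print(m)                      # [[2, 1], [1, 2]]
print(round(accuracy(m), 3))  # 0.667
```

When comparing the three estimates, keep in mind that accuracy measured on the training set is usually optimistic, because the model has already seen every instance it is scored on.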

3. Import the contact-lenses.arff dataset and apply the PredictiveApriori algorithm to extract the hidden rules in the data.

Exercise 3:
Introduce the datasets vote, weather nominal, and supermarket.
1. Apply an association rule learner (Apriori):
   i. Discuss the meaning of the rules.
   ii. Identify the support and the number of instances predicted correctly for certain rules.
2. Make association rules for the supermarket dataset:
   i. Load supermarket.
   ii. Generate association rules and discuss some inferences you would make from them.
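
Exercise 3 asks for the support of a rule and the number of instances it predicts correctly. As a rough guide to what those numbers mean, the following Python sketch (not Weka code) counts them for a single rule over a toy basket list. The transactions, item names, and function names are all hypothetical and are not drawn from the supermarket dataset.

```python
def support_count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t)


def rule_stats(transactions, antecedent, consequent):
    """Return (covered, correct, support, confidence) for antecedent ==> consequent.

    covered    = instances matching the antecedent
    correct    = instances matching antecedent AND consequent
    support    = correct / total number of instances
    confidence = correct / covered
    """
    n = len(transactions)
    covered = support_count(transactions, antecedent)
    correct = support_count(transactions, set(antecedent) | set(consequent))
    support = correct / n
    confidence = correct / covered if covered else 0.0
    return covered, correct, support, confidence


# Hypothetical market baskets, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

covered, correct, support, confidence = rule_stats(
    transactions, {"bread"}, {"milk"}
)
print(covered, correct)     # 4 3
print(support, confidence)  # 0.6 0.75
```

In Apriori's rule listing, the count printed before the arrow should correspond to `covered` and the count after it to `correct`, with the confidence shown as their ratio; check this against the rules Weka prints for the weather nominal data.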
