KEMBAR78
ROC Curve Guide for Data Analysts | PDF | Receiver Operating Characteristic | Computational Science
0% found this document useful (0 votes)
84 views16 pages

ROC Curve Guide for Data Analysts

This document introduces receiver operating characteristic (ROC) curves, which are used to visualize and evaluate the performance of binary classifiers. ROC curves plot the true positive rate against the false positive rate for different classification thresholds. The area under the ROC curve (AUC) provides a single measure of classifier performance, where an AUC of 1 represents perfect classification and 0.5 represents random chance. The document demonstrates how to generate ROC curves using different thresholds on a probabilistic classifier's output and compares the performance of two classifiers by examining their ROC curves and AUC values. It also introduces the ROCR R package for plotting ROC curves to evaluate machine learning algorithms like support vector machines and neural networks.

Uploaded by

aaamoumenaaaa
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
84 views16 pages

ROC Curve Guide for Data Analysts

This document introduces receiver operating characteristic (ROC) curves, which are used to visualize and evaluate the performance of binary classifiers. ROC curves plot the true positive rate against the false positive rate for different classification thresholds. The area under the ROC curve (AUC) provides a single measure of classifier performance, where an AUC of 1 represents perfect classification and 0.5 represents random chance. The document demonstrates how to generate ROC curves using different thresholds on a probabilistic classifier's output and compares the performance of two classifiers by examining their ROC curves and AUC values. It also introduces the ROCR R package for plotting ROC curves to evaluate machine learning algorithms like support vector machines and neural networks.

Uploaded by

aaamoumenaaaa
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 16

An Introduction to ROC Curve (Receiver Operating Characteristics)

Ming-Chang Lee Department of Information Management Yu Da College of Business

1/16

Outline
1. 2. 3. 4. 5.

Introduction Create an ROC curve Area Under an ROC Curve (AUC) R-package ROCR demo References

2/16

1. Introduction
History : Signal detection theory hit rates and false alarm rates Development:
Diagnostic system Medical decision making Machine learning

3/16

Classier Performance
Problem: two classes classification
Classification model Input (instance, I )

Actual class Actual class {p n} { p ,, n }


PS: actual class {p: positive class, n: negative class}

Predicted class classified {Y,N ) (instance, I }

4/16

Confusion matrix (Contingency table)


Given a classifier and an instance:
Classifier TRUE CLASS

Predicted class

p (positive)
True Positives False Negatives P

n (negative)
False Positives True Negatives N

Y N
Total

P = True Positives + False Negatives

5/16

Performance index
TP FN FP TN

TP FP TPR = = Recall , FPR = P N TP TP + TN Precision = , Accuracy = TP + FP P+N Sensitivity = Recall , Specificity = 1 FPR
6/16

ROC curve
Y axis: TPR X axis: FPR
(0,1)
Benefits (TP) Costs (FP)

(1,1)

(0,0)

(1,0)
7/16

Compare ROC curve

TP FN

FP TN

y=x

(0,0) Numbers of P =0, No FP error, No TP (0,1) perfect D classifiers Northwest location is better. Near x axis and on the left side Conservative e.g. A vs. B Near upper right-hand side Liberal Lower Lower Right ? Right ? (?) Triangle Triangle
8/16

2. Create an ROC curve


A ranking or scoring classifier can be used with a threshold to produce a binary classifier. If the classifier output is above the threshold, the classifier produces a Y, else a N.

9/16

Use thresholds to create ROC curve


(0.1,0.5)

If threshold =0.54 Numbers of Score 0.54

5 5 10

1 9 10

6 14 20

1 x : = 0.1 10 5 y : = 0.5 10

10/16

f(i) : the probabilistic classifier's estimate that instance i is positive;


min and max, the smallest and largest values returned by f; increment : the smallest difference between any two f values.

L Inputs: the set of test instances;

Conceptual Algorithm

11/16

Practical Algorithm

12/16

3. Area Under an ROC Curve (AUC)


AUC (Bradley, 1997) Wilcoxon test of ranks Area : Classifier B > A Average performance B>A

13/16

4. R demo ROCR package


package : ROCR plot ROC curve plot SVM vs. Neural Network

14/16

4. References
1.

Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms, Pattern Recognition, 30 (7), 1145-1159. Fawcett, T. (2003) ROC Graphs: Notes and Practical Considerations for Data Mining Researchers, HP Laboratories technical report. Witten, I.H. and Frank, E. (2005) Data Mining: Practical Machine Learning Tools and Techniques, Second Edition, Morgan Kaufmann. The magnificent ROC: http://www.anaesthetist.com/mnm/stats/roc/
15/16

2.

3.

4.

THANKS
Q&A
Web: http://web.ydu.edu.tw/~alan9956/ Email: alan9956@webmail.ydu.edu.tw
16/16

You might also like