Using Machine Learning in Network Intrusion Detection Systems
OMAR SHAYA
Georg-August-Universität Göttingen
Sections
✤ Introduction
✤ Intrusion Detection Methodologies
✤ A Machine Learning Based IDS (Intrusion Detection System)
✤ Challenges of Using Machine Learning in Intrusion Detection
✤ Summary
✤ References
✤ Appendix
INTRODUCTION
IDS: Intrusion Detection System
Increasing attacks on computer networks and the need
for automated detection
• The Internet and computer systems have raised numerous security and privacy issues
• Network use has grown explosively for many reasons, e.g. the internet, wireless networks, and cloud computing
• Thus, malicious attacks on networks have increased year over year
• There is a need for automated systems that detect these attacks
• Detection is typically based on known attacks
• But what about attacks that have not been seen before?
• Machine learning?
INTRODUCTION
Definition: intrusion & intrusion detection
INTRODUCTION
“Intrusion is an attempt to compromise CIA (Confidentiality, Integrity, Availability), or to bypass the security mechanisms of a computer or network”
“Intrusion detection is the process of monitoring
the events occurring in a computer system or
network, and analyzing them for signs of intrusion”
INTRUSION DETECTION METHODOLOGIES
IDS: Intrusion Detection System
There are three main detection methodologies
• Signature-based Detection (SD)
• A signature is a string or pattern that corresponds to a known attack or threat
• SD compares these patterns against captured events to recognize possible intrusions (a minimal sketch follows this slide)
• Uses knowledge accumulated about specific attacks and system vulnerabilities
• Also known as Knowledge-based Detection or Misuse Detection
• Anomaly-based Detection (AD)
• An anomaly is a deviation from “normal” behavior
• Profiles of normal behavior are derived from monitoring network traffic
• AD compares normal profiles with observed events to recognize attacks
• Stateful Protocol Analysis (SPA)
• SPA depends on vendor-developed generic profiles for specific protocols
• The protocols are based on standards from international standards organizations
• Hybrid IDS use multiple methodologies
• SD and AD are complementary methods: the former targets known attacks and the latter focuses on unknown attacks
INTRUSION DETECTION METHODOLOGIES
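As a minimal illustration of the SD idea, the sketch below matches captured payloads against a small set of known signatures. The signature names, patterns, and the example payload are hypothetical and for illustration only, not rules from any real IDS.

```python
# Hypothetical signature-matching sketch (not rules from any real IDS):
# each signature is a pattern that corresponds to a known attack, and a
# captured event is reported if any pattern matches it.
import re

SIGNATURES = {
    "sql_injection": re.compile(r"union\s+select", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./\.\./"),
}

def match_signatures(payload: str) -> list[str]:
    """Return the names of all known signatures found in a captured payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]

# Example captured HTTP request line (made up)
print(match_signatures("GET /index.php?id=1 UNION SELECT password FROM users"))
# -> ['sql_injection']
```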
There are three main detection methodologies
• Hybrid IDS use multiple methodologies
• E.g. SD and AD are complementary methods
• SD targets known attacks, while AD focuses on unknown attacks
INTRUSION DETECTION METHODOLOGIES
Signature-based Detection (SD)*
• A signature is a string or pattern that corresponds to a known attack or threat
• SD compares these patterns against captured events to recognize possible intrusions
• Uses knowledge accumulated about specific attacks and system vulnerabilities
Anomaly-based Detection (AD)
• An anomaly is a deviation from “normal” behavior
• AD compares normal profiles with observed events to recognize attacks
• Profiles of normal behavior are derived from monitoring network traffic
Stateful Protocol Analysis (SPA)
• SPA depends on vendor-developed generic profiles for specific protocols
• “Stateful” indicates that the IDS can know and trace protocol states (e.g., pairing requests with replies)
• The protocols are based on standards from international standards organizations
* Also known as Knowledge-based Detection or Misuse Detection
Pros and cons of Intrusion Detection Methods
INTRUSION DETECTION METHODOLOGIES
Table 1: Pros and Cons of intrusion detection methodologies. Source [2]
PROS
• SD: Simplest and effective method to detect known attacks; detailed contextual analysis
• AD: Effective at detecting new and unforeseen vulnerabilities; less dependent on the OS; facilitates detection of privilege abuse
• SPA: Knows and traces protocol states; distinguishes unexpected sequences of commands
CONS
• SD: Ineffective against unknown attacks and variants of known attacks; little understanding of states and protocols; hard to keep signatures/patterns up to date; time consuming to maintain the knowledge
• AD: Weak profile accuracy, since profiles are built only from observed events; unavailable while behavior profiles are being rebuilt; difficult to trigger alerts at the right time
• SPA: Resource consuming protocol state tracing and examination; unable to inspect attacks that look like benign protocol behaviors; might be incompatible with dedicated OSs or APs
A MACHINE LEARNING BASED IDS
IDS: Intrusion Detection System
Machine learning in anomaly detection
• Anomaly-based Detection (AD)
• AD is easy when it is possible to characterize what is normal in the data using a simple mathematical model, e.g. a normal distribution
• Most interesting real-world systems have complex behavior that does not follow such a distribution
• Machine learning is useful for learning the characteristics of the system from observed data
• Feature Selection is the process of selecting a subset of relevant
features (variables, predictors) for use in model construction. Feature
selection techniques are used for three reasons:
• Simplification of models to make them easier to interpret
• Shorter training times
• Enhanced generalization by reducing overfitting
• Outlier Detection: an outlier is an observation point that is distant from
other observations
A MACHINE LEARNING BASED IDS
Robust Feature Selection and Robust PCA for Internet
Traffic Anomaly Detection
• Couples a feature selection algorithm with an outlier detection method
• Uses robust statistics tools in both procedures
• Reliable results even in the presence of outliers
• Feature selection based on a robust mutual information estimator (a simplified sketch follows this slide)
• MI (Mutual Information): an information-theoretic metric that captures both linear and non-linear dependencies
• Outlier detection based on robust PCA (Principal Component Analysis)
• A mathematical procedure used to reduce the dimensionality of a problem
A MACHINE LEARNING BASED IDS
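As a rough illustration of MI-based feature selection, the sketch below ranks synthetic features with scikit-learn. scikit-learn only provides the standard (non-robust) MI estimator, so this is a stand-in for the robust estimator used in [1]; the data and the choice of k are made up.

```python
# Illustrative MI-based feature selection on synthetic data. scikit-learn's
# mutual_info_classif is the standard (non-robust) MI estimator, used here
# only as a stand-in for the robust estimator of [1].
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))            # 500 traffic samples, 20 candidate features
y = (X[:, 3] + X[:, 7] > 0).astype(int)   # labels depend only on features 3 and 7

selector = SelectKBest(score_func=mutual_info_classif, k=5)
X_reduced = selector.fit_transform(X, y)  # keep the 5 highest-MI features
print("selected feature indices:", np.flatnonzero(selector.get_support()))
```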
Robust Feature Selection and Robust PCA for Internet
Traffic Anomaly Detection
• Feature selection
• Important preprocessing step (filter)
• Reduces dimensionality for high-dimensional data
• Removes irrelevant data
• Increases learning accuracy
• Gives significant performance gains
A MACHINE LEARNING BASED IDS
Robust Feature Selection and Robust PCA for Internet
Traffic Anomaly Detection
A MACHINE LEARNING BASED IDS
• Robust statistics
• Reliable results even in the
presence of outliers
Example (a minimal numeric sketch follows this slide):
• In a normal distribution, the inner 95% of values lie within “center ± 1.96 × spread”
• Center: instead of the mean, take the median
• Spread: instead of the SD (standard deviation), take the MAD (median absolute deviation)
Source [1]
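A minimal numeric sketch of the robust center/spread idea, assuming the usual 1.4826 factor that makes the MAD comparable to the standard deviation under normality; the data and the 1.96 cut-off are illustrative only.

```python
# Robust location/scale sketch: median instead of mean, MAD instead of SD.
# The 1.4826 factor makes the MAD consistent with the SD for normal data;
# the 1.96 cut-off mirrors the "inner 95%" rule from the slide.
import numpy as np

def robust_outliers(x, z=1.96):
    x = np.asarray(x, dtype=float)
    center = np.median(x)                              # robust center
    spread = 1.4826 * np.median(np.abs(x - center))    # robust spread (scaled MAD)
    return np.abs(x - center) > z * spread             # True where x is flagged

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 100), [15.0, 20.0]])  # two gross outliers
print(np.flatnonzero(robust_outliers(data)))  # flags the outliers (plus ~5% of regular points)
```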
Dataset creation for training and testing (1/2)
• Dataset collected by mirroring the traffic passing through the switch of:
• A private laboratory network with 17 inter-connected PCs
• 10 for users producing licit traffic
• 1 for a server, 1 for measurements
• 5 for attacks
• Licit traffic
• File sharing (BitTorrent)
• Video streaming (IPTV over TCP)
• Web browsing (HTTP)
• Attacks
• Botnets
• Port-scans: identify other targets vulnerable to infection
• Snapshots: a type of identity theft for stealing personal information
• Other botnet attacks, e.g. spyware, malware, denial of service, and email spam, are not used because they:
• Happen uniquely at the host level
• Can be detected by e.g. anti-virus software, monitoring at routers/firewalls, and email scanning
A MACHINE LEARNING BASED IDS
Dataset creation for training and testing (2/2)
• Customer usage profiles
• (a) Soft browsing (HTTP only)
• (b) File sharing machine (BitTorrent only)
• (c) File sharing user (BitTorrent and HTTP)
• (d) Heavy user (HTTP, BitTorrent, and
Streaming)
• Network scenarios
• (B) Business user
• 100% (a)
• (R) Residential user
• 30% (b), 40% (c), 30% (d)
• Attack intensities
• (1) 6% (5% snapshot, 1% port-scan)
• (2) 20% (15% snapshot, 5% port-scan)
• (3) 35% (30% snapshot, 5% port-scan)
A MACHINE LEARNING BASED IDS
Table 2. Source [1]
Results (1/3)
A MACHINE LEARNING BASED IDS
• 6 types of anomaly detectors, denoted A-B
• A: feature selection method, B: outlier detection method
• R (robust)
• NR (non-robust)
• ∅ (no method)
• Performance measures (computed as in the sketch after this slide)
• Nr Ftrs: number of selected features
• Recall: probability that an observation is classified as an anomaly when in fact it is an anomaly
• False positive rate (FPR): probability that an observation is classified as an anomaly when in fact it is a regular observation
• Precision: probability of having an anomalous observation given that it is classified as an anomaly
Table 3. Source [1]
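For illustration, the sketch below computes these three measures from made-up ground-truth and predicted labels; it is not taken from [1].

```python
# Toy computation of the three measures from made-up labels
# (1 = anomaly, 0 = regular observation).
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 0, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))

recall = tp / (tp + fn)      # P(classified as anomaly | anomaly)
fpr = fp / (fp + tn)         # P(classified as anomaly | regular)
precision = tp / (tp + fp)   # P(anomaly | classified as anomaly)
print(f"recall={recall:.2f}, FPR={fpr:.2f}, precision={precision:.2f}")
```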
Results (2/3)
• The R-R detector achieved the best results
• Recall is always 1
• In B1, B2, B3, and R3 performance is maximal
• FPR and Precision are close to their optimal values
• The improvement over the non-robust version is large
• Low recall means a large percentage of anomalies are not correctly identified
• In B2, B3, and R3, recall improved from 0.167, 0.273, and 0.125 (respectively) to 1
• Feature selection
• Feature selection reduces Nr Ftrs and improves performance
• In B3 and R3, no feature selection is sometimes better than non-robust feature selection
A MACHINE LEARNING BASED IDS
Table 3. Source [1]
Results (3/3)
A MACHINE LEARNING BASED IDS
• Compare R-NR (top) and R-R (bottom)
• Any point with a score or distance larger than a threshold (the lines) is considered an anomaly (a simplified PCA scoring sketch follows this slide)
• In the R-NR case there is confusion around snapshots
• Thus the poor recall value of 0.125
• The proximity in behavior between snapshots and some HTTP and BitTorrent traffic fools the non-robust outlier detector
• All consist of small file uploads
Source [1]
Fig. 2.
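As a simplified stand-in for the robust PCA detector of [1], the sketch below uses classical PCA (which is itself sensitive to outliers) and flags observations whose score distance exceeds a chi-square cut-off; the synthetic data, number of components, and cut-off level are assumptions for illustration.

```python
# Simplified PCA outlier scoring on synthetic data: classical PCA stands in
# for the robust PCA of [1], and points with a large score distance
# (standardized distance within the retained components) are flagged.
import numpy as np
from scipy.stats import chi2
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))   # regular traffic features (synthetic)
X[:5] += 8.0                     # five anomalous observations

pca = PCA(n_components=3).fit(X)
scores = pca.transform(X)
score_dist = np.sqrt(np.sum(scores**2 / pca.explained_variance_, axis=1))

cutoff = np.sqrt(chi2.ppf(0.975, df=3))   # common chi-square cut-off for score distances
print("flagged indices:", np.flatnonzero(score_dist > cutoff))  # should include 0-4
```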
Discussion
• There are clear advantages to using a feature selection step and to using robust statistics for both feature selection and outlier detection
• The system achieves very high performance
• The system’s anomaly detector adapts to different traffic conditions (licit traffic differs significantly in the two scenarios)
• However, the dataset was obtained from a private lab with 17 PCs and is not necessarily representative of a real-world scenario
• The system’s effectiveness still needs to be demonstrated on a larger-scale network traffic dataset
A MACHINE LEARNING BASED IDS
CHALLENGES OF USING MACHINE
LEARNING IN INTRUSION DETECTION
Outliers, cost of error, semantics, and evaluation
• Outlier detection
• It is hard to define normal in network traffic, as usage varies in every session and with new applications (diversity of network traffic)
• High cost of errors
• The cost of misclassification is extremely high
• False positives: waste expensive analyst time
• False negatives: can cause serious damage to an organization
• Errors in other ML applications, e.g. product recommendations, OCR, and spam detection, are far less expensive
• Semantic gap
• Currently only the capability to identify deviations from the normal profile is assessed (a deviation could be good or bad)
• Results need to be interpreted from the operator’s point of view: what does an alert actually mean?
• Difficulties with evaluation
• Designing a sound evaluation scheme can be more difficult than building the detector itself
• Lack of public data sets for assessing anomaly detection
• Real data sets are hard to obtain for many reasons, e.g. the risk of leaking personal data
• Simulated data is not accurate
CHALLENGES OF USING MACHINE LEARNING IN INTRUSION DETECTION
SUMMARY
Summary
• Introduction
• The need for automated Intrusion Detection Systems
• Definition of Intrusion and Intrusion Detection
• Intrusion Detection Methodologies
• Signature-based Detection (SD)
• Anomaly-based Detection (AD)
• Stateful Protocol Analysis (SPA)
• Machine Learning Based IDS
• Using feature selection and robust statistics
• Dataset creation
• Results and evaluation
• Discussion
• Challenges of Using Machine Learning in ID
• Outlier detection, high cost of error, semantic gap, and difficulties with evaluation
SUMMARY
OMAR SHAYA –––––––– omar.shaya@stud.uni-goettingen.de
Thanks!
References
[1] C. Pascoal, M. Oliveira, R. Valadas, P. Filzmoser, P. Salvador and A. Pacheco. Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection. In Proceedings of IEEE INFOCOM, pages 1755-1763, 2012
[2] H. Liao, C. Lin, Y. Lin and K. Tung. Intrusion Detection System: A Comprehensive Review. In Journal of Network and Computer Applications, pages 16-24, 2013
[3] R. Sommer and V. Paxson. Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. In IEEE Symposium on Security and Privacy, pages 305-316, 2010
[4] Feature Selection. https://en.wikipedia.org/wiki/Feature_selection (accessed 6 August 2015)
[5] Outlier. https://en.wikipedia.org/wiki/Outlier (accessed 6 August 2015)
[6] Anomaly Detection – Using Machine Learning to Detect Abnormalities in Time Series Data. http://blogs.technet.com/b/machinelearning/archive/2014/11/05/anomaly-detection-using-machine-learning-to-detect-abnormalities-in-time-series-data.aspx (accessed 6 August 2015)
REFERENCES
Precision and Recall
APPENDIX
Source: Dr. Stephan Sigg’s slides from Machine Learning and Pervasive Computing course SoSe 2015
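For reference, the standard formulas behind these measures, written in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN):
• Precision = TP / (TP + FP)
• Recall = TP / (TP + FN)
• False positive rate (FPR) = FP / (FP + TN)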