Data Science Transforming Security Operations
SESSION ID: STR-R02
Dr. Alon Kaufman
Director of Data Science & Innovation, RSA
#RSAC
Data Science & Security Operations?
Who uses data science in their security practice?
In what processes throughout your security operations do you use data science?
Have you seen a significant value come out of your data science solutions?
Do you see data science playing a role in the cybersecurity market shift: “By 2020,
60% of enterprise information security budgets will be allocated for rapid detection
and response approaches, up from less than 20% in 2015” (Gartner)?
Data Science has way more to offer than prevention & detection... It
can and should be used as a key methodology and technology
spanning all processes in security operations….
Agenda
What is data science, and why in security?
You should know by now ;)
What's special about data science in security
5 Maturity levels of data science in security operations
Data science goes way beyond the prevention & detection in the entry level…
DS maturity survey
Where is your organization/product in terms of DS maturity?
Building a security data science practice in house, Yes or No?
Summary
What is Data Science – in 1 Sentence
Making sense out of big data…
Getting the data we collect to work for us!
The demand is just growing… (chart; axis label: Ratio)
Why Data Science in Security?
We already have all (most) of the data… yet we are still being breached… while the attacks are
hidden in our data
Security operations are getting too complex for humans alone… and we are facing a huge
staffing gap…
Other industries demonstrated huge value with DS, given a hard problem and the relevant
data at hand:
Retail recommendation systems, up-sell, cross-sell
Bio-informatics
Image object recognition
Voice recognition
Self-driving cars
…
What's Special About Data Science in Security?
Dealing with a hostile dynamic world!
Human/Machine synergy
High price of False-Negative errors
Gathering/Sharing data
Lack of labeled attacks for training and learning
In security, detection is just the beginning….
5 Levels of Data Science Maturity
Detection
•Known bad
•Adaptive learning
•Integrated scoring
Augmented Investigation
•Aggregate
•Prioritize
•Automate & Recommend
Continuous Learning
•Basic feedback
•Derived feedback
•Learning from analyst actions
Intelligence Sharing
•IoCs
•Global learning
•Policies
Response
•Limit
•Block
•User-support
Key message: Data science is a key methodology and technology, not a plug-in feature…
Detection: The Holy Grail of Data Science….
The data exists, and so do endless point solutions for detection
The key to success is comprehensive risk scoring with an integrated approach, where the risk combines:
Known bad patterns
Behavior anomaly
Entity anomaly
Comprehensive Risk Score - Example
Suspicious User Login Detection
Multivariate Machine Learning algorithm to detect login
impersonation
Multiple inputs from multiple sources:
Hostname, location, server, duration, auth, time of day, data tx/rx,….
Model output
Risk score (combined measure of how risky the behavior is)
Modeling concept:
Known bad: blocked users, unrealistic ground-speed, authentication
User anomaly: baseline per feature and detect deviation from the norm
Peer group anomaly: Prior knowledge, new user, acceptable
behavior changes
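The modeling concept above can be sketched in a few lines. This is a minimal illustration, not the model from the talk: the function names, the 1000 km/h ground-speed limit, the single "session duration" feature and the squashing function are all assumptions chosen for brevity. It shows a known-bad rule (blocked user, unrealistic ground speed) overriding a per-user baseline anomaly score.

```python
import math
from statistics import mean, pstdev

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def ground_speed_kmh(prev, curr):
    """Implied travel speed between two logins given as (lat, lon, epoch_seconds)."""
    hours = max((curr[2] - prev[2]) / 3600.0, 1e-6)  # guard against zero elapsed time
    return haversine_km(prev[0], prev[1], curr[0], curr[1]) / hours

def anomaly(value, history):
    """Deviation from the user's own baseline, squashed into [0, 1)."""
    sd = pstdev(history) or 1.0  # flat history: fall back to unit spread
    z = abs(value - mean(history)) / sd
    return 1.0 - math.exp(-z / 3.0)  # assumed squashing; any monotone map would do

def login_risk(prev_login, curr_login, duration_min, duration_history,
               blocked=False, max_speed_kmh=1000.0):
    # Known bad: a blocked user or physically implausible travel is maximal risk.
    if blocked or ground_speed_kmh(prev_login, curr_login) > max_speed_kmh:
        return 1.0
    # User anomaly: one feature only; a real model would combine many features.
    return anomaly(duration_min, duration_history)
```

A peer-group anomaly term could be added the same way by calling `anomaly` with the history of similar users instead of the user's own history.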
Integrating Different Approaches - Example
Endpoint Malware Detection
The market is highly fragmented with endless point
solutions
Each vendor/solution takes a different valid
approach with pros and cons
Combining them provides enhanced performance:
Human
Static analysis
Dynamic analysis
Community reputation
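One simple way to combine the approaches listed above is a weighted average of per-approach scores. The sketch below is an assumption for illustration only: the weights, the key names and the convention that every approach reports a score in [0, 1] are not from the talk, and a production system would more likely learn the combination from data.

```python
# Hypothetical weights per approach; they are illustrative, not tuned values.
WEIGHTS = {"static": 0.3, "dynamic": 0.4, "reputation": 0.2, "human": 0.1}

def combined_malware_score(scores, weights=WEIGHTS):
    """Weighted average of per-approach scores in [0, 1].

    Approaches with no verdict (missing or None) are skipped and the
    remaining weights are renormalized.
    """
    used = {k: w for k, w in weights.items() if scores.get(k) is not None}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scores available")
    return sum(scores[k] * w for k, w in used.items()) / total
```

Skipping missing verdicts lets the score degrade gracefully when, for example, no analyst has reviewed the sample yet.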
Augmented Investigation
The goal is not to replace the analysts but to augment them and simplify
their work:
Shortage of cybersecurity skills continues to grow
Most of analysts’ time goes to selecting which alerts to investigate
Attacks typically trigger multiple alerts throughout the different
attack phases
70% of the procedures done by analysts are repeatable
The Key to success:
Prioritize
Aggregate
Automate & Recommend
Shortage in cybersecurity skills (ESG, 2016): 23% (2013), 25% (2014), 28% (2015), 46% (2016)
Augmented Investigation - Example
Top-down Hierarchical approach
Pre-fetch all supporting data
Risk scoring prioritization
Aggregate across entities (user,
devices, application, …)
Moving from alerts to attack
vectors
Guide the analyst with
recommendations
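The aggregate-then-prioritize idea above can be sketched as follows. This is a minimal illustration under stated assumptions: the alert fields (`entity`, `risk`, `time`) are hypothetical, and the noisy-OR combination is one possible choice for merging alert risks, not something the talk prescribes.

```python
from collections import defaultdict

def build_queue(alerts):
    """Group alerts by entity and rank the groups by combined risk."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert["entity"]].append(alert)

    cases = []
    for entity, items in groups.items():
        # Noisy-OR: several moderate alerts on one entity add up to high risk.
        miss = 1.0
        for a in items:
            miss *= 1.0 - a["risk"]
        cases.append({
            "entity": entity,
            "risk": 1.0 - miss,
            # Chronological order helps the analyst read the alerts as one story.
            "alerts": sorted(items, key=lambda a: a["time"]),
        })
    # One integrated priority queue: highest combined risk first.
    return sorted(cases, key=lambda c: c["risk"], reverse=True)
```

With this scheme, three alerts of risk 0.5 on the same user outrank a single alert of risk 0.8 on another user, which is the "moving from alerts to attack vectors" effect in miniature.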
Continuous Learning
As in any learning, “teachers” are beneficial –
supervised learning
Feeding back results to the learning engine
When direct feedback is lacking it can be
derived
Learning from analyst behavior and actions
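A feedback loop of this kind can be sketched with an online logistic regression, where each confirmed or derived label nudges the model. Everything here is an assumption for illustration: the talk does not name an algorithm, and the action names in `derived_label` are hypothetical.

```python
import math

def predict(weights, features):
    """Probability that the event is malicious under a logistic model."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def learn(weights, features, label, rate=0.1):
    """One stochastic-gradient step of logistic regression on a single example."""
    error = predict(weights, features) - label
    return [w - rate * error * x for w, x in zip(weights, features)]

def derived_label(analyst_action):
    """Derive a label from what the analyst did when no explicit verdict exists.

    The mapping is hypothetical; unknown actions return None and are not learned from.
    """
    mapping = {"escalated": 1, "blocked_account": 1, "closed_no_action": 0}
    return mapping.get(analyst_action)
```

Direct feedback would call `learn` with the analyst's explicit verdict; derived feedback would call it with `derived_label(action)` whenever that is not `None`.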
Learning and Self-Improving Detection - Example
Ongoing, automatic self-learning fraud detection model
Diagram (labels recovered from the slide): activity details (user, device, account, payee) -> data science based risk engine -> policy manager -> continue, step-up authentication (challenge: out-of-band, knowledge, others) or deny; feedback flows from authentication and case management back to the risk engine
Intelligence Sharing
Waze: “Outsmarting traffic, together.”
A tiny part of the road from each user -> analytics -> map + prediction + navigation instructions
Crowdsourced security intel:
Security map + predictions + mitigation instructions
To date, the industry's state-of-the-art sharing centers on IoCs; the next phase is
to share, learn and crowdsource policies, procedures & mitigations
Fighting Back Together - Example
Response
Taking automatic actions based on insights:
Limit access / Require additional input
Risk based authentication
Partial blocking
Automatic blocking
Guide the analyst through investigation
Pre-fetch all required data
Recommend next action
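The automatic actions above amount to mapping a risk score onto an escalating set of responses. The sketch below assumes a risk score in [0, 1]; the threshold values and action names are hypothetical and would in practice come from policy.

```python
def choose_response(risk, step_up=0.5, partial_block=0.8, block=0.95):
    """Pick the least intrusive action that matches the risk level.

    Thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be in [0, 1]")
    if risk >= block:
        return "block"                   # automatic blocking
    if risk >= partial_block:
        return "partial_block"           # limit access to sensitive operations
    if risk >= step_up:
        return "step_up_authentication"  # require additional input
    return "allow"
```

Keeping the thresholds as parameters lets a policy owner tune how aggressive the automatic response is without touching the scoring model.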
Survey: How DS-Mature Are Your Operations?
(How many fields? (5) Overall score? (22 points))
Detection
•Do you use advanced, adaptive analytics for detection?
•Can you bake your human insights into the analytics engines?
•Do you have your various products integrated at the analytics level?
Augmented Investigation
•Can you combine multiple alerts into some attack description?
•Do you have one integrated priority queue?
•Do you utilize automatic enrichments, hints, guidance or recommendations to assist analysts?
Continuous Learning
•Do you leverage analysts' decisions for operations improvement?
•Do you have any level of automatic self-learning from feedback?
•Do your overall operations improve based on your analysts' work?
Intelligence Sharing
•Do you utilize community data to improve operations?
•Do your systems “learn” from data outside of your system?
•Do you have a mechanism to improve human actions based on the community?
Response
•Do you use automatic response based on analytics?
•Are any decisions or actions fed back to analysts as a result of the risk?
Building a Security Data Science Practice in
House, Yes or No?
Applying data science requires a joint effort between data
scientists, security experts and the business owners
To date, hiring people with a data science background is hard,
let alone people who also have security domain knowledge
From research to an operational process/product – a long
journey from the proof-of-signal to an operational system
Data, Data, Data….
You don’t want data science… you actually want data science
baked into your solution in an intuitive, easy-to-use manner
Alignment from stakeholders
Invest in staffing and diverse backgrounds
Organization & operational breadth
Collaborate / share
Integrated home-grown solution
Applying What You Have Learned Today
Take the survey and assess how advanced your DS strategy is
Identify gaps and the areas where focus is needed
Work up the DS stairs:
Detection -> Investigation -> Continuous Learning -> Intelligence Sharing -> Automatic response
(Risk based response)
Data Science in house:
Alignment cross-org
Staff wisely
Be prepared for a long (and expensive) journey
Constantly strive to see how DS augments your analysts, rather than trying to replace them!
Summary
Data Science has way more to offer than prevention & detection ...
It can and should be used as a key methodology and technology
spanning all processes in security operations…
Dr. Alon Kaufman
Director of Data Science & Innovation, RSA
Alon.Kaufman@rsa.com