How Graphs are Changing AI
Graphs & AI
A Path for Enterprise Data Science
Amy Hodler @amyhodler
Director, Graph Analytics & AI Programs
Neo4j
Relationships
The Strongest Predictors of Behavior!
“Increasingly we're learning that you can make better predictions about people by getting all the information from their friends and their friends’ friends than you can from the information you have about the person themselves”
James Fowler
11
Predicting Financial Contagion
From Global to Local
12
Graph Is Accelerating AI Innovation
13
[Chart: AI Research Papers Featuring Graph, 2010 to 2018. Vertical axis: 0 to 4,000. Legend (Graph Technology Mentioned): graph neural network, graph convolutional, graph embedding, graph learning, graph attention, graph kernel, graph completion.]
Source: Dimension Knowledge System
Graph Data Science Applications
• Predictive Maintenance
• Churn Prediction
• Fraud Detection
• Life Sciences
• Recommendations
• Cybersecurity
• Customer Segmentation
• Search/MDM
Better Predictions with Graphs
Using the Data You Already Have
• Current data science models ignore network structure
• Graphs add highly predictive features to ML models, increasing accuracy
• Otherwise unattainable predictions based on relationships
Machine Learning Pipeline
15
Goals of Graph Data Science
Better Decisions
Higher Accuracy
New Learning and More Trust
16
Decision Support
Graph Based Prediction
Graph Native Learning
The Path of Graph Data Science
Decision Support
Graph Based Prediction
Graph Native Learning
17
Knowledge Graphs
Graph Analytics
Graph Feature Engineering
Graph Embeddings
Graph Neural Networks
The Path of Graph Data Science
Knowledge Graphs: graph search and queries; support domain experts
Graph Analytics
Graph Feature Engineering
Graph Embeddings
Graph Neural Networks
18
Knowledge Graph with Queries
Connecting the Dots has become...
19
Multiple graph layers of financial information
Includes corporate data with cross-relationships and external news
Knowledge Graph with Queries
Connecting the Dots
Dashboards and tools
• Credit risk
• Investment risk
• Portfolio news recommendations
• Typical analyst portfolio is 200 companies
• Custom relative weights
1 Week Snapshot: 800,000 shortest path calculations for the ranked newsfeed. Each calculation optimized to take approximately 10 ms.
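The shortest path calculations mentioned in the snapshot can be illustrated with a small weighted search. This is a minimal sketch of Dijkstra's algorithm over a dictionary-based graph; the deck does not say which algorithm or weighting the production system uses, so the function name, the node names, and the weights are all hypothetical.

```python
import heapq

def shortest_path_cost(graph, source, target):
    """Return the lowest total weight from source to target, or None if unreachable.

    `graph` maps a node to a dict of {neighbor: non-negative weight}.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return None

# Hypothetical graph: a portfolio company linked to a news item through related entities.
company_graph = {
    "PortfolioCo": {"Supplier": 1.0, "Partner": 4.0},
    "Supplier": {"NewsItem": 2.0},
    "Partner": {"NewsItem": 0.5},
}
print(shortest_path_cost(company_graph, "PortfolioCo", "NewsItem"))  # prints 3.0
```

A lower path cost could then be used to rank a news item higher for that company, with the edge weights standing in for the custom relative weights an analyst chooses.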
has become...
20
The Path of Graph Data Science
Knowledge Graphs
Graph Analytics: graph queries & algorithms for offline analysis; understanding structures
Graph Feature Engineering
Graph Embeddings
Graph Neural Networks
21
Local Patterns
Query (e.g. Cypher/Python)
• Fast, local decisioning and pattern matching
• You know what you’re looking for and are making a decision
Global Computation
Graph Algorithms (e.g. Neo4j library, GraphX)
• Global analysis and iterations
• You’re learning the overall structure of a network, updating data, and predicting
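Global, iterative computation can be made concrete with PageRank, one of the centrality algorithms named later in the deck. The following is a minimal sketch of the standard power-iteration formulation in plain Python; it is not the Neo4j or GraphX implementation, and the function name, the damping value of 0.85, and the toy graph are assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Approximate PageRank for a directed graph given as {node: [targets]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical three-node graph: a and b both point to c, and c points back to a.
toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
toy_rank = pagerank(toy_links)
```

Each pass touches every node and relationship, which is the contrast with a local query that only explores the neighborhood of one starting point.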
22
Graph Analytics via Queries
Detecting Financial Fraud
Improving existing pipelines to identify fraud via heuristics
Deceptively Simple Queries
• How many flagged accounts are in the applicant’s network 4+ hops out?
• How many login / account variables in common?
Add these metrics to your approval process
Difficult for RDBMS over 3 hops
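The first question on this slide, counting flagged accounts within a few hops of an applicant, amounts to a bounded breadth-first search. Here is a minimal sketch in plain Python; in practice this would more likely be a graph query, and the function name, account names, and links below are hypothetical.

```python
from collections import deque

def flagged_within_hops(graph, start, flagged, max_hops):
    """Count flagged accounts reachable from `start` in at most `max_hops` hops."""
    seen = {start}
    queue = deque([(start, 0)])
    count = 0
    while queue:
        node, depth = queue.popleft()
        if node != start and node in flagged:
            count += 1
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return count

# Hypothetical accounts linked by shared identifiers (undirected, stored both ways).
links = {
    "applicant": ["a1", "a2"],
    "a1": ["applicant", "a3"],
    "a2": ["applicant"],
    "a3": ["a1", "a4"],
    "a4": ["a3", "a5"],
    "a5": ["a4"],
}
flagged_accounts = {"a3", "a5"}
print(flagged_within_hops(links, "applicant", flagged_accounts, max_hops=4))  # prints 2
```

The resulting count is the kind of metric that could be added as one more column in an existing approval process.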
23
Graph Analytics via Algorithms
Generally Unsupervised
24
A subset of data science algorithms that come from network science, graph algorithms enable reasoning about network structure.
• Pathfinding and Search
• Centrality (Importance)
• Community Detection
• Heuristic Link Prediction
• Similarity
+45 Graph Algorithms in Neo4j
Pathfinding & Search
• Parallel Breadth First Search
• Parallel Depth First Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• Minimum Spanning Tree
• A* Shortest Path
• Yen’s K Shortest Path
• K-Spanning Tree (MST)
• Random Walk
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev, Wasserman & Faust
• Betweenness Centrality
• Approximate Betweenness Centrality
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• Balanced Triad (identification)
Similarity
• Euclidean Distance
• Cosine Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
• Approximate KNN
Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
25
There is significant demand for graph algorithms. Neo4j will be the first enterprise grade way to run them.
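Several of the link prediction measures listed on this slide are simple functions of two nodes' neighborhoods. The sketch below uses the common textbook definitions on an undirected graph stored as neighbor sets; the Neo4j library may differ in details such as direction handling, and the function name and toy graph are hypothetical.

```python
import math

def link_prediction_scores(adj, u, v):
    """Score how likely an edge between u and v is, from neighbor overlap.

    `adj` maps each node to the set of its neighbors (undirected graph).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    return {
        "common_neighbors": len(common),
        "total_neighbors": len(nu | nv),
        "preferential_attachment": len(nu) * len(nv),
        # Adamic Adar: shared neighbors with few connections count for more.
        "adamic_adar": sum(
            1.0 / math.log(len(adj[w])) for w in common if len(adj[w]) > 1
        ),
        # Resource Allocation: same idea, with a stronger degree penalty.
        "resource_allocation": sum(1.0 / len(adj[w]) for w in common),
    }

# Hypothetical graph: a and b are not connected but share neighbors c and d.
toy_adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
scores = link_prediction_scores(toy_adj, "a", "b")
```

Each score is a heuristic rather than a trained model, which is why the deck groups them as heuristic link prediction.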
The Path of Graph Data Science
Knowledge Graphs
Graph Analytics
Graph Feature Engineering: graph algorithms & queries for machine learning; improve prediction accuracy
Graph Embeddings
Graph Neural Networks
26
Graph Feature Engineering
Feature Engineering is how we combine and process the data to create new, more meaningful features, such as clustering or connectivity metrics.
Graph features add more dimensions to machine learning
EXTRACTION
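As an illustration of clustering and connectivity metrics used as features, the sketch below computes each node's degree and local clustering coefficient so they can be appended to a tabular training row. It is a minimal example under assumed names; the function, the feature names, and the toy graph are not from the deck.

```python
def graph_features(adj):
    """Return {node: {"degree": ..., "clustering": ...}} for an undirected graph.

    `adj` maps each node to the set of its neighbors.
    """
    features = {}
    for node, neighbors in adj.items():
        degree = len(neighbors)
        # Count pairs of neighbors that are themselves connected.
        linked_pairs = sum(
            1 for a in neighbors for b in neighbors if a < b and b in adj[a]
        )
        possible_pairs = degree * (degree - 1) / 2
        features[node] = {
            "degree": degree,
            "clustering": linked_pairs / possible_pairs if possible_pairs else 0.0,
        }
    return features

# Hypothetical graph: a, b, c form a triangle and d hangs off a.
triangle_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
node_features = graph_features(triangle_graph)
```

Joining `node_features` to an existing table by node ID is one way the extra dimensions could reach a conventional model.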
27
Feature Engineering using Graph Queries
Telecom-churn prediction
Churn prediction research has found that simple hand-engineered features are highly predictive
• How many calls/texts has an account made?
• How many of their contacts have churned?
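Both questions above can be answered by a single pass over call records. This is a minimal sketch, assuming each record is a (caller, receiver) pair and that the set of churned accounts is already known; the function name, feature names, and sample data are hypothetical and are not taken from the cited study.

```python
from collections import defaultdict

def churn_features(call_records, churned):
    """Build per-account features from (caller, receiver) call records."""
    calls_made = defaultdict(int)
    contacts = defaultdict(set)
    for caller, receiver in call_records:
        calls_made[caller] += 1
        # Treat a call in either direction as making the two accounts contacts.
        contacts[caller].add(receiver)
        contacts[receiver].add(caller)
    return {
        account: {
            "calls_made": calls_made[account],
            "churned_contacts": len(contacts[account] & churned),
        }
        for account in contacts
    }

# Hypothetical call detail records and churned accounts.
records = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("dan", "ann")]
churned_accounts = {"cat", "dan"}
features = churn_features(records, churned_accounts)
print(features["ann"])  # prints {'calls_made': 2, 'churned_contacts': 2}
```

The second feature is the connected one: it depends on the account's neighbors rather than on the account's own row.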
30
Feature Engineering using Graph Queries
Telecom-churn prediction
Add connected features based on graph queries to tabular data
Raw Data: Call Detail Records (Caller ID, Receiver ID, Time, Duration, Location, …)
Random Sample Selection
Input Data: CDR Sample (Caller ID, Receiver ID, Time, Duration, Location, …)
Feature Engineering
Test/Training Data: Call Stats by: Incoming, Outgoing, Per day, Short durations, In-network, Centrality, SMS’s, …
Output/Results
Identify Early Predictors: Select simple, interpretable metrics that are highly correlated w/churn
Churn Score: Supervised learning to predict binary & continuous measures of churn
31
Feature Engineering using Graph Queries
Telecom-churn prediction
89.4% Accuracy in Subscriber
Churn Prediction
Source: Behavioral Modeling for Churn Prediction by Khan et al, 2015
Feature Engineering using Graph Algorithms
Detecting Financial Fraud
Using Structure to Improve ML Predictions
• Connected components to identify disjoint groups sharing identifiers
• PageRank to measure influence and transaction volumes
• Louvain to identify communities that frequently interact
• Jaccard to measure account similarity
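Two of the fraud features on this slide are small enough to sketch directly: connected components over accounts and the identifiers they share, and Jaccard similarity between two accounts' identifier sets. This is an illustrative plain-Python version using union-find, not the Neo4j implementation; the function names and the account and identifier values are hypothetical.

```python
def connected_components(edges):
    """Group nodes into connected components using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical links between accounts and the identifiers they use.
shared = [
    ("acct1", "phone:555"),
    ("acct2", "phone:555"),
    ("acct2", "addr:9"),
    ("acct3", "addr:9"),
    ("acct4", "phone:777"),
]
components = connected_components(shared)
similarity = jaccard({"phone:555"}, {"phone:555", "addr:9"})
```

Here acct1, acct2, and acct3 fall into one component because they are chained together by a shared phone and a shared address, while acct4 stands alone.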
The Path of Graph Data Science
Knowledge Graphs
Graph Analytics
Graph Feature Engineering
Graph Embeddings: graph embedding algorithms for ML features; predictions on complex structures
Graph Neural Networks
33
Graph Embeddings
Embedding transforms graphs into a feature vector, or set of vectors, describing topology, connectivity, or attributes of nodes and relationships in the graph
• Node embeddings: describe connectivity of each node
• Path embeddings: traversals across the graph
• Graph embeddings: encode an entire graph into a single vector
Phases of DeepWalk Approach
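The slide's diagram of the phases did not survive extraction. As a hedged illustration of the first phase only, DeepWalk begins by sampling fixed-length random walks from every node; those walks are then treated like sentences and passed to a skip-gram model, which is omitted here. The function name, parameters, and toy graph below are assumptions.

```python
import random

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Sample uniform random walks, starting `walks_per_node` times from each node.

    `adj` maps each node to a list of its neighbors.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = sorted(adj)
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break  # dead end: keep the shorter walk
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical undirected graph stored as neighbor lists.
walk_graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(walk_graph, walk_length=5, walks_per_node=3)
```

Nodes that appear in similar walk contexts end up with similar vectors once the skip-gram step is applied, which is how connectivity becomes a node embedding.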
34
Graph Embeddings RECOMMENDATIONS
Explainable Reasoning over
Knowledge Graphs for Recommendations
35
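[Figure: knowledge graph with the nodes Derek, Tony, Ed Sheeran, ÷ Album, Castle on the Hill, I See Fire, Shape of You, Pop, and Folk; relationship types SungBy, IsSingerOf, Interact, Produce, and WrittenBy; values shown for Recommendations for Derek: 0.06, 0.24, 0.24, 0.26, 0.03, 0.30, .63]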
The Path of Graph Data Science
Knowledge Graphs
Graph Analytics
Graph Feature Engineering
Graph Embeddings
Graph Neural Networks: ML within a graph; new learning methods
36
“Graphs bring an ability to generalize about structure that the individual neural nets don't have.”
Next Major Advancement in AI: Graph Native Learning
Next Major Advancement in AI: Graph Native Learning
38
Implements machine learning in a graph environment
• Input data as a graph
• Learns while preserving transient states
• Output as a graph
• Track and validate AI decision paths
• More accurate with less data and training
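The idea of taking a graph as input and producing a graph as output can be illustrated with one round of neighbor aggregation, a common building block of graph neural networks. The deck does not specify an architecture, so this is only a sketch: it averages neighbor feature vectors and blends them with each node's own vector, with no learned weights or nonlinearity. The function name, the `self_weight` parameter, and the toy data are assumptions.

```python
def message_passing_step(adj, features, self_weight=0.5):
    """Blend each node's feature vector with the mean of its neighbors' vectors.

    `adj` maps node -> list of neighbors; `features` maps node -> list of floats.
    The result has the same nodes and vector sizes as the input.
    """
    updated = {}
    for node, vector in features.items():
        neighbors = adj.get(node, [])
        if neighbors:
            mean = [
                sum(features[n][i] for n in neighbors) / len(neighbors)
                for i in range(len(vector))
            ]
        else:
            mean = list(vector)  # isolated node keeps its own features
        updated[node] = [
            self_weight * own + (1.0 - self_weight) * agg
            for own, agg in zip(vector, mean)
        ]
    return updated

# Hypothetical star graph with two-dimensional node features.
star = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
node_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 3.0]}
after_one_step = message_passing_step(star, node_vectors)
print(after_one_step["a"])  # prints [0.5, 1.0]
```

A trained graph neural network would replace the fixed blend with learned weight matrices and stack several such steps, but the input and output keep the same graph shape.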
The Path of Graph Data Science
Decision Support
Graph Based Prediction
Graph Native Learning
39
Knowledge Graphs
Graph Analytics
Graph Feature Engineering
Graph Embeddings
Graph Neural Networks
Resources
• Business (AI Whitepaper): neo4j.com/use-cases/artificial-intelligence-analytics/
• Data Scientists: neo4j.com/sandbox
• Developers: neo4j.com/download
• neo4j.com/graph-algorithms-book
One Thing
43
“AI is not all about Machine Learning. Context, structure, and reasoning are necessary ingredients, and Knowledge Graphs and Linked Data are key technologies for this.”
Wais Bashir
Managing Editor, Onyx Advisory
44
Graphs & AI
A Path for Enterprise Data Science
Amy Hodler @amyhodler
Director, Graph Analytics & AI Programs
Neo4j
Graph Data Science
take your analytics one step further
45
