Neo4j, Inc. All rights reserved 2021
Document Discovery with Graph Data Science
Gary Mann
gary.mann@neo4j.com
Graph-Aided Discovery
● Customers are already very good at search.
○ Too much data, too many tools.
○ Data is highly variable - schemaless.
○ Need to focus analysts.
○ Start with unstructured search - then traverse and discover paths through data.
● Customers want to:
○ Leverage both structured and unstructured data.
○ Support/discover multiple relationships between pieces of information.
○ Understand how entities interact.
○ Navigate/Traverse across both structured and unstructured data visually.
Graph-Aided Discovery
● Why Graph?
○ Flexible and (mostly) schemaless model.
○ Unstructured queries to enter the graph.
○ Queries and pattern matching to traverse relationships.
○ Pathfinding algorithms to understand how entities are related.
○ Graph Analytics to leverage the overall graph structure.
○ Graphs can form the basis for many analytical use-cases including Discovery, Analytics, Investigations, etc.
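To illustrate the pathfinding bullet, here is a minimal sketch in plain Python of a breadth-first search that finds a fewest-hop path between two entities. It is not the Neo4j API. The graph contents, the node names, and the function name `shortest_path` are all hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one fewest-hop path from start to goal as a list, or None."""
    if start == goal:
        return [start]
    parents = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start, then reverse.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical undirected graph of people, documents, and an organization,
# stored as adjacency lists.
graph = {
    "Alice": ["Doc1"],
    "Doc1": ["Alice", "Acme"],
    "Acme": ["Doc1", "Doc2"],
    "Doc2": ["Acme", "Bob"],
    "Bob": ["Doc2"],
    "Carol": [],
}

print(shortest_path(graph, "Alice", "Bob"))
```

The returned path is the chain of documents and entities that connects the two people, which is the kind of answer an analyst would follow when exploring how entities are related.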
When Do I Need Graph Algorithms?
● Query (e.g. Cypher/Python): Local Patterns
○ Real-time, local decisioning and pattern matching.
○ You know what you’re looking for and are making a decision.
● Graph Algorithms: Global Computation
○ Global analysis and iterations.
○ You’re learning the overall structure of a network, updating data, and predicting.
Questions to Answer
● Local Patterns
○ How many X are related to Y?
○ How are X and Y related through multiple hops?
○ What characteristics do Y and Z share in common?
■ Are they the same entity?
MATCH p=(a:Person)-[:HAS_PHONE]->(b:Person) RETURN p
● Global Analytics
○ What are the important entities in my graph?
○ What data is related through its relationships in the graph?
○ Can I predict relationships that don’t explicitly exist?
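The local-pattern question about what two entities share can be sketched in plain Python as a set intersection. This is an illustration under assumptions, not the presenter's implementation: `HAS_PHONE` comes from the query above, while `HAS_EMAIL`, `HAS_ADDRESS`, the sample values, and the function name `shared_attributes` are hypothetical.

```python
def shared_attributes(attributes, a, b):
    """Return the (relationship, value) pairs that both entities have."""
    return attributes.get(a, set()) & attributes.get(b, set())


# Hypothetical entities, each with a set of (relationship type, value) pairs.
attributes = {
    "Y": {("HAS_PHONE", "555-0100"), ("HAS_EMAIL", "y@example.com")},
    "Z": {("HAS_PHONE", "555-0100"), ("HAS_ADDRESS", "1 Main St")},
    "X": {("HAS_PHONE", "555-0199")},
}

common = shared_attributes(attributes, "Y", "Z")
print(common)
```

A non-empty result, such as a shared phone number, only flags the pair as a candidate for being the same entity. Deciding that they really are the same entity would need further evidence.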
The Neo4j Graph Data Science Library
● 50+ Robust Algorithms
○ Pathfinding & Search
■ Deep path analytics
■ Optimal routing
○ Similarity
■ Evaluates how alike nodes are
■ Construct graphs from data
○ Community Detection
■ Detects group clustering
■ Partition options
○ Centrality / Importance
■ Identifies node importance
■ Influencer & Risk Identification
○ Link Prediction
■ Estimates likelihood of
■ Estimate missing information
○ Graph Embeddings
■ Learn your graph topology
■ Use for dimensionality reduction
● Flexible Analytics Workspace
○ Mutable In-Memory Workspace
○ Computational Graph
○ Native Graph Store
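To make the Centrality / Importance category concrete, here is a minimal PageRank power-iteration sketch in plain Python. It is an illustration only and is not the Graph Data Science library's implementation. The function name `pagerank`, the damping factor, the iteration count, and the sample graph are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a directed graph given as {node: [targets]}."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node, targets in links.items():
            if not targets:
                continue
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation-style graph: A, B, and D all point at C.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(links)
most_important = max(scores, key=scores.get)
print(most_important)
```

Unlike a local pattern match, every score here depends on the whole graph, which is why this kind of question falls under global computation.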
Supervised ML
● Graph-Native Feature Engineering
○ Queries
○ Algorithms
○ Embeddings
● Train Predictive Model
○ 1. Model Type
○ 2. Property Selection
○ 3. Train & Test
○ 4. Model Selection
● Apply Model to Existing / New Data
● Store Model in Database
● Use Predictions for Decisions
● Use Predictions to Enhance the Graph
● Publish & Share
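The feature-engineering step can be sketched as computing simple topological features for a pair of nodes, which could then feed a link-prediction model. This is a plain-Python illustration under assumptions, not the library's pipeline: the function name `pair_features`, the choice of features, and the sample graph are hypothetical, and no model is trained here.

```python
def pair_features(neighbors, a, b):
    """Return simple graph-derived features for the node pair (a, b)."""
    na = neighbors.get(a, set())
    nb = neighbors.get(b, set())
    common = na & nb
    union = na | nb
    return {
        # Number of neighbors the two nodes share.
        "common_neighbors": len(common),
        # Shared neighbors as a fraction of all neighbors of either node.
        "jaccard": len(common) / len(union) if union else 0.0,
        # Product of the two degrees.
        "preferential_attachment": len(na) * len(nb),
    }


# Hypothetical undirected graph stored as neighbor sets.
neighbors = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

features = pair_features(neighbors, "A", "D")
print(features)
```

Each dictionary is one row of a training table. Pairs that are already connected would serve as positive examples and unconnected pairs as negative ones.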
Example Use-Case
1. Ingest Data and Derive Relationships from Unstructured pieces
2. Leverage Graph Analytics to Enhance the Data/Discovery Process
3. Enable Unstructured Search on the Graph
4. Enable Visual Exploration through Neo4j Bloom
5. Tie it all together in a simple Web Application
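Step 1 can be sketched as extracting entities from raw text and linking documents that mention the same entity. The following plain-Python sketch makes several assumptions: it treats email addresses as the only entity type, uses a deliberately simple regular expression, and invents the relationship names `MENTIONS` and `SHARES_ENTITY` and the function name `derive_relationships`. The demo's actual ingestion approach is not described in the slides.

```python
import re
from collections import defaultdict
from itertools import combinations

# Simplified email pattern, good enough for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def derive_relationships(documents):
    """Return (MENTIONS edges, SHARES_ENTITY edges) derived from raw text."""
    mentions = set()
    docs_by_entity = defaultdict(set)
    for doc_id, text in documents.items():
        for email in EMAIL.findall(text):
            email = email.lower()  # normalize so case variants match
            mentions.add((doc_id, "MENTIONS", email))
            docs_by_entity[email].add(doc_id)
    shared = set()
    for doc_ids in docs_by_entity.values():
        # Link every pair of documents that mention the same entity.
        for left, right in combinations(sorted(doc_ids), 2):
            shared.add((left, "SHARES_ENTITY", right))
    return mentions, shared


# Hypothetical unstructured documents.
documents = {
    "doc1": "Invoice sent to jane@example.com on Monday.",
    "doc2": "Please reply to Jane@Example.com.",
    "doc3": "No contact details here.",
}

mentions, shared = derive_relationships(documents)
print(sorted(mentions))
print(sorted(shared))
```

The resulting edge tuples could then be loaded into a graph so that analysts can traverse from one document to related ones through the entities they share.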
Custom Application
Demo
Takeaways
● Graphs can provide context to information discovery applications.
○ Data relationships are key.
○ Both structured and unstructured data.
○ Exploration and analytics.
○ Focus analysts along paths.
○ Graph Data Science algorithms are used to enhance the graph.
● Enhanced Document Graphs can form the basis of many discovery and analytical use cases.
Thank you!
Contact us at
gary.mann@neo4j.com
