LLM Knowledge Graph Builder — First Release of 2025 | by Michael Hunger | Neo4j Developer Blog | Feb, 2025 | Medium
11/02/25, 16:20
LLM Knowledge Graph Builder —
First Release of 2025
New features include community summaries, parallel retrievers,
and expanded model support for better knowledge graph
construction from text
Michael Hunger
Published in Neo4j Developer Blog · 9 min read · 5 days ago
Many developers try to build retrieval-augmented generation (RAG)
experiences to interact with information from unstructured data using
only vector search and struggle to get to the results that they want.
Looking only at text fragments without context only gets you so far. As
usual in data engineering, there are more advanced patterns for
preprocessing the data and extracting knowledge, one of which is
GraphRAG. By the time you get around to using the data, you’ve surfaced the
underlying concepts and can use them to connect the pieces and
provide relevant context for a user’s questions.
Introduction
We built, open-sourced, and hosted the LLM Knowledge Graph Builder to
let you try out better ways of treating your unstructured data. We
preprocess documents, transcripts, web articles, and more sources into
chunks, compute text embeddings, and connect them (lexical graph).
But we don’t stop there. We also extract entities and their relationships,
which is especially relevant if you ingest multiple documents because you
can relate the pieces spread out over multiple sources (entity graph).
This combined knowledge graph then enables a set of different retrievers
to fetch data (see below).
Since we launched the LLM Knowledge Graph Builder in June 2024, we’ve
had an impressive amount of usage and great feedback from users. It’s
now the fourth most popular source of user interaction on AuraDB Free,
which makes us really happy.
We did a release in fall 2024, but there were too many AI events, which
left me no time to write a blog post. Over the past few months, the
team worked on really nice features, some of which we want to
introduce today in the first release of 2025.
What Does the LLM Knowledge Graph Builder Do?
For those of you who don’t know what the tool does, here’s a quick
introduction.
If you have a number of text documents, web articles, Wikipedia pages,
or similar unstructured information, wouldn’t it be great to surface all the
knowledge hidden inside those in a structured way and then use those
entities and their relationships to better chat with your data?
The LLM Knowledge Graph Builder:
1. Imports your documents
2. Splits them into chunks and links them up
3. Generates text embeddings for vector search and connects the most
similar ones
4. Uses a variety of large language models (LLMs) to extract entities and
their relationships
5. Optionally uses a graph schema you provide to guide the extraction
6. Stores the nodes and relationships in Neo4j
7. When running against a graph data science-enabled Neo4j
instance, also performs topic clustering and summarization
Get a quick overview of the process and try it out at
https://llm-graph-builder.neo4jlabs.com.
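Steps 2 and 3 of the list above can be sketched in a few lines of Python. This is a minimal illustration and not the tool's own code: the function names, the `NEXT_CHUNK` and `SIMILAR` relationship names, and the toy bag-of-words `embed` function are assumptions made for the example. A real pipeline would call an embedding model and store the result in Neo4j.

```python
import math
from collections import Counter


def split_into_chunks(text, max_words=8):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(chunk):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(word.lower().strip(".,") for word in chunk.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link_chunks(chunks, min_similarity=0.3):
    """Return (source_index, relationship, target_index) triples for a lexical graph."""
    # Link each chunk to its successor to preserve the document order.
    rels = [(i, "NEXT_CHUNK", i + 1) for i in range(len(chunks) - 1)]
    vectors = [embed(c) for c in chunks]
    # Connect every pair of chunks whose similarity reaches the threshold.
    for i in range(len(chunks)):
        for j in range(i + 1, len(chunks)):
            if cosine(vectors[i], vectors[j]) >= min_similarity:
                rels.append((i, "SIMILAR", j))
    return rels
```

Calling `link_chunks(split_into_chunks(document_text))` yields the relationships you would then write to the graph. The similarity threshold is a tuning knob, and the value here is arbitrary.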
The only prerequisite is a publicly accessible Neo4j instance to store your
data, which you can create on AuraDB Free (or Aura Pro Trial with Graph
Data Science).
Neo4j's LLM Knowledge Graph Builder - DEMO
Under the Hood Blog Series
Over the next few weeks, we have a series of blog posts that explore
different aspects of the LLM Knowledge Graph Builder and explain how
they work under the hood, so you can benefit from our learnings and
apply them to your own GenAI projects.
New Features
Let’s dive into the new features. The main ones are generating
Community Summaries with new local and global retrievers that use them, as well
as the ability to run multiple retrievers in parallel for your
questions and evaluate them. You can now also guide the extraction
with custom prompt instructions.
Additionally, we have some user experience improvements to highlight.
More Models
In development mode or in self-deployment, we tested and configured
the LLM Knowledge Graph Builder with a plethora of new models, which
we also hint at in the production version:
OpenAI GPT-4o (and GPT-4o mini)
Google Gemini 1.5 and 2.0 Pro and Flash
Qwen 2.5
Amazon Nova models
Groq
Llama 3.x
Ollama models
Claude 3.5 Sonnet
DeepSeek and Microsoft Phi-4, coming soon
We also test the models internally for integration testing and to see how
well they work for extraction.
Test results for different LLM extractions with entity counts and runtimes
Community Summaries
One way you can improve the richness of the graph representing your
documents is by running graph algorithms to extract additional
information hidden in the graph structure.
Microsoft did that last year with the paper “From Local to Global: A Graph
RAG Approach to Query-Focused Summarization.” They used a hierarchical
graph clustering algorithm (Leiden) on the extracted entity domain
graph. This algorithm identifies clusters of closely related entities. An
LLM then summarizes their content into community summary nodes,
which represent the information in that set of entities and relationships.
Because the clustering is hierarchical, this can be done on multiple levels,
from very fine-grained to the highest level.
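To make the hierarchy concrete, here is a minimal Python sketch of how a tree of community summaries could be assembled once a hierarchical clustering algorithm has assigned each entity one community ID per level. It assumes the clustering result is already available. The function names are hypothetical, and `summarize` is a placeholder for the LLM call, so this illustrates the idea and is not the tool's implementation.

```python
from collections import defaultdict


def build_community_tree(entity_levels):
    """Group entities into communities per level and record parent links.

    entity_levels maps an entity name to its community IDs, ordered from the
    finest level (index 0) to the coarsest.
    """
    members = defaultdict(set)   # (level, community_id) -> entity names
    parents = {}                 # (level, community_id) -> parent key
    for entity, ids in entity_levels.items():
        for level, community_id in enumerate(ids):
            key = (level, community_id)
            members[key].add(entity)
            if level + 1 < len(ids):
                parents[key] = (level + 1, ids[level + 1])
    return members, parents


def summarize(texts):
    """Placeholder for the LLM call that condenses texts into one summary."""
    return "Summary of: " + "; ".join(sorted(texts))


def summarize_communities(members, parents, entity_descriptions):
    """Summarize leaf communities from entities, higher levels from children."""
    children = defaultdict(list)
    for child, parent in parents.items():
        children[parent].append(child)
    summaries = {}
    for key in sorted(members):  # tuples sort by level first, so leaves come first
        if key[0] == 0:
            inputs = [entity_descriptions[e] for e in members[key]]
        else:
            inputs = [summaries[c] for c in children[key]]
        summaries[key] = summarize(inputs)
    return summaries
```

Summarizing bottom-up means each higher-level summary only has to condense a handful of child summaries instead of every entity description beneath it.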
A more detailed blog post on evaluation will be available later in the series.
The extracted communities are visible in the graph visualization of your
documents, so you can inspect them and their texts, and see which
entities they are summarizing.
Graph visualization with entities and summarized communities
Those community summaries are then used in a global retriever to
answer general questions about the documents, which aren’t specific to
any one entity but identify topics across documents. So instead of the
vertical sequence of content of each document, they represent themes
spanning individual documents.
Clustering topics across documents with graph algorithms
In LLM Knowledge Graph Builder, we implemented it using the same
clustering algorithm. If you connect to a Neo4j instance that has
graph data science enabled (AuraPro with GDS, AuraDS, Neo4j Sandbox,
or self-hosted, which we show as ⚛ on top of the app) and enable
Community Summarization under Graph Enhancements > Post
Processing, we run the algorithm and create the tree of summaries.
Post Processing Jobs
In our global community retriever, we took a slightly different approach
from the paper, which just stuffs all the community summaries (of a
certain level) into several LLM prompts to answer a question. Instead, we
generate vector embeddings of the community summaries and use
similarity and full-text search to find the most relevant ones for the
question, an approach Microsoft Research recently also proposed, and use
only those to answer the question.
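The selection step can be sketched as follows. This is a simplified illustration under stated assumptions: the embeddings are plain Python lists supplied by the caller, the keyword score is a toy stand-in for a full-text index, and combining the two scores with `max` is our choice for the example, not necessarily how the tool ranks results.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(question, text):
    """Fraction of question words that appear in the text (toy full-text score)."""
    q_words = set(question.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / len(q_words) if q_words else 0.0


def top_communities(question, question_vector, communities, k=2):
    """Rank community summaries by the better of vector and keyword score."""
    scored = []
    for community in communities:
        score = max(
            cosine(question_vector, community["embedding"]),
            keyword_score(question, community["summary"]),
        )
        scored.append((score, community["id"]))
    scored.sort(reverse=True)
    return [community_id for _, community_id in scored[:k]]
```

Only the top `k` summaries are passed to the LLM, which keeps the prompt small compared with sending every summary of a level.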
Since we can now run multiple retrievers in parallel, you can see the
difference for yourself. Check out the global and local entity retrievers
comparison.
Global Community Retriever
Both retrievers, like all the others, allow showing the retrieved contextual
graph data (communities, entities, chunks) that went into generating the
answer, supporting explainability.
Global Community Retriever retrieval details
Local Entity Retriever
The local entity retriever turns what is several thousand lines of Python
code in the Microsoft paper implementation into about 50 lines of Cypher
(one benefit of using an actual graph database), fetching the following:
Entities with hybrid search (embedding and full-text)
The relationships between them
The most common relationships to entities outside the initial set
The chunks and documents from where the entities were extracted
The community summaries the entities are part of
// previous hybrid search on entities, then graph expansion
WITH collect(node) AS nodes,
     avg(score) AS score,
     collect({id: elementId(node), score: score}) AS metadata
RETURN score, nodes, metadata,
  // the 3 chunks that mention the most matched entities
  collect {
    UNWIND nodes AS n
    MATCH (n)<-[:HAS_ENTITY]-(c:Chunk)
    WITH c, count(distinct n) AS freq
    RETURN c
    ORDER BY freq DESC
    LIMIT 3
  } AS chunks,
  // up to 3 communities the matched entities belong to
  collect {
    UNWIND nodes AS n
    OPTIONAL MATCH (n)-[:IN_COMMUNITY]->(c:__Community__)
    WITH c, c.community_rank AS rank, c.weight AS weight
    RETURN c
    ORDER BY rank, weight DESC
    LIMIT 3
  } AS communities,
  // relationships between the matched entities
  collect {
    UNWIND nodes AS n
    UNWIND nodes AS m
    MATCH (n)-[r]->(m)
    RETURN DISTINCT r
  } AS rels,
  // the 10 most frequently connected entities outside the initial set
  collect {
    UNWIND nodes AS n
    MATCH path = (n)-[r]-(m:__Entity__)
    WHERE NOT m IN nodes
    WITH m, collect(distinct r) AS rels, count(*) AS freq
    ORDER BY freq DESC
    LIMIT 10
    WITH collect(m) AS outsideNodes, apoc.coll.flatten(collect(rels)) AS rels
    RETURN { nodes: outsideNodes, rels: rels }
  } AS outside
Local entity retriever
For the entity retriever, we can show not only text chunks but also the
entities and their retrieved relationships.
Local entity retriever result details
Community summaries used in the local entity retriever
Multi-Retrievers
As shown in the previous section, you can now select one or more
retrievers to run in parallel to generate an answer to your question and
switch between the answer results directly to compare.
Vector: Utilizes vector indexing on text chunks to enable semantic similarity search.
Fulltext: Employs a full-text index on text chunks for rapid keyword-based search, efficiently identifying documents containing specific words or phrases.
Graph + Vector + Fulltext: Merges vector indexing, graph connections, and full-text indexing for a comprehensive search approach, combining semantic similarity, contextual relevance, and keyword-based search for optimal results.
Entity search + Vector (selected): Combines entity node vector indexing with graph connections for accurate entity-based search, providing the most relevant response.
Global search + Vector + Fulltext (selected): Uses vector and full-text indexing on community nodes to provide accurate, context-aware answers globally.
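Running several retrievers side by side for one question can be sketched with Python's standard thread pool. The retriever functions below are hypothetical stand-ins, since real ones would query Neo4j and call an LLM, and `run_retrievers` is a name made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_retrievers(question, retrievers):
    """Run every retriever on the same question in parallel.

    retrievers maps a mode name to a callable that takes the question and
    returns an answer. Returns a dict of mode name -> answer.
    """
    with ThreadPoolExecutor(max_workers=len(retrievers) or 1) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in retrievers.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for real retrievers.
def vector_retriever(question):
    return f"[vector] answer to: {question}"


def global_retriever(question):
    return f"[global] answer to: {question}"
```

Threads fit this workload because each retriever spends most of its time waiting on the database and the LLM, so the answers arrive in roughly the time of the slowest retriever.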
In the Details link after each answer, the retriever also provides the
contextual facts retrieved from the database, which are then sent to the
LLM, as well as additional information about the model, runtime, and
token counts.
In development mode or in self-hosted deployments, even more retrievers are
available for you to test and compare.
To make this easier, the narrow right conversation sidebar can be
maximized and even popped out into a full frame, which even becomes
shareable. This is especially useful with read-only database connection
setups, which we now also support for sharing your generated knowledge
graph.
The underlying data of your conversation can be downloaded as a JSON
file, so you can process it according to your own needs.
Retriever Evaluation
One reason for running retrievers in parallel was to add the ability to
generate evaluation metrics.
We are using the RAGAs framework to run the evaluation. Currently, we
compute the following metrics, some of which you need to provide a
ground truth for:
Relevancy — How well the answer addresses the user’s question
Faithfulness — How accurately the answer reflects the provided
information
Context relevance — Determines the recall of entities present in
generated answer and retrieved contexts
Semantic relevance — How well the generated answer understands
the meaning of the reference answer
ROUGE — Similarity to ground-truth answer word by word
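To give a feel for the word-by-word comparison behind ROUGE, here is a simplified ROUGE-1 sketch in Python. The tool computes its metrics through the RAGAs framework, so this is only an illustration. It assumes whitespace tokenization and lowercasing, with no stemming, and the function name is ours.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return unigram precision, recall, and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per word (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

A short answer that only uses words from the ground truth scores high on precision but low on recall, which is why both are usually reported together.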
Retriever evaluation
There will be a more detailed blog post on evaluation later in the series.
Guided Instructions for Extraction
In the latest version, we added the ability to guide the extraction more by
allowing users to pass additional prompts to the LLM for extracting
entities. So you can force it to only focus on certain parts of the
document, on certain themes, or use specific additional instructions.
Provide additional extraction instructions
I tested it by extracting entities and relationships from several articles on
Albert Einstein’s work but directing the LLM not to extract anything
related to his physics work. And there it was — life, people, awards, peace
activities, and other inventions, but not much on his tremendous
contributions in physics.
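Conceptually, guided extraction just means the user's text is added to the prompt sent to the LLM. The sketch below shows one way that could look. The base prompt wording, the function name, and the prompt layout are assumptions for illustration, not the prompts the tool actually uses.

```python
# Hypothetical base prompt; the tool's real prompt differs.
BASE_PROMPT = (
    "Extract entities and relationships from the text below.\n"
    "Return them as (source, relationship, target) triples."
)


def build_extraction_prompt(chunk_text, additional_instructions=""):
    """Combine the base prompt, optional user guidance, and the chunk text."""
    parts = [BASE_PROMPT]
    # Only add the guidance section when the user supplied instructions.
    if additional_instructions.strip():
        parts.append("Additional instructions: " + additional_instructions.strip())
    parts.append("Text:\n" + chunk_text)
    return "\n\n".join(parts)
```

For the Einstein test described above, the additional instructions would be something like "Do not extract anything related to physics."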
User Experience Improvements
A quick list of the improvements:
Allow read-only database access, then only retrieval is possible
Pop out the chat experience into a separate window
Improve graph visualization with local search and highlighting
Graph Consolidation
An experimental feature is automatic graph consolidation, intended for
users who just quickly want to see an extracted knowledge graph from
their data but don’t want to specify a graph schema upfront.
In those cases, the LLM often generates a load of entity types and
relationships — counting into the thousands if you give it free rein. Our
retrievers don’t mind because they use the graph topology, not the actual
type, to traverse the graph (although they collect them together with the
textual information).
That’s why we recommend providing a graph schema upfront for a more
semantically constrained knowledge graph. But in cases where this didn’t
happen, we can use an LLM to categorize a sorted list of node labels and
relationship types into a smaller, more general set. Since we’re not 100
percent happy with the reduction, we didn’t enable it by default, but we
would love your feedback. You can find it as one of the post-processing
jobs in the graph enhancements.
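The consolidation idea can be sketched in two steps: collect the labels sorted by frequency, then apply a mapping from specific to general labels. In the tool, an LLM proposes that mapping. In this illustrative sketch, it is simply passed in as a dictionary, and all names are hypothetical.

```python
from collections import Counter


def sorted_labels(nodes):
    """Return distinct labels ordered by frequency, most common first."""
    counts = Counter(node["label"] for node in nodes)
    # Break frequency ties alphabetically so the order is deterministic.
    return [label for label, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def consolidate(nodes, mapping):
    """Relabel nodes using a mapping of specific label -> general label.

    Labels missing from the mapping are kept unchanged.
    """
    return [{**node, "label": mapping.get(node["label"], node["label"])} for node in nodes]
```

Keeping unmapped labels unchanged makes the step safe to apply even when the proposed mapping covers only part of the label set.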
Summary
Building an open source tool like this is a fulfilling experience —
especially when we get as much feedback as we’ve gotten so far. We’ve
addressed more than 400 GitHub issues, including internal planned tasks,
and gotten more than 2,800 GitHub stars.
If you haven’t yet, please try it out and let us know what you think in the
comments. We’d also be excited if you wrote about your experiences
using the tool with sets of documents from different domains.
Please share if you have questions or feedback. Also, please give us a star
if you like the project.
Happy building!