You have built a RAG system. It works 80% of the time. It answers "What is Product Alpha?" perfectly.
But then your CEO asks: "Which of our suppliers for Project Beta are also potential competitors in the EMEA market?"
Your vector-only pipeline fails, and not just a little. The retriever returns a document about "Project Beta" and a document about "EMEA," but it misses the hidden connection that ties them together, so the LLM hallucinates an answer to fill the gap.
This is the limit of Vector RAG. Vectors measure similarity. Knowledge Graphs measure connectedness. To build a true reasoning engine, you need both.
1. The "Two-Hop" Curse
Multi-hop questions are a well-known weak spot of similarity-only retrieval; the failure is sometimes called the "Two-Hop Curse."
Imagine three facts in three different PDF documents:
- Fact A: "Acme Corp supplies lithium batteries."
- Fact B: "Lithium batteries are classified as Class 9 Hazardous Materials."
- Fact C: "Class 9 materials require Permit X-99 to ship."
Query: "What shipping permit do I need for Acme Corp products?"
Vector search is likely to fail here. The "Acme Corp" part of the query matches Fact A, and "shipping permit" matches Fact C. But Fact B, the bridge, shares almost nothing with the query, so it scores low and typically falls outside the top-k results. The chain is broken.
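To make the gap concrete, here is a toy ranking of the three facts against the query. It is only a sketch: a bag-of-words cosine score stands in for a real embedding model (an assumption for illustration; real embeddings would give Fact B a small non-zero score rather than exactly zero), and the helper names are ours.

```python
import math
import re
from collections import Counter

FACTS = {
    "A": "Acme Corp supplies lithium batteries.",
    "B": "Lithium batteries are classified as Class 9 Hazardous Materials.",
    "C": "Class 9 materials require Permit X-99 to ship.",
}
QUERY = "What shipping permit do I need for Acme Corp products?"


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, facts, k=2):
    """Return the ids of the k facts most similar to the query."""
    q = bag_of_words(query)
    scored = sorted(
        ((cosine(q, bag_of_words(text)), fact_id) for fact_id, text in facts.items()),
        reverse=True,
    )
    return [fact_id for _, fact_id in scored[:k]]


retrieved = top_k(QUERY, FACTS)
bridge_score = cosine(bag_of_words(QUERY), bag_of_words(FACTS["B"]))
print(retrieved, bridge_score)
```

With `k=2`, Facts A and C come back and Fact B is left out, because it shares no tokens with the query. Raising `k` only hides the problem on a three-document corpus; on a real corpus the bridge fact competes with thousands of closer matches.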
2. GraphRAG: Restoring the Context
GraphRAG (graph-based Retrieval-Augmented Generation) solves this by precomputing the connections at ingestion time, before any query arrives.
We use an LLM during the ingestion phase to extract triples (Subject -> Predicate -> Object):
(Acme Corp) -[:SUPPLIES]-> (Lithium Batteries)
(Lithium Batteries) -[:IS_A]-> (Class 9 Hazmat)
(Class 9 Hazmat) -[:REQUIRES]-> (Permit X-99)
Now the database knows that Acme Corp is only three hops away from Permit X-99. It doesn't need semantic similarity; it follows the path.
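"Following the path" can be sketched without a graph database at all. The snippet below is a minimal illustration, not how Neo4j works internally: it stores the triples in a plain Python list (including the Acme Corp triple implied by Fact A) and uses breadth-first search to chain them. The function names are ours.

```python
from collections import deque

TRIPLES = [
    ("Acme Corp", "SUPPLIES", "Lithium Batteries"),
    ("Lithium Batteries", "IS_A", "Class 9 Hazmat"),
    ("Class 9 Hazmat", "REQUIRES", "Permit X-99"),
]


def build_adjacency(triples):
    """Index triples by subject so outgoing edges can be looked up quickly."""
    adjacency = {}
    for subject, predicate, obj in triples:
        adjacency.setdefault(subject, []).append((predicate, obj))
    return adjacency


def find_path(triples, start, goal):
    """Breadth-first search over directed edges.

    Returns the shortest chain of triples from start to goal, or None.
    """
    adjacency = build_adjacency(triples)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, predicate, neighbor)]))
    return None


path = find_path(TRIPLES, "Acme Corp", "Permit X-99")
for subject, predicate, obj in path:
    print(f"({subject}) -[:{predicate}]-> ({obj})")
```

The search never looks at the wording of the query. Once the entry point ("Acme Corp") is known, the bridge fact is reached because it is connected, not because it is similar.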
3. Architecture: The Graph Sidecar
We implemented a "Graph Sidecar" architecture for a Logistics Client. We didn't replace their Vector DB (Pinecone); we augmented it with Neo4j.
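The sidecar idea can be sketched as "vector search finds the entry points, the graph expands them." The code below is a simplified illustration under stated assumptions: `vector_search` and `graph_neighbors` are hypothetical callables standing in for the Pinecone lookup and the Neo4j traversal, and both are stubbed with in-memory data so the example runs on its own.

```python
def hybrid_retrieve(query, vector_search, graph_neighbors, max_hops=2):
    """Seed with entities from vector search, then expand outward in the graph.

    vector_search(query) -> list of entity names (hypothetical interface)
    graph_neighbors(entity) -> list of (predicate, object) pairs (hypothetical)
    """
    seeds = vector_search(query)
    seen = set(seeds)
    frontier = list(seeds)
    facts = []
    for _ in range(max_hops):
        next_frontier = []
        for entity in frontier:
            for predicate, obj in graph_neighbors(entity):
                facts.append((entity, predicate, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


# In-memory stand-ins for the two stores (illustrative data only).
GRAPH = {
    "Acme Corp": [("SUPPLIES", "Lithium Batteries")],
    "Lithium Batteries": [("IS_A", "Class 9 Hazmat")],
    "Class 9 Hazmat": [("REQUIRES", "Permit X-99")],
}


def stub_vector_search(query):
    # Pretend the top-ranked chunk mentioned this entity.
    return ["Acme Corp"]


def stub_graph_neighbors(entity):
    return GRAPH.get(entity, [])


context = hybrid_retrieve(
    "What shipping permit do I need for Acme Corp products?",
    stub_vector_search,
    stub_graph_neighbors,
    max_hops=3,
)
print(context)
```

The collected facts are what you would hand to the LLM as context alongside the vector hits. `max_hops` is the knob to watch: too low and the chain stops short of the answer, too high and the context fills with loosely related facts.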
The Extraction Pipeline
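On the ingestion side, a pipeline of this kind might look roughly like the sketch below. Everything here is an assumption for illustration: the prompt wording, the pipe-delimited output format, the generic `:Entity` label, and the `llm` callable (any function that takes a prompt string and returns text). A fake LLM is used so the example runs offline.

```python
import re

# A relationship type is not an ordinary query parameter, so predicates are
# checked against an allowlist before being interpolated into the statement.
ALLOWED_PREDICATES = {"SUPPLIES", "IS_A", "REQUIRES"}

PROMPT_TEMPLATE = (
    "Extract factual triples from the text below. "
    "Return one per line as: subject | PREDICATE | object\n\n{text}"
)


def parse_triples(llm_output):
    """Keep only well-formed lines whose predicate is on the allowlist."""
    triples = []
    for line in llm_output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3 or not all(parts):
            continue
        subject, predicate, obj = parts
        predicate = re.sub(r"\W+", "_", predicate.upper())
        if predicate in ALLOWED_PREDICATES:
            triples.append((subject, predicate, obj))
    return triples


def to_cypher(triple):
    """Build an idempotent MERGE statement plus its parameters."""
    subject, predicate, obj = triple
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:{predicate}]->(o)"
    )
    return query, {"subject": subject, "object": obj}


def extract(text, llm):
    """Ask the LLM for triples and parse whatever comes back defensively."""
    return parse_triples(llm(PROMPT_TEMPLATE.format(text=text)))


def fake_llm(prompt):
    # Stand-in for a real model call; includes noise to exercise the parser.
    return (
        "Acme Corp | supplies | Lithium Batteries\n"
        "not a triple\n"
        "Acme Corp | LOVES | Batteries"
    )


triples = extract("Acme Corp supplies lithium batteries.", fake_llm)
statements = [to_cypher(triple) for triple in triples]
print(statements)
```

Using `MERGE` rather than `CREATE` means re-ingesting the same document does not duplicate nodes or edges. The allowlist also keeps the LLM from inventing a new relationship type on every run, which would make query-time patterns impossible to write.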
// Cypher Query: The "Multi-Hop" Retrieval
// "Find all shipping requirements for Supplier X's products"
MATCH path = (s:Supplier {name: "Acme Corp"})-[:SUPPLIES]->(p:Product)-[:IS_A]->(cat:Category)-[:REQUIRES]->(req:Requirement)
RETURN s.name, p.name, req.code, req.description
LIMIT 5;
// Output:
// Acme Corp | Li-Ion Battery | Permit X-99 | "Must be stored at < 20C"
When to use Graph vs Vector?
This is the most common question we get. The answer is simple:
- Use Vectors for Breadth: "Find me documents about diversity policies." (Unstructured search)
- Use Graphs for Depth: "How does the diversity policy affect the hiring manager approved list?" (Structured Logic)
4. The "Entity Resolution" Superpower
Another massive advantage of Graphs is Entity Resolution.
In Document A, he is "Mr. Smith." In Document B, he is "John Smith." In Document C, he is "J. Smith, VP of Sales."
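A Vector DB sees three different strings; their embeddings may land close together, but nothing records that they are the same person. A Knowledge Graph, with entity resolution applied at ingestion, unifies them into a single Node: `(Person: John Smith)` with aliases. When you ask about "J. Smith," the Graph also retrieves everything known about "Mr. Smith."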
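The alias mechanism can be sketched in a few lines. This is a deliberately simple illustration: the alias table is hand-registered, and the `EntityResolver` class and its normalization rule (drop anything after a comma, lowercase) are our own assumptions. Deciding that "Mr. Smith" really is John Smith is the hard part in practice and is not shown here.

```python
class EntityResolver:
    """Maps surface mentions to one canonical node and pools their facts."""

    def __init__(self):
        self._canonical_by_alias = {}
        self._facts = {}

    @staticmethod
    def _normalize(mention):
        # Drop a trailing role such as ", VP of Sales", then lowercase
        # and collapse whitespace.
        return " ".join(mention.split(",")[0].lower().split())

    def register(self, canonical, aliases=()):
        for name in (canonical, *aliases):
            self._canonical_by_alias[self._normalize(name)] = canonical
        self._facts.setdefault(canonical, [])

    def resolve(self, mention):
        """Return the canonical name, or None if the mention is unknown."""
        return self._canonical_by_alias.get(self._normalize(mention))

    def add_fact(self, mention, fact):
        canonical = self.resolve(mention)
        if canonical is None:
            raise KeyError(f"unknown entity: {mention!r}")
        self._facts[canonical].append(fact)

    def facts_about(self, mention):
        return list(self._facts.get(self.resolve(mention), []))


resolver = EntityResolver()
resolver.register("John Smith", aliases=["Mr. Smith", "J. Smith"])
resolver.add_fact("Mr. Smith", "mentioned in Document A")
resolver.add_fact("J. Smith, VP of Sales", "VP of Sales (Document C)")

print(resolver.facts_about("J. Smith"))
```

Facts attached through any alias land on the same canonical node, so a question phrased with one spelling sees what was ingested under another.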
Stop Guessing. Start Connecting.
We architect high-performance "Hybrid RAG" systems connecting Neo4j and Vector Stores.