Enterprise data hub connecting 14 sources with intelligent query routing using CAG, RAG, and GraphRAG technologies managed by AI agents based on query complexity.

Python · LangChain · Neo4j · PostgreSQL · Pinecone · Claude · Apache Kafka · FastAPI · Redis · dbt
A mid-sized wholesale distribution company with operations across three countries came to us in late 2024. They had data scattered across 14 different systems: SAP ERP, Salesforce CRM, Stripe billing, warehouse management, shipping carriers, supplier portals, and various spreadsheets. Getting answers to business questions required manual exports, SQL expertise, and hours of cross-referencing data between systems.
We built an intelligent data platform that unifies all their information and lets anyone in the company ask questions in natural language. The system uses three retrieval architectures, cache-augmented generation (CAG), retrieval-augmented generation (RAG), and GraphRAG, with AI agents that automatically route each query to the most suitable approach based on its complexity. A sales rep can ask 'Which customers who usually order monthly haven't ordered in 60 days?' and get an answer in seconds, not hours.
The project was delivered in four stages over six months, building the data foundation first, then implementing progressively sophisticated retrieval methods.
| Stage | Focus Area | Status | Key Deliverables |
|---|---|---|---|
| 1 | Data Infrastructure | Completed | PostgreSQL warehouse, Kafka streaming, CDC pipelines for ERP/CRM/billing, data normalization, identity resolution across systems |
| 2 | CAG Implementation | Completed | Redis caching layer, frequently-accessed data preloading, context window optimization, sub-100ms response for common queries |
| 3 | RAG & GraphRAG | Completed | Pinecone vector store, Neo4j knowledge graph, entity extraction, relationship mapping, semantic search across documents |
| 4 | Agent Orchestration | Completed | Query complexity classifier, intelligent routing system, response synthesis, natural language interface, feedback learning loop |
Before implementing any AI retrieval system, we needed clean, unified data. The company's 14 data sources used different identifiers, naming conventions, and update frequencies. A customer might be 'Acme Industries' in SAP, 'Acme Industries Inc.' in Salesforce, and just an email address in Stripe.
We deployed Apache Kafka for real-time data streaming and implemented Change Data Capture (CDC) on their core systems. When a sales rep updates a customer record in Salesforce, that change flows through Kafka to our PostgreSQL warehouse within seconds. The data transformation layer handles normalization, deduplication, and identity resolution.
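As an illustration of the warehouse side of this flow, here is a minimal sketch of applying change events idempotently. It is not the production pipeline: the `ChangeEvent` shape, the `apply_event` function, and the in-memory `dict` standing in for a PostgreSQL table are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ChangeEvent:
    """Hypothetical shape of one CDC message read from a Kafka topic."""
    source: str   # originating system, e.g. "salesforce"
    op: str       # "upsert" or "delete"
    key: str      # primary key of the record in the source system
    row: dict     # new column values (empty for deletes)
    ts: int       # source commit timestamp, used for ordering


def apply_event(table: dict, event: ChangeEvent) -> None:
    """Apply one event to an in-memory stand-in for a warehouse table.

    Events that are older than (or equal to) what is already stored are
    ignored, so replays and out-of-order delivery do not corrupt state.
    """
    pk = (event.source, event.key)
    current: Any = table.get(pk)
    if current is not None and current["_ts"] >= event.ts:
        return  # duplicate or stale event
    if event.op == "delete":
        # Keep a tombstone so a late, older upsert cannot resurrect the row.
        table[pk] = {"_ts": event.ts, "_deleted": True}
    else:
        table[pk] = {**event.row, "_ts": event.ts, "_deleted": False}
```

Keeping the source timestamp on every row is one simple way to make replays safe; a real warehouse would typically express the same rule as a conditional upsert.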
The identity resolution system matches records across systems using multiple signals: tax IDs, email domains, phone numbers, addresses, and fuzzy name matching. After initial training on 2,400 manually verified matches, the system now automatically resolves 98% of cross-system records correctly. This unified customer view became the foundation for all three retrieval architectures.
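To make the multi-signal idea concrete, the sketch below scores a pair of records using a tax ID match, an email domain match, and fuzzy name similarity. The field names, the suffix list, and the 0.7/0.3 weights are assumptions chosen for illustration; the production system learned its matching from the verified examples rather than using hand-set weights.

```python
import re
from difflib import SequenceMatcher

# Assumed list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def match_score(a: dict, b: dict) -> float:
    """Return a score in [0, 1]; higher means more likely the same entity."""
    # An identical tax ID is treated as a definitive match.
    if a.get("tax_id") and a.get("tax_id") == b.get("tax_id"):
        return 1.0
    name_a = normalize_name(a.get("name", ""))
    name_b = normalize_name(b.get("name", ""))
    # Two empty names carry no evidence, so they score zero.
    name_sim = SequenceMatcher(None, name_a, name_b).ratio() if name_a and name_b else 0.0
    same_domain = bool(a.get("email_domain")) and a.get("email_domain") == b.get("email_domain")
    # Illustrative weights, not tuned values.
    return 0.7 * name_sim + 0.3 * (1.0 if same_domain else 0.0)
```

With these rules, 'Acme Industries' and 'Acme Industries Inc.' normalize to the same string, while a record that carries only an email address can still contribute evidence through its domain.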
We also extracted and structured unstructured data: customer emails, support tickets, contract PDFs, and order notes. These documents were processed, chunked, and prepared for embedding. The resulting data warehouse contains 47 million structured records and 2.3 million document chunks, all queryable through natural language.
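The chunking step can be sketched as a sliding window over words with some overlap between neighbouring chunks. The function name and the default sizes below are assumptions for the example; the source does not state the chunk size or the splitting strategy actually used.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list:
    """Split text into word windows of at most max_words, sharing
    `overlap` words between consecutive chunks."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both chunks, at the cost of storing some words twice.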
CAG works by preloading relevant data directly into the LLM's context window, so no retrieval step runs at query time. For frequently asked questions and stable reference data, this approach delivers sub-100ms responses and avoids retrieval misses, because the relevant information is already in context rather than depending on a search that could fail to surface it.
We identified the data that employees query most frequently: product catalogs, pricing tiers, customer credit limits, inventory levels, and standard policies. This reference data changes infrequently but gets queried constantly. We built a Redis caching layer that maintains current snapshots of this information, formatted and ready to inject into the LLM context.
The CAG system handles queries like 'What's the current price for SKU-4521?' or 'What's the credit limit for Acme Industries?' by loading the relevant data slice into Claude's context window along with the question. Response time averages 80ms because there's no vector search or database query during inference.
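A rough sketch of the "load the data slice plus the question" step is shown below. A plain `dict` stands in for the Redis snapshot, and the key names, prompt wording, and sample values are invented for the example; the call to the model itself is left out.

```python
import json


def build_cag_prompt(question: str, snapshot: dict, keys: list) -> str:
    """Assemble a prompt from preloaded reference data and the question.

    `snapshot` stands in for the cached reference data; `keys` selects the
    slice that is relevant to this question.
    """
    context = {k: snapshot[k] for k in keys if k in snapshot}
    reference = json.dumps(context, indent=2, sort_keys=True)
    return (
        "Answer using only the reference data below.\n"
        f"<reference_data>\n{reference}\n</reference_data>\n"
        f"Question: {question}"
    )


# Made-up sample data, for illustration only.
snapshot = {
    "price:SKU-4521": {"unit_price": 19.5, "currency": "EUR"},
    "credit_limit:acme-industries": {"limit": 50000, "currency": "EUR"},
}
prompt = build_cag_prompt(
    "What's the current price for SKU-4521?", snapshot, ["price:SKU-4521"]
)
```

Only the selected slice enters the prompt, which keeps the context small even when the cached snapshot is large.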
We implemented smart cache invalidation that updates the preloaded data when source systems change. When someone updates a price in SAP, the corresponding cache entry refreshes within seconds. The system also tracks which cached data gets used most frequently and optimizes context window allocation accordingly.
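The two behaviours described here, refreshing an entry when its source changes and tracking which entries are read most, can be sketched with a small in-memory class. `SnapshotCache` and its method names are hypothetical, and a production version would sit on Redis rather than Python dictionaries.

```python
class SnapshotCache:
    """In-memory stand-in for the cached reference data."""

    def __init__(self) -> None:
        self._data = {}
        self._hits = {}

    def on_source_change(self, key: str, new_value) -> None:
        """Called when a change event arrives; overwrites the stale entry."""
        self._data[key] = new_value

    def get(self, key: str):
        """Return the cached value (or None) and count successful reads."""
        if key in self._data:
            self._hits[key] = self._hits.get(key, 0) + 1
        return self._data.get(key)

    def hottest(self, n: int) -> list:
        """Keys ordered by read count, most-read first."""
        return sorted(self._hits, key=self._hits.get, reverse=True)[:n]
```

A read-count ranking like `hottest` is one simple input for deciding which entries deserve space in the context window.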
Standard RAG handles queries that need to search across large document collections. We embedded all customer communications, contracts, support tickets, and internal notes using OpenAI's embedding model and stored them in Pinecone. When someone asks 'What did we discuss with Acme about the delayed shipment last month?', the system retrieves the most semantically relevant documents and synthesizes an answer.
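The retrieval step boils down to ranking stored chunks by similarity to the query embedding. The sketch below does this with cosine similarity over tiny hand-written vectors; in the real system the vectors come from the embedding model and the ranking is performed by the vector store, so the function names and sample data here are purely illustrative.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list, chunks: list, k: int = 3) -> list:
    """Return the k (score, text) pairs most similar to the query vector.

    `chunks` is a list of (text, vector) pairs.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy two-dimensional "embeddings", invented for the example.
chunks = [
    ("email about the delayed shipment", [1.0, 0.0]),
    ("pricing clause in a contract", [0.0, 1.0]),
    ("carrier note on a late delivery", [0.8, 0.6]),
]
results = top_k([1.0, 0.1], chunks, k=2)
```

The retrieved texts are then passed to the LLM as context for the synthesized answer.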
But RAG struggles with questions that require understanding relationships between entities. 'Which suppliers provide components for products that Acme orders regularly?' requires traversing connections between customers, orders, products, components, and suppliers. This is where GraphRAG becomes essential.
We built a Neo4j knowledge graph that models the relationships in their business: customers connect to orders, orders contain products, products require components, components come from suppliers, suppliers have contracts, and so on. The graph contains 12 million nodes and 89 million relationships extracted from the unified data warehouse.
GraphRAG queries first traverse the knowledge graph to identify relevant entities and their connections, then use those structured relationships to ground the LLM's response. For complex analytical questions, this produces markedly more accurate answers than vector similarity search alone. The graph also enables 'reasoning' queries like 'If Supplier X can't deliver, which customers would be affected?' that pure RAG cannot handle.
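The supplier-impact question is a reachability query: start at the supplier and follow edges until customers are reached. The sketch below runs a breadth-first search over a small adjacency dictionary. The node labels and sample edges are invented, and in the production system this traversal runs inside Neo4j rather than in Python.

```python
from collections import deque

# Toy graph: supplier -> component -> product -> order -> customer.
EDGES = {
    "supplier:X": ["component:C1"],
    "component:C1": ["product:P1", "product:P2"],
    "product:P1": ["order:1001"],
    "product:P2": ["order:1002"],
    "product:P3": ["order:1003"],
    "order:1001": ["customer:acme"],
    "order:1002": ["customer:globex"],
    "order:1003": ["customer:initech"],
}


def affected_customers(graph: dict, supplier: str) -> list:
    """Breadth-first search from a supplier; returns reachable customers."""
    seen = {supplier}
    queue = deque([supplier])
    customers = set()
    while queue:
        node = queue.popleft()
        if node.startswith("customer:"):
            customers.add(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(customers)
```

Here `affected_customers(EDGES, "supplier:X")` reaches the two customers whose orders depend on component C1 and leaves out the customer who only orders product P3. The traversal result, not raw text similarity, is what grounds the LLM's answer.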
The real innovation is the routing layer that decides which retrieval method to use. We built a query classifier agent that analyzes incoming questions and routes them optimally. Simple factual lookups go to CAG. Semantic searches across documents go to RAG. Complex relational queries go to GraphRAG. Some queries use multiple methods in combination.
The classifier considers several signals: query structure, entity types mentioned, whether the question involves relationships or comparisons, temporal aspects, and historical patterns of similar queries. It's not a simple keyword match; the agent understands intent. 'Tell me about Acme' routes to CAG for basic company info, but 'Tell me everything about our relationship with Acme' triggers GraphRAG to explore all connected entities.
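The routing structure can be sketched as a classifier that returns a route plus a dispatch table of handlers. To keep the example runnable, `toy_classifier` below uses crude keyword checks as a placeholder; that is explicitly not how the production classifier works, since it is an LLM-based agent that weighs the signals listed above. All names here are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict, Tuple


class Route(Enum):
    CAG = "cag"
    RAG = "rag"
    GRAPHRAG = "graphrag"


def toy_classifier(question: str) -> Route:
    """Placeholder heuristic standing in for the real intent classifier."""
    q = question.lower()
    if any(w in q for w in ("relationship", "affected", "suppliers", "everything")):
        return Route.GRAPHRAG
    if any(w in q for w in ("discuss", "email", "ticket", "contract")):
        return Route.RAG
    return Route.CAG


def route_query(
    question: str,
    classify: Callable[[str], Route],
    handlers: Dict[Route, Callable[[str], str]],
) -> Tuple[Route, str]:
    """Classify the question, then dispatch to the matching retrieval handler."""
    route = classify(question)
    return route, handlers[route](question)


# Stub handlers; real ones would call the CAG, RAG, and GraphRAG pipelines.
handlers = {route: (lambda q, r=route: f"[{r.value}] {q}") for route in Route}
```

Because the classifier is passed in as a function, the placeholder can be swapped for a model-backed one without touching the dispatch code.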
We also implemented a synthesis agent that combines results when queries span multiple retrieval methods. A question like 'Summarize our top 10 customers' recent issues and which suppliers are involved' might pull customer rankings from CAG, support tickets from RAG, and supplier relationships from GraphRAG, then synthesize a coherent answer.
The system includes a feedback loop where users can rate response quality. Low-rated responses get logged for review, and patterns in failures inform improvements to the routing logic. Over three months, the routing accuracy improved from 84% to 96% through this continuous learning process.
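A minimal sketch of the review step: collect low-rated responses and count them per route to see where failures cluster. The record fields, the 1 to 5 rating scale, and the threshold of 2 are assumptions made for the example.

```python
from collections import Counter


def review_queue(feedback: list, threshold: int = 2) -> list:
    """Responses rated at or below the threshold, kept for manual review."""
    return [f for f in feedback if f["rating"] <= threshold]


def failures_by_route(feedback: list, threshold: int = 2) -> Counter:
    """Count low-rated responses per retrieval route."""
    return Counter(f["route"] for f in review_queue(feedback, threshold))


# Invented sample records, for illustration only.
feedback = [
    {"route": "rag", "rating": 1, "question": "..."},
    {"route": "rag", "rating": 5, "question": "..."},
    {"route": "graphrag", "rating": 2, "question": "..."},
    {"route": "cag", "rating": 4, "question": "..."},
]
```

A route that accumulates a disproportionate share of low ratings is a natural first place to look for misrouted queries.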
Average query response time is 1.8 seconds for complex questions that previously required hours of manual research. Simple lookups through CAG respond in under 100 milliseconds. The natural language interface means anyone can query the system without SQL knowledge or understanding which source system holds the data.
The finance team's monthly close process shrank from 8 days to 2 days. Previously, reconciling data across systems required extensive manual work. Now they ask questions like 'Show me all invoices that don't match their corresponding orders' and get instant answers with links to the specific discrepancies.
Sales discovered $340,000 in at-risk revenue within the first month. Queries that were impossible before, like identifying customers whose order frequency dropped year-over-year while their industry peers increased orders, now take seconds. The GraphRAG architecture made competitive analysis possible by connecting customer behavior to market segment patterns.
The platform processes an average of 2,400 natural language queries per day across 180 active users. The three-tier retrieval architecture means compute costs stay manageable: 71% of queries resolve through CAG (cheapest), 22% through RAG (moderate), and only 7% require full GraphRAG traversal (most expensive). The intelligent routing optimizes both response quality and infrastructure costs.
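For a sense of scale, the daily volume and routing shares quoted above translate into per-route query counts as follows. The variable names are illustrative; only the 2,400 queries, 180 users, and the three percentages come from the figures above.

```python
DAILY_QUERIES = 2400
ACTIVE_USERS = 180
SHARE = {"CAG": 0.71, "RAG": 0.22, "GraphRAG": 0.07}

# Approximate number of queries handled by each route per day.
per_route = {route: round(DAILY_QUERIES * share) for route, share in SHARE.items()}

# Average queries per active user per day.
per_user = DAILY_QUERIES / ACTIVE_USERS

print(per_route)           # {'CAG': 1704, 'RAG': 528, 'GraphRAG': 168}
print(round(per_user, 1))  # 13.3
```

Roughly 1,700 of the 2,400 daily queries never touch the vector store or the graph, which is where the cost savings come from.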
Healthcare, defense-adjacent, and enterprise clients sign NDAs that prevent naming. Engagement scope, technology stack, and measured outcomes can be shared publicly. Client identity stays protected.