Topic Cluster

RAG & Vector Search

Master retrieval-augmented generation, vector databases, embeddings, and semantic search systems.

29 articles in this topic

Jun 11, 2026

DeepEval vs RAGAS vs TruLens: Pick Your RAG Eval Stack

RAGAS for fast experiments, DeepEval for CI gates, TruLens for production tracing. The metric-by-metric comparison plus the 2026 production thresholds to set.

May 29, 2026

Vector Search at a Billion Vectors: The Cost-Per-QPS Math

Recall converges at 95-99% across HNSW engines, so cost at scale is throughput-per-dollar. ScyllaDB hits 252K QPS at 2ms P99 on 1B vectors. Here's the math.

May 22, 2026

Reranker Models Compared: Cohere vs Voyage vs Jina vs BGE

Jina Reranker v3 hits 81.33% Hit@1 at 188ms, the only top-tier sub-200ms model. The latency and NDCG breakdown across Cohere, Voyage, Jina, and BGE.

May 18, 2026

Document Parsing for RAG: Reducto vs LlamaParse vs Docling

Bad extraction poisons every downstream embedding. The honest breakdown of Reducto, LlamaParse, Unstructured, and Docling on tables, compliance, and price.

May 5, 2026

PageIndex: Vectorless RAG That Hits 98.7% on FinanceBench

PageIndex hits 98.7% on FinanceBench with no embeddings and no vector DB. Here's how the LLM-reasoned TOC tree works, where it breaks, and when to migrate.

Apr 30, 2026

Turbopuffer vs Pinecone in 2026: Why Cursor and Notion Migrated

Cursor, Notion, and Linear standardized on Turbopuffer for one reason: object storage cuts vector DB cost 10x at scale. Here's the migration playbook, and when Pinecone still wins.

Apr 23, 2026

Your 1M Context Window Is Lying: What Chroma's Context Rot Study Proves

Chroma tested 18 frontier models across long contexts. All of them degraded: 30%+ accuracy drops when the answer sits mid-document, and a 7.9% loss from length alone. Here's the cap and the compaction loop we ship.

Apr 16, 2026

Karpathy's LLM Wiki Pattern: When Compiled Knowledge Beats RAG

Andrej Karpathy's LLM Wiki compiles raw sources into a maintained knowledge base before queries ever arrive, eliminating embedding drift, chunk-boundary errors, and retrieval noise. Here's when it beats RAG and when it doesn't.

Apr 1, 2026

Agentic RAG Explained: How Agent-Controlled Retrieval Beats Fixed Pipelines

Traditional RAG pipelines hit 34% accuracy on complex queries. Agentic RAG's agent-controlled retrieval loop, with routing, grading, and self-correction, pushes that to 78%. Here's the architecture and how to build it.

Explore Other Topics

AI Agents

AI Security

LLMs & Models

AI for Business

AI Development Tools
