Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

1.3.1. Retrieval-Augmented Generation (RAG)

💡 First Principle: RAG extends an FM's knowledge beyond its training data by retrieving relevant documents at query time and injecting them into the context window — the FM's job shifts from "recall from training" to "reason over provided context." This is the most reliable way to make FMs accurate about domain-specific or time-sensitive information.

The complete RAG pipeline:
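The stages of such a pipeline (chunk, embed, index, retrieve, inject into the prompt) can be sketched in a few lines. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector store, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split a document into fixed-size word windows (a deliberately simple chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents):
    """Indexing time: chunk every document and store (chunk, vector) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, query, k=2):
    """Query time: rank stored chunks by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Inject retrieved chunks into the context the FM will reason over."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Illustrative documents (made up for this sketch).
docs = [
    "Expense reports must be filed within 30 days of travel.",
    "Remote employees receive a home office stipend each year.",
]
index = build_index(docs)
question = "When must expense reports be filed?"
top_chunks = retrieve(index, question, k=1)
prompt = build_prompt(question, top_chunks)
```

The resulting `prompt` is what would be sent to the FM; the model call itself is omitted because it depends on the provider.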
Where RAG breaks — and what to fix:
| Failure Mode | Symptom | Fix |
|---|---|---|
| Retrieval miss | FM says "I don't have that info" but the info is indexed | Improve chunking, query expansion, or embedding model |
| Context dilution | FM ignores relevant retrieved chunks | Reduce chunk size, improve reranking |
| Semantic drift | Answers degrade over time | Monitor retrieval scores; trigger reindexing |
| FM ignores context | FM answers from training despite retrieved docs | Strengthen grounding instructions in system prompt |
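The "monitor retrieval scores; trigger reindexing" fix for semantic drift can be sketched as a rolling average over the top similarity score of each query. This is one possible approach under stated assumptions: the class name, the window size, and the 0.6 threshold are all hypothetical and would need tuning against real traffic.

```python
from collections import deque


class RetrievalScoreMonitor:
    """Tracks the top retrieval score per query and flags sustained drops."""

    def __init__(self, window=100, threshold=0.6):
        # Keep only the most recent `window` scores.
        self.scores = deque(maxlen=window)
        self.threshold = threshold  # hypothetical cutoff, tune per workload

    def record(self, top_score):
        """Record the best similarity score returned for one query."""
        self.scores.append(top_score)

    def rolling_mean(self):
        """Mean of the scores currently in the window, or None if empty."""
        return sum(self.scores) / len(self.scores) if self.scores else None

    def needs_reindex(self):
        """True once the window is full and its mean falls below the threshold."""
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and self.rolling_mean() < self.threshold


monitor = RetrievalScoreMonitor(window=3, threshold=0.6)
for score in (0.9, 0.8, 0.85):
    monitor.record(score)
healthy = monitor.needs_reindex()

for score in (0.4, 0.3, 0.5):
    monitor.record(score)
drifted = monitor.needs_reindex()
```

Waiting for a full window before deciding avoids triggering a reindex on a single unusual query.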

Amazon Bedrock Knowledge Bases is the managed RAG service — it handles indexing, chunking, embedding, and retrieval automatically. You configure the data source (S3), choose the embedding model, choose the vector store backend, and Bedrock manages the rest. For production deployments where you want full pipeline control, a custom OpenSearch implementation gives more flexibility at the cost of operational complexity.
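On the managed path, a query against a knowledge base might look like the sketch below. It assumes the boto3 `bedrock-agent-runtime` client and its `retrieve_and_generate` operation; the helper names `build_rag_request` and `ask` are hypothetical, and the knowledge base ID and model ARN shown in the usage comment are placeholders you would replace with your own.

```python
def build_rag_request(question, knowledge_base_id, model_arn):
    """Build the keyword arguments for a retrieve_and_generate call."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def ask(client, question, knowledge_base_id, model_arn):
    """Send the question through the knowledge base and return the generated answer text."""
    request = build_rag_request(question, knowledge_base_id, model_arn)
    response = client.retrieve_and_generate(**request)
    return response["output"]["text"]


# Usage (requires boto3 and AWS credentials; the IDs below are placeholders):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   answer = ask(client, "What is the travel policy?",
#                "YOUR_KNOWLEDGE_BASE_ID", "YOUR_MODEL_ARN")
```

Passing the client in as an argument keeps the request-building logic testable without AWS access.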

⚠️ Exam Trap: Bedrock Knowledge Bases and a custom OpenSearch vector store are not architecturally equivalent even though both use vector search. Bedrock Knowledge Bases is a fully managed pipeline — ideal for standard RAG. Custom OpenSearch is for cases requiring multi-index strategies, custom reranking, or specialized query pre-processing that Bedrock Knowledge Bases doesn't support natively.
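To make the "multi-index strategies, custom reranking" case concrete, here is a rough sketch of what the custom path involves. The query body follows the general shape of the OpenSearch k-NN plugin's query DSL as an assumption; the field name `embedding`, the index names, and the per-index weighting scheme are hypothetical, and the hits are hand-written stand-ins for real search responses.

```python
def knn_query(vector, k=5, field="embedding"):
    """Build a k-NN query body (assumed shape; `field` is a hypothetical vector field)."""
    return {"size": k, "query": {"knn": {field: {"vector": vector, "k": k}}}}


def merge_and_rerank(hits_by_index, index_weights, k=3):
    """Merge hits from several indexes and rerank by a weighted score.

    The weight per index is an illustrative custom-reranking rule, for
    example to favor an index holding more recent documents.
    """
    merged = []
    for index_name, hits in hits_by_index.items():
        weight = index_weights.get(index_name, 1.0)  # unlisted indexes keep their raw score
        for hit in hits:
            merged.append({
                "index": index_name,
                "id": hit["_id"],
                "score": hit["_score"] * weight,
            })
    merged.sort(key=lambda h: h["score"], reverse=True)
    return merged[:k]


# Hand-written example hits, shaped like entries in a search response.
hits_by_index = {
    "policies": [{"_id": "p1", "_score": 0.80}, {"_id": "p2", "_score": 0.60}],
    "regulations": [{"_id": "r1", "_score": 0.70}],
}
top_hits = merge_and_rerank(hits_by_index, {"regulations": 1.5}, k=2)
```

This merging and reweighting logic is the kind of code you own and operate yourself on the custom path, which is the operational cost the trade-off refers to.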

Reflection Question: A financial services company needs to build a Q&A bot over 50,000 internal policy documents. Their retrieval quality drops for queries about regulations published in the last 6 months. What architectural component is most likely failing, and how would you fix it?

Written by Alvin Varughese, Founder, 15 professional certifications