1.3.1. Retrieval-Augmented Generation (RAG)
💡 First Principle: RAG extends an FM's knowledge beyond its training data by retrieving relevant documents at query time and injecting them into the context window — the FM's job shifts from "recall from training" to "reason over provided context." This is the most reliable way to make FMs accurate about domain-specific or time-sensitive information.
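The complete RAG pipeline has an indexing path and a query path: documents are chunked, embedded, and stored in a vector index; at query time the question is embedded, the most similar chunks are retrieved (and optionally reranked), and those chunks are injected into the prompt the FM answers from.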
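As a rough illustration of how such a pipeline fits together in code, the sketch below chunks and indexes two toy documents, retrieves the best match for a query, and injects it into a prompt. It is a minimal sketch under loud assumptions: the bag-of-words `embed` function stands in for a real embedding model, an in-memory list stands in for a vector store, and every function name here is hypothetical rather than part of any AWS API.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split a document into fixed-size word windows (naive chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding (assumption): bag-of-words counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents):
    """Index time: chunk each document and store (chunk, vector) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, query, k=2):
    """Query time: rank chunks by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Inject retrieved chunks into the context the FM will reason over."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


docs = [
    "Expense reports must be submitted within 30 days of travel.",
    "Remote employees may claim a home office stipend once per year.",
]
index = build_index(docs)
top_chunks = retrieve(index, "When are expense reports due?", k=1)
prompt = build_prompt("When are expense reports due?", top_chunks)
```

The prompt that reaches the FM contains only the retrieved chunk plus a grounding instruction, which is what shifts the model's job from recall to reasoning over provided context.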
Where RAG breaks — and what to fix:
| Failure Mode | Symptom | Fix |
|---|---|---|
| Retrieval miss | FM says "I don't have that info" but the info is indexed | Improve chunking, query expansion, or embedding model |
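| Context dilution | FM overlooks relevant chunks because they are buried among irrelevant ones | Reduce chunk size or the number of retrieved chunks; improve reranking |
| Semantic drift | Answer quality degrades over time as source content changes and the index goes stale | Monitor retrieval scores; trigger reindexing when they drop |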
| FM ignores context | FM answers from training despite retrieved docs | Strengthen grounding instructions in system prompt |
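
The "monitor retrieval scores; trigger reindexing" fix can be approximated with a rolling-window check. This is a minimal sketch, not a prescribed mechanism: the `RetrievalScoreMonitor` class, the baseline value, the window size, and the 15% tolerance are all illustrative assumptions that would need tuning against real traffic.

```python
from collections import deque


class RetrievalScoreMonitor:
    """Tracks top-1 retrieval scores in a rolling window and flags sustained drops."""

    def __init__(self, baseline, window=5, tolerance=0.15):
        self.baseline = baseline      # expected mean top-1 score (assumed, measured offline)
        self.tolerance = tolerance    # allowed relative drop before flagging
        self.scores = deque(maxlen=window)

    def record(self, top_score):
        """Store the best similarity score returned for one query."""
        self.scores.append(top_score)

    def needs_reindex(self):
        """Return True when the rolling mean falls below the tolerated floor."""
        # Only decide once the window is full, to avoid reacting to single outliers.
        if len(self.scores) < self.scores.maxlen:
            return False
        mean = sum(self.scores) / len(self.scores)
        return mean < self.baseline * (1 - self.tolerance)


monitor = RetrievalScoreMonitor(baseline=0.80, window=3)
for score in (0.82, 0.79, 0.81):
    monitor.record(score)
healthy = monitor.needs_reindex()

for score in (0.60, 0.58, 0.62):
    monitor.record(score)
degraded = monitor.needs_reindex()
```

In this toy run the first window stays near the baseline and raises no flag, while the second window's mean falls below the floor of 0.68 and would trigger reindexing.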
Amazon Bedrock Knowledge Bases is the managed RAG service: it handles chunking, embedding, indexing, and retrieval for you. You configure the data source (for example, an S3 bucket), choose the embedding model, and choose the vector store backend; Bedrock manages the rest. For production deployments where you want full pipeline control, a custom OpenSearch implementation gives more flexibility at the cost of operational complexity.
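To make the managed path concrete, here is a hedged sketch of querying an existing knowledge base through the `retrieve_and_generate` operation of the boto3 `bedrock-agent-runtime` client. The helper names, the region default, and the placeholder ID and ARN are assumptions for illustration; the knowledge base itself is assumed to be already created and synced.

```python
def build_rag_request(question, knowledge_base_id, model_arn):
    """Build the keyword arguments for bedrock-agent-runtime's retrieve_and_generate."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(question, knowledge_base_id, model_arn, region="us-east-1"):
    """Send the question to a knowledge base and return the generated answer text."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agent-runtime", region_name=region)
    response = client.retrieve_and_generate(
        **build_rag_request(question, knowledge_base_id, model_arn)
    )
    return response["output"]["text"]


request = build_rag_request(
    "What is the travel reimbursement limit?",
    knowledge_base_id="KB_ID_PLACEHOLDER",  # hypothetical ID, replace with a real one
    model_arn="MODEL_ARN_PLACEHOLDER",      # hypothetical ARN, replace with a real one
)
```

Note how little of the pipeline appears in the caller's code: retrieval, prompt assembly, and generation all happen behind one call, which is the trade the managed service offers.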
⚠️ Exam Trap: Bedrock Knowledge Bases and a custom OpenSearch vector store are not architecturally equivalent even though both use vector search. Bedrock Knowledge Bases is a fully managed pipeline — ideal for standard RAG. Custom OpenSearch is for cases requiring multi-index strategies, custom reranking, or specialized query pre-processing that Bedrock Knowledge Bases doesn't support natively.
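As one example of the kind of logic a custom pipeline might own, the sketch below merges ranked results from two separate indexes with reciprocal rank fusion (RRF). RRF is a technique chosen here for illustration and is not named in the material above; the index names and document IDs are hypothetical, and `k=60` is a commonly used default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc IDs; each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


policy_hits = ["doc-a", "doc-b", "doc-c"]      # results from a hypothetical policy index
regulation_hits = ["doc-b", "doc-d", "doc-a"]  # results from a hypothetical regulation index
fused = reciprocal_rank_fusion([policy_hits, regulation_hits])
```

Documents that rank well in both indexes (`doc-b`, then `doc-a`) rise above those that appear in only one, without needing the two indexes to produce comparable raw scores.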
Reflection Question: A financial services company needs to build a Q&A bot over 50,000 internal policy documents. Their retrieval quality drops for queries about regulations published in the last 6 months. What architectural component is most likely failing, and how would you fix it?