Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.
8.2. High-Frequency Exam Topics Quick Reference
Service Selection Decision Trees
When to use Bedrock vs. SageMaker:
- Invoke a managed FM via API → Bedrock
- Fine-tune a model with your own data → Bedrock (for supported models) or SageMaker (full control)
- Deploy your own custom model → SageMaker endpoint
- Need custom training loop → SageMaker Training Job
When to use Bedrock Knowledge Bases vs. custom OpenSearch:
- Standard RAG, managed pipeline, no custom preprocessing → Bedrock Knowledge Bases
- Real-time document availability (< 1 minute from upload to query) → Custom OpenSearch
- Complex metadata filtering with custom schema → Custom OpenSearch
- Need full control over chunking algorithm → Custom OpenSearch
When to use Bedrock Agents vs. Strands Agents vs. Step Functions:
- Multi-step task with dynamic tool selection, managed service → Bedrock Agents
- Code-first agent with full programmatic control → Strands Agents
- Deterministic workflow with many AWS service integrations → Step Functions
- Long-running (hours+) process with durable execution → Step Functions Standard Workflow
When to fine-tune vs. use RAG:
- Need the model to know domain facts → RAG
- Need consistent output format or style → Fine-tuning
- Need to reduce token cost by shortening prompts → Fine-tuning
- Data changes frequently → RAG (re-index) not fine-tuning (retrain)
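The fine-tune vs. RAG rules above can be encoded as a small decision helper. This is a hypothetical illustration (the function name and flags are not from any AWS SDK); it only captures the branching logic of the list:

```python
# Hypothetical helper encoding the RAG vs. fine-tuning rules above.
# Function name and parameters are illustrative, not from any AWS SDK.

def choose_adaptation(needs_domain_facts: bool,
                      needs_style_or_format: bool,
                      data_changes_often: bool) -> str:
    """Return 'rag', 'fine-tuning', or 'both' per the decision rules."""
    if data_changes_often or needs_domain_facts:
        # Fresh or factual knowledge belongs in a retrieval index,
        # which can be re-synced without retraining the model.
        return "both" if needs_style_or_format else "rag"
    if needs_style_or_format:
        # Consistent tone/format (and shorter prompts) favor fine-tuning.
        return "fine-tuning"
    return "rag"  # default: RAG is cheaper to iterate on
```

Note that the two approaches are not exclusive: a domain chatbot often needs both RAG (facts) and fine-tuning (voice).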
Critical Number Pairs
| Topic | Value A | Value B |
|---|---|---|
| API Gateway max timeout | 29 seconds | Lambda max timeout: 15 minutes |
| Provisioned throughput break-even | ~60–70% utilization | Below this: on-demand cheaper |
| HNSW default ef_search | 512 (high recall) | Lower value = faster, worse recall |
| Bedrock Knowledge Bases sync | Minutes–hours delay | Custom OpenSearch: near real-time |
| Transfer question target (Associate) | 15–25% of questions | Foundation: 5–10%; Expert: 25–30% |
| Semantic cache similarity threshold | 0.92 typical | Lower threshold → more false-positive cache hits |
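To make the semantic-cache threshold concrete, here is a minimal sketch of the hit check, assuming cosine similarity over embeddings (the helper names are illustrative; real systems would query a vector store rather than compare vectors inline):

```python
import math

SIMILARITY_THRESHOLD = 0.92  # typical value from the table; tune per workload

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_cache_hit(query_emb: list[float], cached_emb: list[float]) -> bool:
    # Only reuse a cached answer when the new query is nearly identical;
    # a lower threshold would serve stale answers to unrelated queries.
    return cosine_similarity(query_emb, cached_emb) >= SIMILARITY_THRESHOLD
```

Identical embeddings score 1.0 (hit); orthogonal embeddings score 0.0 (miss), which is why the threshold sits close to 1.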
The Safety Control Stack
Layer 1 (Earliest): Input sanitization — Comprehend PII detection + custom regex
Layer 2: Bedrock Guardrails (input evaluation) — topic denial, word filters, PII
Layer 3: FM inference — model's own RLHF safety training
Layer 4: Bedrock Guardrails (output evaluation) — content filters, grounding check
Layer 5: Lambda output validation — schema validation, semantic grounding check
Layer 6: Human review — for high-stakes irreversible decisions (waitForTaskToken)
No single layer is sufficient. Defense-in-depth requires all layers.
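The layering order can be sketched as a single invocation path. This is a local illustration only: in production, Layer 1 would call Comprehend's PII detection alongside the regex, and Layers 2–4 run server-side inside Bedrock Guardrails and the model itself (all names below are hypothetical):

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")        # custom-regex half of Layer 1
DENIED_TOPICS = {"medical advice", "legal advice"}    # stand-in for Guardrails topic denial

def sanitize_input(prompt: str) -> str:
    # Layer 1: production code would also call Comprehend detect_pii_entities;
    # only the regex redaction is shown here.
    return SSN_RE.sub("[REDACTED]", prompt)

def check_topics(prompt: str) -> bool:
    # Layer 2 stand-in: real topic denial happens inside Bedrock Guardrails;
    # this local check only shows where the layer sits in the flow.
    lowered = prompt.lower()
    return not any(topic in lowered for topic in DENIED_TOPICS)

def validate_output(answer: str, max_len: int = 2000) -> bool:
    # Layer 5: simple schema-style validation of the model response.
    return 0 < len(answer) <= max_len

def guarded_invoke(prompt: str, invoke_fn) -> str:
    clean = sanitize_input(prompt)               # Layer 1
    if not check_topics(clean):                  # Layer 2
        return "Request declined by policy."
    answer = invoke_fn(clean)                    # Layers 3-4 run inside Bedrock
    if not validate_output(answer):              # Layer 5
        return "Response failed validation."
    return answer                                # Layer 6 (human review) gated elsewhere
```

Each layer catches failures the others miss, which is the point of the stack: redaction happens before the model ever sees the text, and validation happens before the user ever sees the answer.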
The Monitoring Matrix
| What You're Monitoring | Service | Specific Metric |
|---|---|---|
| Who changed Guardrails config | CloudTrail | UpdateGuardrail event |
| What prompt was sent | Bedrock Model Invocation Logs | Request body |
| FM latency | CloudWatch (native Bedrock) | InvocationLatency |
| Application E2E latency | X-Ray + custom CloudWatch metric | Custom TotalLatencyMs |
| Response quality | Custom CloudWatch metric | GroundingScore, FaithfulnessScore |
| Cost by team | Cost Explorer + resource tags | Filtered by team/project tag |
| Knowledge Base freshness | Custom Lambda + CloudWatch | DaysSinceLastSync |
MCP at a Glance
- What: Open protocol (not AWS-proprietary) for standardized AI tool connectivity
- Who made it: Anthropic
- Who uses it: Bedrock Agents, Strands Agents, Claude Desktop, and any MCP client
- Servers: Stateless Lambda (lightweight) or stateful ECS (complex tools)
- Key benefit: N clients + M servers = N+M integrations (not N×M)
- Exam signal: "Standardized tool integration across frameworks" → MCP
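The N+M vs. N×M benefit is just arithmetic, shown here as a tiny hypothetical helper:

```python
def integrations(n_clients: int, m_servers: int, with_mcp: bool) -> int:
    # Without a shared protocol, every client needs a bespoke adapter for
    # every server (N x M); with MCP, each side implements the protocol
    # once (N + M).
    return n_clients + m_servers if with_mcp else n_clients * m_servers
```

With 4 clients and 5 tool servers, MCP needs 9 protocol implementations instead of 20 point-to-point adapters, and the gap widens as either side grows.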
Written by Alvin Varughese
Founder • 15 professional certifications