2.1.2. The Bedrock Model Catalog: Claude, Titan, Llama, and Beyond
💡 First Principle: The Bedrock model catalog is organized by provider and capability tier. Knowing which providers offer which capabilities, and which models within each provider suit which task types, is a prerequisite to any selection decision.
Primary model families on Amazon Bedrock (AIP-C01 relevant):
| Provider | Model Family | Strengths | Context Window | Best For |
|---|---|---|---|---|
| Anthropic | Claude 3 Haiku | Fast, low cost | 200K tokens | High-volume simple tasks, classification |
| Anthropic | Claude 3 Sonnet | Balanced | 200K tokens | General-purpose reasoning, code |
| Anthropic | Claude 3 Opus | Highest capability | 200K tokens | Complex reasoning, long docs |
| Amazon | Titan Text Lite | Cost-optimized | 4K tokens | Simple tasks, tight budgets |
| Amazon | Titan Text Premier | Balanced | 32K tokens | General enterprise use |
| Amazon | Titan Embeddings v2 | 1024-dim embeddings | — | RAG, semantic search |
| Amazon | Titan Multimodal Embeddings | Image + text embeddings | — | Product catalog search, image similarity search |
| Meta | Llama 3 (various) | Open weights, customizable | 8K–128K tokens | When open-weight models are required |
| Mistral AI | Mistral/Mixtral | Efficient, multilingual | 32K tokens | Cost-efficient multilingual workloads |
| Stability AI | Stable Diffusion XL | Image generation | — | Creative content, product images |
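The text-model rows of the table can be read as a simple filter: first rule out models whose context window is too small, then match the capability tier. The sketch below illustrates that reading only. The tier labels are paraphrased from the Strengths column, the token counts are the nominal figures from the table, and `CATALOG` and `shortlist` are hypothetical names, not part of any AWS SDK.

```python
# Nominal figures copied from the table above: (model, tier, context window in tokens).
# Tier labels paraphrase the "Strengths" column; they are not official AWS categories.
CATALOG = [
    ("Claude 3 Haiku", "low-cost", 200_000),
    ("Claude 3 Sonnet", "balanced", 200_000),
    ("Claude 3 Opus", "highest-capability", 200_000),
    ("Titan Text Lite", "low-cost", 4_000),
    ("Titan Text Premier", "balanced", 32_000),
]


def shortlist(required_tokens, tier):
    """Return models in the given tier whose context window fits the input."""
    return [
        name
        for name, model_tier, context_tokens in CATALOG
        if model_tier == tier and context_tokens >= required_tokens
    ]


print(shortlist(2_000, "low-cost"))   # ['Claude 3 Haiku', 'Titan Text Lite']
print(shortlist(10_000, "low-cost"))  # ['Claude 3 Haiku']
print(shortlist(50_000, "balanced"))  # ['Claude 3 Sonnet']
```

A real selection would also weigh price, latency, and regional availability, which this sketch leaves out.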
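Cross-Region Inference: Cross-region inference lets Bedrock distribute requests across a predefined set of AWS regions when the region you call lacks capacity or is absorbing a traffic burst. You opt in by passing a cross-region inference profile ID or ARN as the modelId; the API call is otherwise unchanged, so routing is transparent to your application. Each profile is typically scoped to a geography (for example US or EU) and lists the destination regions it can route to. Some newer models are offered in a given region only through an inference profile.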
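# Cross-region inference: modelId is an inference profile ID (or ARN), not a base model ARN
import json
import boto3

bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')
response = bedrock_runtime.invoke_model(
    # The 'us.' prefix selects a US geography profile; Bedrock may serve the request
    # from any destination region in that profile (for example us-west-2)
    modelId='us.anthropic.claude-3-5-sonnet-20241022-v2:0',
    body=json.dumps({'anthropic_version': 'bedrock-2023-05-31', 'max_tokens': 1000,
                     'messages': [{'role': 'user', 'content': 'Hello'}]})  # placeholder prompt
)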
⚠️ Exam Trap: With cross-region inference, your prompts and outputs may be processed in a region other than the one you call. For workloads with strict data residency requirements (for example, GDPR- or HIPAA-regulated data that must stay in eu-west-1), either invoke the model in-region with its base model ID instead of an inference profile, or use a geography-scoped profile only if every one of its destination regions is acceptable. The exam specifically tests this data residency conflict.
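One way to make the residency check concrete is to compare the regions an inference profile can route to against an allow-list before using it. The sketch below assumes the profile description is a dict with a `models` list of `modelArn` entries, which is how the boto3 `bedrock` client's `get_inference_profile` response is understood to be shaped; verify that against the current SDK documentation. The function names, the sample profile, and its IDs are hypothetical.

```python
def destination_regions(profile):
    """Return the set of regions named in a profile's model ARNs."""
    # ARN layout: arn:partition:service:region:account:resource
    return {m["modelArn"].split(":")[3] for m in profile.get("models", [])}


def residency_violations(profile, allowed_regions):
    """Return destination regions that fall outside allowed_regions, sorted."""
    return sorted(destination_regions(profile) - set(allowed_regions))


# Hand-written sample shaped like the assumed API response (not real output).
sample_profile = {
    "inferenceProfileId": "eu.example-model",  # hypothetical ID
    "models": [
        {"modelArn": "arn:aws:bedrock:eu-west-1::foundation-model/example-model"},
        {"modelArn": "arn:aws:bedrock:eu-central-1::foundation-model/example-model"},
    ],
}

print(residency_violations(sample_profile, {"eu-west-1"}))                  # ['eu-central-1']
print(residency_violations(sample_profile, {"eu-west-1", "eu-central-1"}))  # []
```

An empty result means every destination region is on the allow-list. A non-empty result means the profile could process data in a region the policy forbids, so the in-region base model ID is the safer choice.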
Reflection Question: A European healthcare company processes patient data and wants to use a foundation model available only in us-east-1. They have strict GDPR data residency requirements preventing patient data from leaving the EU. What is the correct architectural approach?