5.3.3. Encryption, KMS, and Data Protection
💡 First Principle: Encryption protects data by making it unreadable without the correct key. AWS provides encryption at rest (stored data) and in transit (moving data), and the key question isn't whether to encrypt but who controls the keys. The answer determines your compliance posture, and the exam tests this distinction.
AWS offers three key management options:
AWS-managed keys (SSE-S3 for S3, default SageMaker encryption): AWS generates, stores, and rotates keys automatically. This is the simplest option and is sufficient for many use cases, but you cannot change the key policy or the rotation schedule.
Customer-managed keys (CMK) via AWS KMS: You create keys in KMS, define who can use them through key policies, and control rotation schedules. SageMaker integrates with KMS for encrypting training data volumes, model artifacts in S3, and endpoint storage. This is the standard answer for exam questions requiring "customer-controlled encryption."
Customer-provided keys (SSE-C): You generate and manage keys entirely outside AWS and supply the key with every S3 request. AWS uses it for that operation and never stores it. This offers maximum control but also maximum operational burden, and it is rarely the exam answer because it doesn't integrate with SageMaker's managed infrastructure.
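As a rough illustration of how the three options above differ at the API level, the sketch below builds the encryption-related arguments that an S3 `put_object` call accepts for each one. The helper name `encryption_args`, the option labels, and the key ARN are hypothetical and exist only for this example; the parameter names (`ServerSideEncryption`, `SSEKMSKeyId`, `SSECustomerAlgorithm`, `SSECustomerKey`) are the ones the S3 API uses.

```python
import os


def encryption_args(option, kms_key_id=None, customer_key=None):
    """Return the encryption keyword arguments for an S3 put_object call.

    `option` is one of "sse-s3", "sse-kms", or "sse-c". These labels are
    chosen for this sketch; they are not AWS identifiers.
    """
    if option == "sse-s3":
        # S3 manages the key; there is nothing else to supply.
        return {"ServerSideEncryption": "AES256"}
    if option == "sse-kms":
        if kms_key_id is None:
            raise ValueError("sse-kms needs a KMS key ID or ARN")
        # The KMS key policy decides who may use this key.
        return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key_id}
    if option == "sse-c":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("sse-c needs a 32-byte (256-bit) key")
        # The key travels with every request; S3 does not keep it.
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown option: {option}")


# Hypothetical key ARN, for illustration only.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"

kms_args = encryption_args("sse-kms", kms_key_id=EXAMPLE_KEY_ARN)
ssec_args = encryption_args("sse-c", customer_key=os.urandom(32))

# With boto3 installed and credentials configured, the arguments would be
# passed straight through, for example:
#   boto3.client("s3").put_object(Bucket="my-bucket", Key="train.csv",
#                                 Body=data, **kms_args)
```

Note how the operational burden shows up in the arguments: SSE-S3 needs nothing from you, SSE-KMS needs only a key reference, and SSE-C requires you to hold and resend the raw key material yourself.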
| Data State | Default Protection | Enhanced Protection | Exam Signal |
|---|---|---|---|
| S3 training data at rest | SSE-S3 (automatic) | SSE-KMS with CMK | "Customer-managed encryption" |
| SageMaker volume at rest | AWS-managed encryption | KMS CMK specified in job config | "Encrypt training volume" |
| Model artifacts in S3 | SSE-S3 | SSE-KMS with CMK | "Protect model intellectual property" |
| Data in transit | TLS 1.2 (automatic) | VPC + TLS | "Secure data transfer" |
| SageMaker inter-node training | Not encrypted by default | Enable inter-container traffic encryption in the training job config | "Distributed training security" |
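To make the "specified in job config" entries concrete, here is a minimal sketch of the encryption-related fields of a SageMaker `CreateTrainingJob` request. The function name `training_encryption_config`, the key ARN, the bucket path, and the instance settings are assumptions made for illustration; the field names (`VolumeKmsKeyId`, `KmsKeyId`, `EnableInterContainerTrafficEncryption`) are from the SageMaker API. The sketch sets the inter-container flag explicitly rather than relying on any default.

```python
def training_encryption_config(kms_key_arn, output_s3_uri,
                               instance_type="ml.m5.xlarge", instance_count=2):
    """Build the encryption-related pieces of a CreateTrainingJob request."""
    return {
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 50,
            # Encrypts the ML storage volume attached to each training instance.
            "VolumeKmsKeyId": kms_key_arn,
        },
        "OutputDataConfig": {
            "S3OutputPath": output_s3_uri,
            # Encrypts the model artifacts written to S3 with this KMS key.
            "KmsKeyId": kms_key_arn,
        },
        # Encrypts traffic between nodes in distributed training.
        "EnableInterContainerTrafficEncryption": True,
    }


# Hypothetical values, for illustration only.
config = training_encryption_config(
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    output_s3_uri="s3://example-ml-bucket/model-artifacts/",
)

# With boto3 available, these fields would be merged into the full request:
#   boto3.client("sagemaker").create_training_job(
#       TrainingJobName=..., AlgorithmSpecification=..., RoleArn=...,
#       StoppingCondition=..., **config)
```

The same key ARN is reused for the volume and the output here for brevity; separate keys per data class would work the same way.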
AWS Secrets Manager and AWS Systems Manager Parameter Store complement KMS by managing secrets like database credentials and API keys that ML pipelines need. The exam tests whether you know to store credentials in Secrets Manager (not environment variables or code) and reference them securely from SageMaker jobs.
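The pattern of fetching credentials at run time, instead of baking them into code or environment variables, might look like the sketch below. It assumes the secret is stored as a JSON string with `username` and `password` fields; the secret name `prod/feature-store/db`, the helper `load_db_credentials`, and the `FakeSecretsClient` stand-in are all hypothetical. In a real job the client would be `boto3.client("secretsmanager")`, whose `get_secret_value(SecretId=...)` call returns the value under `SecretString`.

```python
import json


def load_db_credentials(secrets_client, secret_id):
    """Fetch a JSON secret and return it as a dict.

    `secrets_client` is expected to behave like
    boto3.client("secretsmanager").
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """Stand-in for the real client so the sketch runs without AWS access."""

    def get_secret_value(self, SecretId):
        payload = {"username": "ml_user", "password": "example-only"}
        return {"Name": SecretId, "SecretString": json.dumps(payload)}


# Hypothetical secret name; swap in a real boto3 client inside a SageMaker job.
creds = load_db_credentials(FakeSecretsClient(), "prod/feature-store/db")
```

Because the job's execution role, not the code, grants access to the secret, rotating the credential does not require redeploying the pipeline.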
⚠️ Exam Trap: "Encrypt the data" is rarely the complete answer. The exam usually asks how to encrypt and who controls the keys. If the scenario mentions regulatory requirements or customer data, the answer almost always involves KMS with customer-managed keys (CMK)—not default AWS-managed encryption.
Reflection Question: An ML team stores training data in S3 and model artifacts in the SageMaker Model Registry. Their compliance team requires that all data be encrypted with keys the company controls, and that key access be auditable. What AWS services and configurations achieve this?