5.2. Data Security and Privacy
💡 First Principle: Data security for GenAI applications must protect data across three states: at rest (in vector stores, S3, databases), in transit (between services and to/from the FM API), and in use (when data is in the FM's context window). Traditional security controls cover the first two; GenAI-specific controls are needed for the third.
The "in use" state is the new challenge. When PII is in the FM's context window, encryption no longer helps, because the model must process the data in plaintext. The defense at this layer is to control what enters the context (PII detection and redaction before FM invocation) and what leaves it (output filtering and PII masking in FM responses).
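The input side of this defense can be sketched as follows. This is a minimal illustration, not a prescribed implementation: `redact_pii` and `redact_with_comprehend` are hypothetical helper names, the `min_score` threshold of 0.5 is an arbitrary assumption, and the entity dictionaries follow the shape of the `Entities` list returned by Amazon Comprehend's `DetectPiiEntities` operation.

```python
from typing import Iterable, Mapping


def redact_pii(text: str, entities: Iterable[Mapping], min_score: float = 0.5) -> str:
    """Replace each detected PII span with a [TYPE] placeholder.

    `entities` is assumed to follow the shape of the `Entities` list from
    Comprehend's DetectPiiEntities: Type, Score, BeginOffset, EndOffset.
    """
    # Ignore low-confidence detections (threshold is an illustrative assumption).
    kept = [e for e in entities if e.get("Score", 1.0) >= min_score]
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        start, end = ent["BeginOffset"], ent["EndOffset"]
        text = text[:start] + f"[{ent['Type']}]" + text[end:]
    return text


def redact_with_comprehend(text: str, language_code: str = "en") -> str:
    """Detect PII with Comprehend, then redact it before building the prompt."""
    import boto3  # imported lazily so redact_pii works without AWS access

    comprehend = boto3.client("comprehend")
    response = comprehend.detect_pii_entities(Text=text, LanguageCode=language_code)
    return redact_pii(text, response["Entities"])
```

Only the redacted string would then be placed in the prompt, so the raw PII never enters the context window.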
| Data State | Threat | Defense |
|---|---|---|
| At rest (S3, OpenSearch, DynamoDB) | Unauthorized access to stored vectors or documents | KMS encryption, S3 bucket policies, VPC endpoints |
| In transit (Lambda → Bedrock API) | Traffic interception, data leaving AWS network | HTTPS (minimum); VPC endpoints via PrivateLink (required for private connectivity) |
| In use (FM context window) | PII in prompt leaks into the response; indirect prompt injection | Comprehend PII detection and redaction before invocation; Guardrails PII masking on output |
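
For the output side of the "in use" row, one possible approach is to pass the FM response through the Bedrock `ApplyGuardrail` API before returning it to the caller. The sketch below assumes a guardrail with PII masking already exists; `mask_model_output` is a hypothetical helper, and the guardrail ID and version are placeholders supplied by the caller.

```python
def mask_model_output(client, text: str, guardrail_id: str, guardrail_version: str) -> str:
    """Run FM output through a Bedrock guardrail and return the masked text.

    `client` is assumed to be a boto3 "bedrock-runtime" client. The guardrail
    (with PII entities configured to anonymize) must already be created.
    """
    response = client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source="OUTPUT",  # evaluate the text as model output, not user input
        content=[{"text": {"text": text}}],
    )
    if response.get("action") == "GUARDRAIL_INTERVENED":
        # The guardrail supplies replacement text when it masks or blocks.
        # If it supplies none, this returns an empty string (fails closed).
        return "".join(o.get("text", "") for o in response.get("outputs", []))
    return text
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, and it would typically be called as `mask_model_output(boto3.client("bedrock-runtime"), fm_text, guardrail_id, version)`.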
⚠️ Common Misconception: Using HTTPS with the Bedrock API means your data is fully private. HTTPS encrypts data in transit but does not keep traffic inside your AWS network perimeter. VPC endpoints (PrivateLink) are required to ensure FM API traffic never traverses the public internet, which is a separate requirement from HTTPS encryption.
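
To make the distinction concrete, the sketch below builds the arguments for an interface VPC endpoint for the Bedrock runtime, plus an IAM policy that denies model invocation unless the request arrives through that endpoint. Both helper names are hypothetical, and the VPC, subnet, security group, and endpoint IDs are placeholders; the parameters are the ones accepted by the EC2 `create_vpc_endpoint` call, and `aws:SourceVpce` is a global IAM condition key.

```python
def bedrock_endpoint_params(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint (interface endpoint)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.bedrock-runtime",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets the default Bedrock hostname resolve to the endpoint.
        "PrivateDnsEnabled": True,
    }


def deny_outside_endpoint_policy(vpc_endpoint_id):
    """IAM policy denying invocation unless it comes through the given endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }
```

The first dictionary could be passed as `boto3.client("ec2").create_vpc_endpoint(**params)`. The endpoint provides the private path, while the deny policy is what stops callers from falling back to the public endpoint over plain HTTPS.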