6.3.3. Sensitive Data Masking
First Principle: Data masking protects sensitive information in logs and messages by replacing it with redacted or tokenized values — preventing accidental exposure of PII, credentials, and financial data in operational data streams.
This topic is new in the SCS-C03 exam.
CloudWatch Logs Data Protection Policies:
- Automatically detect and mask sensitive data in log streams
- Supports detection of: credit card numbers, SSNs, email addresses, IP addresses, and other PII patterns
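- Matched data is masked with asterisks wherever logs are viewed or delivered; only principals with the logs:Unmask permission can see the original values
- Apply at the log group level; only log events ingested after the policy is applied are scanned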
- Audit findings can be sent to CloudWatch Logs, Amazon Data Firehose, or S3
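As a rough illustration of the policy described above, the sketch below builds a CloudWatch Logs data protection policy document in Python. The helper name `build_log_policy`, the policy name, the bucket name, and the log group name are hypothetical. The field names and identifier ARNs follow the policy format as I understand it, so check them against the current AWS documentation before use. The policy pairs an audit statement with a de-identify statement over the same data identifiers.

```python
import json

# Managed data identifier ARNs share this prefix. The identifier names used
# below are examples; confirm them against the current AWS list.
IDENTIFIER_PREFIX = "arn:aws:dataprotection::aws:data-identifier/"


def build_log_policy(identifiers, findings_bucket=None):
    """Return a CloudWatch Logs data protection policy as a dict (hypothetical helper)."""
    arns = [IDENTIFIER_PREFIX + name for name in identifiers]
    # An empty FindingsDestination means findings are not exported anywhere.
    destination = {"S3": {"Bucket": findings_bucket}} if findings_bucket else {}
    return {
        "Name": "mask-sensitive-data",  # assumed name, pick your own
        "Version": "2021-06-01",
        "Statement": [
            {
                "Sid": "audit",
                "DataIdentifier": arns,
                "Operation": {"Audit": {"FindingsDestination": destination}},
            },
            {
                "Sid": "mask",
                "DataIdentifier": arns,
                "Operation": {"Deidentify": {"MaskConfig": {}}},
            },
        ],
    }


policy = build_log_policy(["CreditCardNumber", "EmailAddress"], "example-findings-bucket")
policy_json = json.dumps(policy)

# Applying the policy needs boto3 and AWS credentials, so the call is shown
# only as a comment ("/example/app" is a placeholder log group):
# import boto3
# boto3.client("logs").put_data_protection_policy(
#     logGroupIdentifier="/example/app", policyDocument=policy_json)
```

Building the document as plain data keeps it easy to review and version before it is attached to a log group.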
Amazon SNS Message Data Protection:
- Scan messages flowing through SNS topics for sensitive data
- Mask, redact, or block messages containing PII before delivery to subscribers
- Useful for event-driven architectures where sensitive data might flow through notification channels
- Apply policies to specific topics
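The three actions above (mask, redact, block) map to different operations in an SNS data protection policy. The following is a minimal sketch under that assumption. The helper `build_sns_policy`, the policy name, and the topic ARN placeholder are hypothetical, and the field names should be verified against the current SNS documentation.

```python
import json

CARD_ARN = "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"

# Mapping from the plain-language action to the policy operation (assumed shapes).
OPERATIONS = {
    "mask": {"Deidentify": {"MaskConfig": {"MaskWithCharacter": "#"}}},
    "redact": {"Deidentify": {"RedactConfig": {}}},
    "block": {"Deny": {}},
}


def build_sns_policy(action):
    """Return an SNS data protection policy dict for one action (hypothetical helper)."""
    return {
        "Name": "protect-card-numbers",  # assumed name
        "Version": "2021-06-01",
        "Statement": [
            {
                "DataDirection": "Inbound",  # inspect messages as they are published
                "Principal": ["*"],
                "DataIdentifier": [CARD_ARN],
                "Operation": OPERATIONS[action],
            }
        ],
    }


sns_policy = build_sns_policy("block")
sns_policy_json = json.dumps(sns_policy)

# Attaching the policy needs boto3 and AWS credentials, so it is shown only
# as a comment (topic_arn is a placeholder):
# import boto3
# boto3.client("sns").put_data_protection_policy(
#     ResourceArn=topic_arn, DataProtectionPolicy=sns_policy_json)
```

Choosing `"block"` rejects a matching message outright, while `"mask"` and `"redact"` let the message through with the sensitive value altered or removed.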
Masking vs. Encryption:
| Approach | Data Recoverable? | Use Case |
|---|---|---|
| Encryption | Yes (with key) | Data protection at rest/transit |
| Masking (redaction) | No | Preventing sensitive data in logs |
| Tokenization | Yes (with token vault) | Replacing sensitive values in non-production |
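To make the "Data Recoverable?" column concrete, here is a toy Python sketch contrasting redaction with tokenization. The `TokenVault` class is a hypothetical in-memory stand-in; a real token vault would be a hardened, access-controlled service. Encryption is left out because it needs a cryptography library beyond the standard library.

```python
import secrets


def redact(value):
    """Redaction: the original characters are discarded and cannot be recovered."""
    return "*" * len(value)


class TokenVault:
    """Toy in-memory vault mapping random tokens back to original values."""

    def __init__(self):
        self._by_token = {}

    def tokenize(self, value):
        # The token is random, so it reveals nothing about the value itself.
        token = "tok_" + secrets.token_hex(8)
        self._by_token[token] = value
        return token

    def detokenize(self, token):
        # Only a caller with access to the vault can reverse the mapping.
        return self._by_token[token]


ssn = "123-45-6789"  # made-up example value
masked = redact(ssn)

vault = TokenVault()
token = vault.tokenize(ssn)
recovered = vault.detokenize(token)
```

After this runs, `masked` holds only asterisks, while `recovered` equals the original value because the vault kept the mapping.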
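⚠️ Exam Trap: Data masking in CloudWatch Logs is preventive: sensitive data is detected at ingestion and masked wherever logs are viewed or delivered (console, Logs Insights, subscription filters). The original values remain stored and are visible only to principals with the logs:Unmask permission. Macie is detective: it discovers sensitive data that is already stored in S3. If a question asks about preventing PII from appearing in logs, CloudWatch Logs data protection policies are the answer.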
Scenario: A debugging session reveals that application logs contain customer credit card numbers in plaintext. You apply a CloudWatch Logs data protection policy that detects credit card numbers and masks them with asterisks. Log events ingested after the policy is applied are masked automatically; events ingested earlier are not.
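For intuition only, the sketch below mimics the effect of the scenario locally: it finds card-like digit runs in a log line, checks them with the Luhn checksum, and replaces valid ones with asterisks. This is not how the managed data identifier is implemented (that logic is not public). The regular expression, the function names, and the sample log line are all assumptions made for illustration.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_cards(line):
    """Replace Luhn-valid card-like numbers in a log line with asterisks."""
    def _mask(match):
        digits = re.sub(r"\D", "", match.group())
        return "*" * len(match.group()) if luhn_valid(digits) else match.group()
    return CARD_CANDIDATE.sub(_mask, line)


# 4111 1111 1111 1111 is a well-known test card number, not a real one.
masked_line = mask_cards("charge ok card=4111 1111 1111 1111 order=1234567890123")
```

The checksum step is what keeps the 13-digit order number intact: it looks like a card number by length but fails Luhn, so it is left unmasked.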
Reflection Question: Why is log-level data masking necessary even when the underlying application data is encrypted at rest?