Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

5.5.1. PII Detection with Macie and Data Sovereignty

💡 First Principle: You can't protect what you can't find. Amazon Macie uses machine learning and pattern matching to automatically discover and classify sensitive data (PII, financial data, credentials) in S3 buckets. It answers the question "where is our sensitive data?", which is the prerequisite for any protection strategy. Data sovereignty, meanwhile, is the requirement that data stay within the geographic boundaries set by law.

Amazon Macie scans S3 buckets and identifies sensitive data types: names, addresses, credit card numbers, SSNs, API keys, and 100+ other patterns. Macie generates findings: alerts that describe what was found, where it was found, and how severe the issue is. Macie publishes findings to EventBridge, which enables automated remediation (e.g., triggering a Lambda function to quarantine the file).
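The quarantine flow described above could be sketched roughly as follows. This is a minimal illustration, not a production handler: the quarantine bucket name and the helper names are hypothetical, and the event fields assume the shape of a Macie sensitive data finding delivered through EventBridge (policy findings about a bucket may not name a specific object).

```python
# Hypothetical name for the isolated bucket; not from the original text.
QUARANTINE_BUCKET = "example-quarantine-bucket"

# EventBridge rule pattern that matches findings published by Macie.
MACIE_FINDING_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
}


def affected_object(event):
    """Return (bucket, key) for the S3 object named in a Macie finding event."""
    resources = event["detail"]["resourcesAffected"]
    return resources["s3Bucket"]["name"], resources["s3Object"]["key"]


def quarantine(s3_client, bucket, key, quarantine_bucket=QUARANTINE_BUCKET):
    """Copy the object to the quarantine bucket, then delete the original."""
    s3_client.copy_object(
        Bucket=quarantine_bucket,
        Key=f"{bucket}/{key}",  # keep the source bucket name as a prefix
        CopySource={"Bucket": bucket, "Key": key},
    )
    s3_client.delete_object(Bucket=bucket, Key=key)
    return f"s3://{quarantine_bucket}/{bucket}/{key}"


def lambda_handler(event, context):
    """Lambda entry point; boto3 is imported here so the helpers stay testable."""
    import boto3

    bucket, key = affected_object(event)
    return {"quarantined_to": quarantine(boto3.client("s3"), bucket, key)}
```

Passing the S3 client in as an argument keeps the copy-then-delete logic testable without AWS credentials. On a versioned bucket, the delete call only adds a delete marker, so older versions would need separate handling.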

Macie + Lake Formation: Macie discovers PII, then Lake Formation restricts access to the columns containing PII. This is the complete pattern: discover → classify → protect.
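As an illustration of the "protect" step, the sketch below builds a request for the Lake Formation `grant_permissions` call that gives a principal SELECT on every column except the ones flagged as PII. The function name, principal ARN, database, table, and column names are all hypothetical; mapping Macie findings to specific table columns is assumed to happen elsewhere.

```python
def pii_column_grant(principal_arn, database, table, pii_columns):
    """Build kwargs for lakeformation.grant_permissions that exclude PII columns."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                # Grant every column except the listed PII columns.
                "ColumnWildcard": {"ExcludedColumnNames": sorted(pii_columns)},
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   boto3.client("lakeformation").grant_permissions(
#       **pii_column_grant(analyst_role_arn, "sales", "customers", ["ssn", "email"])
#   )
```

Excluding columns, rather than listing the allowed ones, means newly added non-PII columns become visible automatically. Whether that default is acceptable depends on how quickly new columns are classified.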

Data sovereignty means keeping data within specific geographic regions as required by regulation (GDPR, data localization laws). AWS implementation: limit S3 replication to destination buckets in approved Regions, use service control policies (SCPs) to prevent resource creation in non-compliant Regions, and enable S3 Block Public Access to prevent unintended exposure.

Preventing cross-region data movement: S3 replication rules can be restricted to specific destination regions. SCPs at the organization level can deny s3:CreateBucket actions in non-compliant regions. AWS Config rules can detect and alert on non-compliant configurations.
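One way such an SCP might look is sketched below. It generates a policy document that denies `s3:CreateBucket` whenever the request targets a Region outside an approved list, using the `aws:RequestedRegion` global condition key. The function name, statement ID, and Region list are illustrative assumptions; a real policy would typically cover more actions and be attached through AWS Organizations.

```python
import json


def region_deny_scp(approved_regions):
    """Build an SCP document that blocks bucket creation outside approved Regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBucketCreationOutsideApprovedRegions",
                "Effect": "Deny",
                "Action": "s3:CreateBucket",
                "Resource": "*",
                "Condition": {
                    # Deny applies when the requested Region is NOT in the list.
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(approved_regions)
                    }
                },
            }
        ],
    }


# Example: allow bucket creation only in two EU Regions.
print(json.dumps(region_deny_scp(["eu-west-1", "eu-central-1"]), indent=2))
```

Because an SCP only restricts what member accounts can do, it prevents new non-compliant buckets but does not detect ones that already exist; that gap is where AWS Config rules fit.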

āš ļø Exam Trap: Macie only scans S3 — it doesn't scan DynamoDB, RDS, or Redshift. If a question asks about discovering PII across all data stores, the answer may involve exporting data to S3 first, then scanning with Macie, or using a combination of services.

Reflection Question: A European company stores customer data in eu-west-1 but discovers that a backup policy is replicating data to us-east-1, violating GDPR data residency requirements. What AWS services detect and prevent this?

Written by Alvin Varughese
Founder • 15 professional certifications