5.5.2. Governance Frameworks: Config, SageMaker Catalog, and Data Sharing
💡 First Principle: Governance is not a one-time setup; it is continuous monitoring and enforcement. AWS Config provides continuous configuration compliance monitoring, SageMaker Catalog provides data governance workflows, and Redshift data sharing enables governed cross-team access. Together they form a governance framework that scales with the organization.
AWS Config continuously records AWS resource configurations and evaluates them against rules. For data engineering, typical checks are: detect unencrypted S3 buckets (managed rule s3-bucket-server-side-encryption-enabled), detect publicly readable buckets (s3-bucket-public-read-prohibited), and detect unencrypted Redshift clusters (redshift-cluster-configuration-check). Config remediation actions, which run Systems Manager Automation documents, can automatically fix non-compliant resources.
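As a minimal sketch of how the managed rules above might be registered, the Python below builds the request bodies for the Config `PutConfigRule` API. The helper names (`MANAGED_RULES`, `build_config_rule`, `deploy_rules`) are illustrative, not part of any AWS SDK, and the Redshift rule parameters are an assumption about what a team would want to enforce.

```python
import json

# Managed rule name -> API identifier, resource type, and optional parameters.
# The Redshift parameter values are an assumed policy choice, not a default
# taken from the study guide.
MANAGED_RULES = {
    "s3-bucket-server-side-encryption-enabled": {
        "identifier": "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED",
        "resource_type": "AWS::S3::Bucket",
        "parameters": None,
    },
    "s3-bucket-public-read-prohibited": {
        "identifier": "S3_BUCKET_PUBLIC_READ_PROHIBITED",
        "resource_type": "AWS::S3::Bucket",
        "parameters": None,
    },
    "redshift-cluster-configuration-check": {
        "identifier": "REDSHIFT_CLUSTER_CONFIGURATION_CHECK",
        "resource_type": "AWS::Redshift::Cluster",
        "parameters": {"clusterDbEncrypted": "true", "loggingEnabled": "true"},
    },
}


def build_config_rule(rule_name):
    """Return the ConfigRule dict for one AWS managed rule."""
    spec = MANAGED_RULES[rule_name]
    rule = {
        "ConfigRuleName": rule_name,
        "Source": {"Owner": "AWS", "SourceIdentifier": spec["identifier"]},
        "Scope": {"ComplianceResourceTypes": [spec["resource_type"]]},
    }
    if spec["parameters"]:
        # The API expects rule parameters as a JSON-encoded string.
        rule["InputParameters"] = json.dumps(spec["parameters"])
    return rule


def deploy_rules(config_client, rule_names):
    """Register each rule through a boto3 Config client passed in by the caller."""
    for name in rule_names:
        config_client.put_config_rule(ConfigRule=build_config_rule(name))
```

With credentials configured, something like `deploy_rules(boto3.client("config"), MANAGED_RULES)` would register all three rules, assuming a configuration recorder is already running in the account. Keeping the request construction separate from the API call makes the rule definitions easy to review and test offline.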
Amazon SageMaker Catalog (v1.1) provides governance through projects and data access management. Data producers publish datasets to the catalog with metadata, classifications, and access policies. Data consumers discover and request access through a governed workflow: the producer approves or denies based on the request's justification and the consumer's attributes.
Amazon Redshift data sharing allows a Redshift cluster (producer) to share live data with other Redshift clusters (consumers) across accounts and regions, without copying data. The producer controls what is shared and can revoke access. This enables governed cross-team analytics without data duplication.
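The producer and consumer roles can be illustrated with the SQL each side would run. The sketch below generates those statements in Python for the same-account case (sharing by namespace); the function names, the share, schema, and table names, and the namespace GUIDs are all hypothetical placeholders.

```python
import re
import uuid

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _ident(name):
    """Allow only simple lowercase identifiers so names are safe to inline."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def _namespace(value):
    """Validate and normalize a Redshift namespace GUID."""
    return str(uuid.UUID(value))


def producer_statements(share, schema, tables, consumer_namespace):
    """SQL the producer runs: create the share, add objects, grant usage."""
    share, schema = _ident(share), _ident(schema)
    stmts = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    for table in tables:
        stmts.append(f"ALTER DATASHARE {share} ADD TABLE {schema}.{_ident(table)};")
    stmts.append(
        f"GRANT USAGE ON DATASHARE {share} "
        f"TO NAMESPACE '{_namespace(consumer_namespace)}';"
    )
    return stmts


def revoke_statement(share, consumer_namespace):
    """SQL the producer runs to withdraw a consumer's access."""
    return (
        f"REVOKE USAGE ON DATASHARE {_ident(share)} "
        f"FROM NAMESPACE '{_namespace(consumer_namespace)}';"
    )


def consumer_statement(database, share, producer_namespace):
    """SQL the consumer runs to expose the share as a local database."""
    return (
        f"CREATE DATABASE {_ident(database)} FROM DATASHARE {_ident(share)} "
        f"OF NAMESPACE '{_namespace(producer_namespace)}';"
    )


if __name__ == "__main__":
    # Placeholder GUIDs; real values come from each cluster's namespace.
    for stmt in producer_statements(
        "sales_share", "public", ["orders"],
        "11111111-2222-3333-4444-555555555555",
    ):
        print(stmt)
```

The generated statements could be submitted with any SQL client or the Redshift Data API. Cross-account sharing uses `GRANT USAGE ... TO ACCOUNT` instead and adds an authorization step on the producer side, which this sketch leaves out.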
Governance data frameworks (v1.1) encompass the policies, processes, and tools that ensure data quality, security, privacy, and compliance across the organization. The exam tests whether you understand the components: data classification (Macie), access control (Lake Formation, IAM), audit (CloudTrail, Config), privacy (masking, sovereignty), and discovery (SageMaker Catalog).
⚠️ Exam Trap: AWS Config records configuration changes; it doesn't actively block non-compliant actions (that's what SCPs and IAM policies do). Config detects problems after they occur. If a question asks "prevent unencrypted buckets from being created," the answer is an SCP or IAM policy, not Config. If it asks "detect and alert on unencrypted buckets," the answer is Config.
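To make the preventive side concrete, here is a hedged sketch of one common SCP pattern: denying S3 object uploads that do not request server-side encryption. This is an illustration of a preventive control, not the only possible policy, and the function names and policy name are hypothetical.

```python
import json


def build_deny_unencrypted_uploads_scp():
    """Return an SCP document that denies PutObject calls lacking an SSE header."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyPutObjectWithoutSSEHeader",
                "Effect": "Deny",
                "Action": "s3:PutObject",
                "Resource": "*",
                # Null == "true" matches requests where the header is absent.
                "Condition": {
                    "Null": {"s3:x-amz-server-side-encryption": "true"}
                },
            }
        ],
    }


def create_scp(org_client, name="deny-unencrypted-s3-uploads"):
    """Create the SCP through a boto3 Organizations client supplied by the caller."""
    return org_client.create_policy(
        Name=name,
        Description="Deny S3 uploads that do not request server-side encryption",
        Type="SERVICE_CONTROL_POLICY",
        Content=json.dumps(build_deny_unencrypted_uploads_scp()),
    )


if __name__ == "__main__":
    print(json.dumps(build_deny_unencrypted_uploads_scp(), indent=2))
```

The contrast with Config is in the timing: this policy rejects the API call before anything is written, whereas a Config rule evaluates the resource after the change has already happened. The policy still has to be attached to an organizational unit or account before it takes effect.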
Reflection Question: A compliance officer asks: "Can you prove that all S3 buckets containing customer data have been encrypted at all times for the past year?" Which AWS services together provide this proof?