Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

5.4.1. CloudTrail and CloudWatch Logs for Auditing

💡 First Principle: CloudTrail logs AWS API activity (management plane), while CloudWatch Logs capture application output (data plane). Together they provide a complete audit trail: CloudTrail shows "User X called StartJobRun at 3:15 AM" and CloudWatch Logs show "the Glue job processed 1.2 million records from the customers table." You need both for thorough auditing.

CloudTrail configuration for data engineering: Management events are recorded by default; data events (S3 object-level access, Lambda invocations) must be explicitly enabled on a trail. Deliver trail logs to S3 with SSE-KMS encryption, and enable Object Lock on the destination bucket for tamper-resistant retention. A multi-region trail captures events across all regions.
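
As a minimal sketch of the trail settings described above, the helper below builds keyword arguments shaped for the CloudTrail CreateTrail API (for example, boto3's `create_trail`). The trail name, bucket name, and KMS key ARN are hypothetical placeholders, and the code only constructs the dictionary; it does not call AWS. Object Lock is a property of the S3 bucket, so it does not appear here.

```python
def build_trail_settings(name, bucket, kms_key_arn):
    """Return keyword arguments shaped for CloudTrail's CreateTrail API."""
    return {
        "Name": name,
        "S3BucketName": bucket,            # destination bucket for log files
        "KmsKeyId": kms_key_arn,           # SSE-KMS encryption of delivered logs
        "IsMultiRegionTrail": True,        # capture events from all regions
        "EnableLogFileValidation": True,   # produce signed digest files
        "IncludeGlobalServiceEvents": True,
    }


# Hypothetical names, for illustration only.
settings = build_trail_settings(
    name="data-platform-audit",
    bucket="example-audit-logs",
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/example-key-id",
)

# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("cloudtrail").create_trail(**settings)
print(settings["IsMultiRegionTrail"], settings["EnableLogFileValidation"])
```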

CloudWatch Logs for application auditing: Configure log retention (1 day to 10 years; by default, logs never expire), use metric filters to extract key values (error counts, processing volumes), and enable encryption with KMS. Export logs to S3 for long-term archival and cross-reference them with CloudTrail data. Subscription filters can stream log events in near real time to Lambda (for alerting), Kinesis Data Streams or Firehose (for aggregation), or OpenSearch (for interactive search).
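
Retention is set in days, and the API accepts only specific values rather than any number. The sketch below picks the smallest accepted value that satisfies a required retention period. The list of accepted values is reproduced from memory of the PutRetentionPolicy documentation and should be verified before use; the function name is hypothetical.

```python
# Retention values (in days) accepted by CloudWatch Logs PutRetentionPolicy.
# Assumption: this list matches the current API documentation; verify before use.
ALLOWED_RETENTION_DAYS = (
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
)


def pick_retention_days(required_days):
    """Return the smallest allowed retention that covers required_days."""
    for days in ALLOWED_RETENTION_DAYS:
        if days >= required_days:
            return days
    raise ValueError("required retention exceeds the 10-year maximum")


# With boto3, the result could be applied to a log group roughly like this:
#   boto3.client("logs").put_retention_policy(
#       logGroupName="/aws-glue/jobs/output",
#       retentionInDays=pick_retention_days(366),
#   )
print(pick_retention_days(366))
```

A requirement that falls between two accepted values is rounded up, so a 366-day requirement maps to the next accepted value rather than being truncated to 365.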

Combining CloudTrail and CloudWatch Logs provides the complete audit picture. When investigating a data incident, CloudTrail reveals which IAM principal triggered the pipeline (management plane), while CloudWatch Logs reveal what the pipeline actually processed (data plane). For example, CloudTrail shows "Role X started Glue Job Y at 3:15 AM" while CloudWatch Logs show "Glue Job Y read 5 million records from the customers table and wrote 4.9 million to the output; the remaining 100,000 records failed validation."
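
The investigation described above amounts to joining the two sources on a shared identifier. The sketch below does this in plain Python on hypothetical sample data. It assumes the CloudTrail record for StartJobRun carries the job run ID in `responseElements.jobRunId` and that the Glue log stream name begins with that run ID; both are assumptions to confirm against your own logs.

```python
def correlate(trail_record, log_events):
    """Join one CloudTrail record with the log events of the run it started."""
    run_id = trail_record["responseElements"]["jobRunId"]
    return {
        "who": trail_record["userIdentity"]["arn"],
        "what": trail_record["eventName"],
        "when": trail_record["eventTime"],
        "job_output": [
            event["message"]
            for event in log_events
            if event["logStreamName"].startswith(run_id)
        ],
    }


# Hypothetical sample data shaped like a CloudTrail record and log events.
trail_record = {
    "eventName": "StartJobRun",
    "eventTime": "2026-01-15T03:15:00Z",
    "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/RoleX/session"},
    "responseElements": {"jobRunId": "jr_0123abcd"},
}
log_events = [
    {"logStreamName": "jr_0123abcd", "message": "read 5000000 records from customers"},
    {"logStreamName": "jr_ffff9999", "message": "unrelated run"},
]

report = correlate(trail_record, log_events)
print(report["who"], report["job_output"])
```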

Audit log integrity: CloudTrail log file validation creates a digitally signed digest file every hour. Verifying the digest confirms that log files have not been modified or deleted after delivery, which is critical for compliance and legal proceedings.
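
To illustrate the hash-comparison part of this check, the sketch below compares a log file's SHA-256 hash against the `hashValue` recorded for it in a digest entry. It assumes the hash is computed over the uncompressed log content and uses made-up data; it does not verify the digest's signature or the chain between digests, which the `aws cloudtrail validate-logs` CLI command handles in full.

```python
import gzip
import hashlib


def log_matches_digest(log_gz_bytes, digest_entry):
    """Compare a gzipped log file's SHA-256 with the digest entry's hashValue.

    Assumption: the digest hash covers the uncompressed log content.
    """
    content = gzip.decompress(log_gz_bytes)
    return hashlib.sha256(content).hexdigest() == digest_entry["hashValue"]


# Hypothetical log content and digest entry, built locally for the demo.
original = b'{"Records": [{"eventName": "GetObject"}]}'
digest_entry = {
    "hashAlgorithm": "SHA-256",
    "hashValue": hashlib.sha256(original).hexdigest(),
}

intact = gzip.compress(original)
tampered = gzip.compress(original.replace(b"GetObject", b"PutObject"))

print(log_matches_digest(intact, digest_entry))    # unchanged file
print(log_matches_digest(tampered, digest_entry))  # modified file
```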

āš ļø Exam Trap: CloudTrail data events for S3 can generate enormous volumes of log data (every GetObject, PutObject call). Enable data events selectively — only for buckets containing sensitive data, not for all buckets. The cost and volume of logging everything can be significant.

Reflection Question: A compliance audit requires proving that no unauthorized user accessed the PII bucket in the last 12 months. Which AWS service provides this data, what event type must be enabled, and how do you prove the logs are tamper-proof?

Written by Alvin Varughese
Founder • 15 professional certifications