2.3. Troubleshooting Monitoring and Logging
Imagine investigating a breach and discovering that your CloudTrail logs stopped delivering three weeks ago, silently and with no alarm. A trail that has stopped delivering, or a misconfigured CloudWatch agent that has stopped sending logs, looks identical to a quiet environment until you need those logs during an incident and discover they were never recorded. Think of logging troubleshooting like checking that your smoke detectors have working batteries: the check itself takes 5 minutes, but discovering that your detector was dead during a fire is catastrophic. The exam specifically tests troubleshooting because AWS environments are complex and misconfigurations are inevitable; what matters is your ability to identify and fix gaps.
This section covers the most common monitoring and logging failures, how to diagnose them, and how to prevent recurrence.
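As one illustration of diagnosing a silent delivery gap, the sketch below inspects a dictionary shaped like the response of CloudTrail's GetTrailStatus API and reports anything suspicious. The function name `trail_delivery_problems` and the 24-hour staleness threshold are assumptions made for this example, not AWS recommendations; pick a threshold that matches how busy your account is.

```python
from datetime import datetime, timedelta, timezone


def trail_delivery_problems(status, now, max_age=timedelta(hours=24)):
    """Return a list of problems found in a GetTrailStatus-style dict.

    `status` is expected to carry the IsLogging, LatestDeliveryError and
    LatestDeliveryTime fields. `max_age` is an illustrative threshold
    (assumption), not an AWS default.
    """
    problems = []

    # A trail can exist but have logging switched off.
    if not status.get("IsLogging", False):
        problems.append("trail is not logging")

    # A non-empty error here usually points at the S3 bucket or its policy.
    error = status.get("LatestDeliveryError")
    if error:
        problems.append(f"last delivery error: {error}")

    # No timestamp means nothing was ever delivered; an old one means a gap.
    last_delivery = status.get("LatestDeliveryTime")
    if last_delivery is None:
        problems.append("no successful delivery recorded")
    elif now - last_delivery > max_age:
        problems.append(f"last delivery is older than {max_age}")

    return problems


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = {"IsLogging": True, "LatestDeliveryTime": now - timedelta(weeks=3)}
    for problem in trail_delivery_problems(sample, now):
        print(problem)
```

In practice the dictionary would come from a call such as `boto3.client("cloudtrail").get_trail_status(Name=...)`, and a check like this could run on a schedule so that a three-week gap is noticed within a day rather than during an investigation.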
Scenario: During an incident investigation, you discover that Lambda function logs are missing from CloudWatch. The functions have been running for weeks without errors, but there are no log entries. What's the most likely cause?
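One cause worth ruling out early in a scenario like this is an execution role that lacks CloudWatch Logs permissions: a function can run successfully while being unable to write any log output. The sketch below checks an IAM policy document (as a Python dictionary) for the three actions a Lambda function needs to write logs. The helper name `missing_log_actions` is hypothetical, and the check is deliberately simplified: it ignores Deny statements, NotAction, Resource scoping, and Condition blocks, so treat it as a first-pass filter rather than a full policy evaluation.

```python
from fnmatch import fnmatchcase

# Actions a Lambda execution role needs in order to write to CloudWatch Logs.
REQUIRED_LOG_ACTIONS = (
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
)


def missing_log_actions(policy_document):
    """Return the required logging actions not allowed by the policy.

    Simplified on purpose (assumption): only Allow statements and their
    Action entries are considered.
    """
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be a list
        statements = [statements]

    allowed_patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lowercase.
        allowed_patterns.extend(action.lower() for action in actions)

    return [
        required
        for required in REQUIRED_LOG_ACTIONS
        if not any(fnmatchcase(required.lower(), p) for p in allowed_patterns)
    ]


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "logs:CreateLogGroup", "Resource": "*"}
        ],
    }
    print(missing_log_actions(policy))
```

An empty result does not prove logging works, since a Deny elsewhere or a narrowly scoped Resource can still block writes, but a non-empty result is a strong hint about where to look first.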
Reflection Question: Why does the exam dedicate a separate task to troubleshooting detection systems, and what does this tell you about real-world AWS security operations?