3.1.4. Automated Incident Remediation
First Principle: Human-speed response cannot match machine-speed attacks. Automated remediation executes predefined actions in seconds — containing threats before an engineer can even read the alert.
Automated Forensics Orchestrator for Amazon EC2 (new in C03):
- Pre-built solution for automated forensic capture of compromised EC2 instances
- Triggered by Security Hub findings or GuardDuty findings
- Automatically captures EBS snapshots, isolates the instance, and creates a forensic analysis environment
- Preserves chain of custody for forensic evidence
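The capture-then-isolate order in the list above can be sketched in a few lines. This is only an illustration of the idea, not the code of the AWS solution itself. The function name, the tag keys, and the `case_id` parameter are hypothetical, and `ec2` is assumed to be a boto3 EC2 client (or any object exposing the same three calls).

```python
from datetime import datetime, timezone


def capture_and_isolate(ec2, instance_id, quarantine_sg_id, case_id):
    """Snapshot every attached EBS volume, then swap in a quarantine security group.

    `ec2` is assumed to behave like boto3.client("ec2").
    Returns the list of snapshot IDs that were created.
    """
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]

    snapshot_ids = []
    for mapping in instance.get("BlockDeviceMappings", []):
        volume_id = mapping["Ebs"]["VolumeId"]
        snapshot = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"Forensic capture of {instance_id} for case {case_id}",
            # Hypothetical tag keys: they record who/what/when for chain of custody.
            TagSpecifications=[{
                "ResourceType": "snapshot",
                "Tags": [
                    {"Key": "CaseId", "Value": case_id},
                    {"Key": "SourceInstance", "Value": instance_id},
                    {"Key": "CapturedAt", "Value": datetime.now(timezone.utc).isoformat()},
                ],
            }],
        )
        snapshot_ids.append(snapshot["SnapshotId"])

    # Follow the order given in the list: evidence is requested first, then the
    # instance's security groups are replaced with a single quarantine group.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg_id])
    return snapshot_ids
```

Passing the client in as a parameter keeps the sketch testable without AWS credentials; in a real deployment the call would more likely be `capture_and_isolate(boto3.client("ec2"), ...)`.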
AWS Step Functions for complex remediation workflows:
- Orchestrate multi-step response procedures as state machines
- Handle branching logic: "if finding severity is HIGH, isolate immediately; if MEDIUM, alert team first"
- Built-in retry logic, error handling, and timeout management
- Visual workflow designer for non-programmers to understand the automation
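The branching rule and the retry, error-handling, and timeout features listed above map onto Amazon States Language fields. The sketch below builds one possible state machine definition as a Python dictionary. The state names, the `$.severity` input field, and the two Lambda ARNs are assumptions made for illustration, and routing MEDIUM findings only to the alert step is one reading of "alert team first".

```python
import json


def build_remediation_workflow(isolate_arn, alert_arn):
    """Return an Amazon States Language definition as a plain dict.

    isolate_arn and alert_arn are hypothetical Lambda function ARNs.
    """
    return {
        "Comment": "Severity-based remediation (illustrative sketch)",
        "StartAt": "CheckSeverity",
        "States": {
            # Branching logic: a Choice state inspects the input document.
            "CheckSeverity": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.severity", "StringEquals": "HIGH", "Next": "IsolateInstance"},
                    {"Variable": "$.severity", "StringEquals": "MEDIUM", "Next": "AlertTeam"},
                ],
                "Default": "NoAction",
            },
            # Retry, Catch, and TimeoutSeconds are declared on the state, not coded by hand.
            "IsolateInstance": {
                "Type": "Task",
                "Resource": isolate_arn,
                "TimeoutSeconds": 60,
                "Retry": [{
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }],
                "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "AlertTeam"}],
                "Next": "AlertTeam",
            },
            "AlertTeam": {"Type": "Task", "Resource": alert_arn, "End": True},
            "NoAction": {"Type": "Succeed"},
        },
    }


def dangling_targets(definition):
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        for rule in state.get("Choices", []) + state.get("Catch", []):
            targets.add(rule["Next"])
    return sorted(targets - set(states))


if __name__ == "__main__":
    workflow = build_remediation_workflow(
        "arn:aws:lambda:us-east-1:111122223333:function:isolate-instance",
        "arn:aws:lambda:us-east-1:111122223333:function:alert-team",
    )
    print(json.dumps(workflow, indent=2))
```

The JSON printed by the script is the kind of document that would be supplied as the state machine definition; the small `dangling_targets` helper is only a local sanity check, not a replacement for validation by the service.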
Lambda for targeted automated response:
Common patterns triggered by EventBridge:
- GuardDuty finding → Lambda → apply quarantine security group
- Config non-compliance → Lambda → enable missing encryption (for example, turning on EBS encryption by default for the account)
- IAM Access Analyzer finding → Lambda → attach deny policy to overly permissive role
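As an illustration of the first pattern above, the sketch below pairs an EventBridge event pattern with a Lambda handler. The severity threshold of 7, the `QUARANTINE_SG_ID` environment variable, and the optional `ec2` parameter (used here so the handler can be exercised without AWS access) are assumptions, not part of any AWS-provided solution.

```python
import os

# Event pattern for the EventBridge rule: GuardDuty findings with severity 7 or higher.
GUARDDUTY_HIGH_SEVERITY_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {"severity": [{"numeric": [">=", 7]}]},
}


def handler(event, context, ec2=None):
    """Lambda entry point: quarantine the EC2 instance named in a GuardDuty finding."""
    detail = event.get("detail", {})
    instance_id = (
        detail.get("resource", {}).get("instanceDetails", {}).get("instanceId")
    )
    if not instance_id:
        # Not every finding concerns an instance (for example, IAM or S3 findings).
        return {"action": "skipped", "reason": "finding has no EC2 instance"}

    if ec2 is None:
        import boto3  # imported lazily so the module loads without boto3 installed
        ec2 = boto3.client("ec2")

    # Hypothetical environment variable holding a security group with no
    # inbound or outbound rules.
    quarantine_sg = os.environ["QUARANTINE_SG_ID"]
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg])
    return {
        "action": "quarantined",
        "instanceId": instance_id,
        "findingType": detail.get("type"),
    }
```

The handler performs one action and returns, which is the scope Lambda suits best; anything that needs ordering, branching, or waiting belongs in a workflow around it.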
Amazon Application Recovery Controller (ARC) for large-scale incident recovery:
- Manage recovery across Regions and AZs during major incidents
- Readiness checks verify recovery resources are properly configured
- Routing controls shift traffic between healthy and impaired environments
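A routing-control shift like the one described above could look roughly as follows. This is a sketch under assumptions: the client is expected to behave like boto3's `route53-recovery-cluster` client, `make_client` is a hypothetical factory supplied by the caller, and `endpoints` is assumed to be the list of cluster endpoints (each with `Endpoint` and `Region` keys) obtained when the cluster was described.

```python
def shift_traffic(cluster_client, impaired_control_arn, healthy_control_arn):
    """Turn the healthy routing control On and the impaired one Off in one call."""
    cluster_client.update_routing_control_states(
        UpdateRoutingControlStateEntries=[
            {"RoutingControlArn": healthy_control_arn, "RoutingControlState": "On"},
            {"RoutingControlArn": impaired_control_arn, "RoutingControlState": "Off"},
        ]
    )


def shift_with_endpoint_failover(make_client, endpoints, impaired_arn, healthy_arn):
    """Try each cluster endpoint in turn until one accepts the update.

    make_client(endpoint) is a hypothetical factory, for example:
        lambda ep: boto3.client("route53-recovery-cluster",
                                region_name=ep["Region"],
                                endpoint_url=ep["Endpoint"])
    Returns the endpoint that succeeded.
    """
    errors = []
    for endpoint in endpoints:
        try:
            shift_traffic(make_client(endpoint), impaired_arn, healthy_arn)
            return endpoint
        except Exception as exc:  # broad on purpose: any endpoint failure means "try the next"
            errors.append(f"{endpoint.get('Endpoint')}: {exc}")
    raise RuntimeError("all cluster endpoints failed: " + "; ".join(errors))
```

Updating both controls in a single request avoids a window in which both environments, or neither, receive traffic. Looping over endpoints reflects the design of the routing-control data plane, which is spread across several Regions so that a shift can still be made while one Region is impaired.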
⚠️ Exam Trap: Step Functions orchestrate multi-step workflows. Lambda handles individual actions. If a question describes a complex response with branching logic and multiple sequential steps, Step Functions is the answer — not a single Lambda function with nested if-else logic.
Scenario: A GuardDuty finding indicates that compromised credentials are being used from an unusual location. A Step Functions workflow (1) revokes active sessions, (2) disables the access key, (3) queries CloudTrail for all actions taken with the credentials, (4) assesses the blast radius, and (5) notifies the security team with a complete report, all within 2 minutes of the finding.
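Steps (1) through (3) of the scenario above could be implemented by task functions along these lines. The sketch assumes the compromised key belongs to an IAM user and that temporary sessions were obtained through a known role; the function names, the policy name `RevokeOlderSessions`, and the 24-hour lookback window are hypothetical choices. `iam` and `cloudtrail` are assumed to behave like the corresponding boto3 clients.

```python
import json
from datetime import datetime, timedelta, timezone


def revoke_sessions_policy(cutoff):
    """Build a policy denying every action to sessions issued before `cutoff`."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "DateLessThan": {
                    "aws:TokenIssueTime": cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
                }
            },
        }],
    }


def contain_credentials(iam, cloudtrail, user_name, access_key_id, role_name, now=None):
    """Run steps (1)-(3): revoke sessions, disable the key, list observed API calls."""
    now = now or datetime.now(timezone.utc)

    # (1) Revoke active sessions: temporary credentials issued before `now` are denied.
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="RevokeOlderSessions",  # hypothetical policy name
        PolicyDocument=json.dumps(revoke_sessions_policy(now)),
    )

    # (2) Disable (not delete) the key, so it stays available as evidence.
    iam.update_access_key(UserName=user_name, AccessKeyId=access_key_id, Status="Inactive")

    # (3) Query CloudTrail for recent activity by this key. Only the first page of
    # results is read here; a full investigation would paginate.
    response = cloudtrail.lookup_events(
        LookupAttributes=[{"AttributeKey": "AccessKeyId", "AttributeValue": access_key_id}],
        StartTime=now - timedelta(days=1),
        EndTime=now,
    )
    return sorted({event["EventName"] for event in response.get("Events", [])})
```

The returned list of distinct event names is a starting point for step (4), the blast-radius assessment; in a Step Functions workflow each numbered step would more naturally be its own Task state so that retries and failures are tracked per step.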
Reflection Question: Why does the exam favor Step Functions over a single Lambda function for complex IR automation, and what advantages does the state machine model provide?