3.1.1. EC2 Auto Scaling: Policies and Lifecycle Hooks
💡 First Principle: Auto Scaling is not a reactive firefighting tool; it's a demand prediction and capacity management system. The goal is to have the right number of instances already running before load arrives, not to scramble to add capacity after users are already experiencing degradation.
An Auto Scaling Group (ASG) maintains a fleet of EC2 instances between a minimum and maximum count. It continuously monitors health and replaces unhealthy instances automatically, which alone provides resilience without any scaling configuration at all.
Launch Templates vs. Launch Configurations:
| Feature | Launch Template | Launch Configuration |
|---|---|---|
| Current/recommended | ✅ Yes | ❌ Legacy |
| Versioning | ✅ Yes | ❌ No |
| Multiple instance types | ✅ Yes | ❌ No |
| Spot + On-Demand mix | ✅ Yes | ❌ No |
| T2/T3 unlimited | ✅ Yes | ❌ No |
Always use Launch Templates for new ASGs. Launch Configurations are legacy and can't be updated: every change requires creating a new one.
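As a rough illustration of what a Launch Template enables, the sketch below builds the request parameters for an ASG that mixes instance types and purchase options. The helper name `build_asg_request`, the group and template names, the subnet IDs, and the default sizes are all assumptions made for this example; the dictionary keys follow the parameters of the EC2 Auto Scaling `CreateAutoScalingGroup` API.

```python
def build_asg_request(name, template_name, subnets, instance_types,
                      min_size=2, max_size=10,
                      on_demand_base=2, on_demand_pct_above_base=50):
    """Build CreateAutoScalingGroup parameters (hypothetical helper)."""
    if not 0 <= min_size <= max_size:
        raise ValueError("need 0 <= min_size <= max_size")
    return {
        "AutoScalingGroupName": name,
        "MinSize": min_size,
        "MaxSize": max_size,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnets),
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": template_name,
                    # Templates are versioned; "$Latest" tracks the newest version.
                    "Version": "$Latest",
                },
                # Several instance types: not possible with a Launch Configuration.
                "Overrides": [{"InstanceType": t} for t in instance_types],
            },
            "InstancesDistribution": {
                # First N instances are On-Demand; the rest are split by percentage.
                "OnDemandBaseCapacity": on_demand_base,
                "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
                "SpotAllocationStrategy": "price-capacity-optimized",
            },
        },
    }


request = build_asg_request(
    "web-asg", "web-template",
    ["subnet-aaa", "subnet-bbb"],      # placeholder subnet IDs
    ["m5.large", "m5a.large"],
)
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("autoscaling").create_auto_scaling_group(**request)`.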
Scaling Policies:
| Policy Type | How It Works | Best For |
|---|---|---|
| Target Tracking | Maintain a target metric value (e.g., keep CPU at 50%) | Most workloads; simplest to configure |
| Step Scaling | Add/remove instances in steps based on alarm breach magnitude | Workloads needing fine-grained control |
| Simple Scaling | One action per alarm breach; cooldown before next action | Legacy; target tracking is better |
| Scheduled Scaling | Pre-scale at defined times (e.g., before business hours) | Known traffic patterns |
| Predictive Scaling | ML forecasting based on historical patterns; launches capacity ahead of predicted load | Recurring patterns (daily peaks, weekly spikes) |
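To make the Step Scaling row concrete, here is a minimal sketch of how a breach magnitude maps to an adjustment. The step values, the 60% threshold, and the function name `pick_adjustment` are assumptions for illustration; the key names mirror the `StepAdjustments` structure of a step scaling policy, where bounds are offsets from the alarm threshold, the lower bound is inclusive, and the upper bound is exclusive.

```python
# Hypothetical scale-out steps, expressed as offsets from the alarm threshold.
STEPS = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 10.0, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 10.0, "MetricIntervalUpperBound": 20.0, "ScalingAdjustment": 2},
    {"MetricIntervalLowerBound": 20.0, "ScalingAdjustment": 4},  # no upper bound
]


def pick_adjustment(metric, threshold, steps):
    """Return the instance adjustment for the step that contains the breach."""
    delta = metric - threshold
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= delta < upper:
            return step["ScalingAdjustment"]
    return 0  # metric is below the threshold: no scale-out step applies


for cpu in (55, 65, 75, 95):
    print(cpu, "->", pick_adjustment(cpu, 60, STEPS))
```

A larger breach selects a larger step, which is the "fine-grained control" that Simple Scaling lacks.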
Target Tracking is the recommended default. You specify a target for a metric (e.g., ASGAverageCPUUtilization = 50%), and Auto Scaling continuously adjusts capacity to keep the metric near that target. It handles scale-out and scale-in automatically.
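The following sketch shows what such a policy might look like as request parameters, together with a deliberately simplified model of the capacity adjustment. The function names, the policy naming scheme, and the proportional formula are assumptions for illustration: the real service is driven by CloudWatch alarms and scales in more conservatively than it scales out. The dictionary keys follow the `PutScalingPolicy` API.

```python
import math


def target_tracking_policy(group_name, target_cpu=50.0, warmup_seconds=300):
    """Build PutScalingPolicy parameters for a CPU target (hypothetical helper)."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "EstimatedInstanceWarmup": warmup_seconds,
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,
            "DisableScaleIn": False,  # let the policy remove capacity too
        },
    }


def estimate_capacity(current, metric, target, min_size, max_size):
    """Simplified model: scale capacity in proportion to metric / target."""
    wanted = math.ceil(current * metric / target)
    # The group never leaves its configured [min, max] range.
    return max(min_size, min(max_size, wanted))


print(estimate_capacity(4, 80, 50, min_size=2, max_size=10))   # load above target
print(estimate_capacity(10, 25, 50, min_size=2, max_size=20))  # load below target
```

The policy dictionary could be applied with `boto3.client("autoscaling").put_scaling_policy(**target_tracking_policy("web-asg"))`, where `web-asg` is a placeholder group name.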
Cooldown Period: After a scaling action, Auto Scaling waits during the cooldown (default: 300 seconds for simple scaling) before evaluating whether another scaling action is needed. This prevents rapid oscillation. Target tracking and step scaling use warm-up times instead of cooldowns: new instances are given time to warm up before their metrics count toward the target.
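The difference between the two mechanisms can be sketched with a small simulation. Both functions are illustrative models, not the service's implementation: the first treats each scaling action as instantaneous (in practice the cooldown starts when the scaling activity completes), and the second simply leaves warming-up instances out of the average.

```python
def actions_allowed(breach_times, cooldown=300):
    """Simple scaling: ignore alarm breaches that arrive during the cooldown."""
    allowed = []
    last_action = None
    for t in sorted(breach_times):
        if last_action is None or t - last_action >= cooldown:
            allowed.append(t)
            last_action = t
    return allowed


def warmed_up_average(instances, now, warmup=300):
    """Warm-up: average CPU over instances launched at least `warmup` seconds ago.

    `instances` is a list of (launch_time_seconds, cpu_percent) pairs.
    """
    ready = [cpu for launched_at, cpu in instances if now - launched_at >= warmup]
    return sum(ready) / len(ready) if ready else None


# Breaches at these second offsets; only some trigger a scaling action.
print(actions_allowed([0, 60, 120, 300, 400, 650]))

# The instance launched at t=900 is still warming up at t=1000 and is excluded.
print(warmed_up_average([(0, 70.0), (0, 50.0), (900, 5.0)], now=1000))
```

Without the warm-up exclusion, the nearly idle new instance would drag the average down and could trigger a premature scale-in.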
Lifecycle Hooks intercept instance launches and terminations, giving you time to run custom actions:
Launch lifecycle hook use cases: install software, pull config from Parameter Store, register with a load balancer or service discovery, run configuration management.
Termination lifecycle hook use cases: drain in-progress work, deregister from service discovery, upload logs to S3, send final metrics.
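Each hook has a heartbeat timeout (default: 3600 seconds; configurable from 30 to 7200 seconds). By recording heartbeats, you can keep an instance in the wait state for at most 172800 seconds (48 hours) or 100 times the heartbeat timeout, whichever is shorter. If your custom action doesn't call complete-lifecycle-action before the timeout expires, the hook's default result (CONTINUE or ABANDON; ABANDON unless you configure otherwise) is applied.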
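A launch hook and its completion call might be sketched as follows. The helper names, the hook and group names, and the `bootstrap` callable are hypothetical; the dictionary keys and the `complete_lifecycle_action` keyword arguments follow the `PutLifecycleHook` and `CompleteLifecycleAction` APIs. The client is passed in as a parameter so the logic can be exercised without AWS access.

```python
def launch_hook_request(group, hook, heartbeat_timeout=3600, default_result="ABANDON"):
    """Build PutLifecycleHook parameters for instance launches (hypothetical helper)."""
    if default_result not in ("CONTINUE", "ABANDON"):
        raise ValueError("default_result must be CONTINUE or ABANDON")
    return {
        "AutoScalingGroupName": group,
        "LifecycleHookName": hook,
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
        "HeartbeatTimeout": heartbeat_timeout,
        "DefaultResult": default_result,  # applied if the timeout expires
    }


def finish_launch_hook(client, group, hook, instance_id, bootstrap):
    """Run the custom action, then report the outcome to Auto Scaling.

    `client` is expected to behave like boto3.client("autoscaling");
    `bootstrap` is any callable that raises on failure.
    """
    try:
        bootstrap()          # e.g. install software, pull config
        result = "CONTINUE"  # instance proceeds into service
    except Exception:
        result = "ABANDON"   # instance is terminated and replaced
    client.complete_lifecycle_action(
        AutoScalingGroupName=group,
        LifecycleHookName=hook,
        InstanceId=instance_id,
        LifecycleActionResult=result,
    )
    return result
```

Calling `complete_lifecycle_action` explicitly ends the wait as soon as the work is done, rather than leaving the instance pending until the heartbeat timeout expires.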
Warm Pools (for applications with long initialization times): Pre-initialize a pool of stopped instances so they're ready to enter service quickly. Instead of waiting 10 minutes for a new instance to boot, install software, and warm up, the group starts an instance that has already completed those steps, so it can enter service in a fraction of that time.
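A warm pool configuration could look roughly like the sketch below. The helper name and the sizes are assumptions for illustration; the keys follow the `PutWarmPool` API, which accepts `Stopped`, `Running`, or `Hibernated` as the pool state.

```python
VALID_POOL_STATES = ("Stopped", "Running", "Hibernated")


def warm_pool_request(group, min_size=2, pool_state="Stopped", reuse_on_scale_in=True):
    """Build PutWarmPool parameters (hypothetical helper)."""
    if pool_state not in VALID_POOL_STATES:
        raise ValueError(f"pool_state must be one of {VALID_POOL_STATES}")
    return {
        "AutoScalingGroupName": group,
        "MinSize": min_size,      # instances kept pre-initialized
        "PoolState": pool_state,  # stopped instances incur no compute charge
        # Return instances to the pool on scale-in instead of terminating them.
        "InstanceReusePolicy": {"ReuseOnScaleIn": reuse_on_scale_in},
    }


request = warm_pool_request("web-asg")  # "web-asg" is a placeholder name
```

With boto3 available, the dictionary could be passed as `boto3.client("autoscaling").put_warm_pool(**request)`.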
⚠️ Exam Trap: Auto Scaling health checks default to EC2 status checks only. If an instance is running but your application is unhealthy (e.g., the web server crashed), EC2 status checks still pass and the instance stays in the ASG. To detect application-level failures, enable ELB health checks on the ASG; they are evaluated in addition to the EC2 status checks. ELB health checks verify that the instance is responding correctly to application requests.
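The trap can be sketched as a small decision function plus the parameters that switch the health check type. The function names and the 480-second grace period are assumptions for illustration; `HealthCheckType` and `HealthCheckGracePeriod` are parameters of the `UpdateAutoScalingGroup` API.

```python
def enable_elb_health_checks(group, grace_period_seconds=480):
    """Build UpdateAutoScalingGroup parameters (hypothetical helper)."""
    return {
        "AutoScalingGroupName": group,
        "HealthCheckType": "ELB",
        # Time after launch during which failed checks are ignored.
        "HealthCheckGracePeriod": grace_period_seconds,
    }


def asg_considers_healthy(ec2_status_ok, elb_check_ok, health_check_type="EC2"):
    """Model of which signals the ASG looks at for each health check type."""
    if health_check_type == "EC2":
        return ec2_status_ok                # application state is invisible
    if health_check_type == "ELB":
        return ec2_status_ok and elb_check_ok
    raise ValueError("health_check_type must be 'EC2' or 'ELB'")


# Web server crashed: the instance is running, but the app fails ELB checks.
print(asg_considers_healthy(True, False, "EC2"))  # instance stays in the ASG
print(asg_considers_healthy(True, False, "ELB"))  # instance gets replaced
```

The grace period matters for slow-starting applications: if it is shorter than the initialization time, a booting instance can be marked unhealthy and replaced before it ever serves traffic.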
Reflection Question: Your application takes 8 minutes to fully initialize after an EC2 instance launches (JVM warm-up, cache pre-loading). Target tracking scaling is set for CPU ≥ 60%. What two ASG features do you configure to ensure new instances don't receive traffic before they're ready and that pre-initialized instances are available quickly?