3.1.2.5. Identifying & Remediating Scaling Issues

Your auto-scaling is configured, but the application still crashes under load. Scaling issues fall into predictable categories.

Common scaling failures and fixes:

Symptom	Root Cause	Fix
New instances launch but can't serve traffic	Long bootstrap time (install dependencies at startup)	Bake AMI with pre-installed deps
Scale-out happens but too slowly	Scaling metric responds slowly to load	Use a leading indicator metric (queue depth, request count)
Instances scale out then immediately scale in	Thrashing — scale-in and scale-out thresholds too close	Add cooldown period (300s) or widen threshold gap
Hit max capacity but load continues growing	Max capacity too low	Increase ASG max, or redesign for horizontal scalability
Database becomes bottleneck at scale	Compute scales but data tier doesn't	Add read replicas, ElastiCache, or DynamoDB on-demand
API Gateway returns 429	Throttling limit reached	Request limit increase, implement caching, use usage plans
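
The "leading indicator" and "max capacity" rows can be made concrete with a little arithmetic. The following is a minimal sketch, not an AWS API: `desired_capacity` and its parameters are hypothetical names chosen for illustration. It sizes a group from queue depth (backlog per instance) and reports when the ASG max would cap the result.

```python
import math
from typing import Tuple


def desired_capacity(
    queue_depth: int,
    target_backlog_per_instance: int,
    min_size: int,
    max_size: int,
) -> Tuple[int, bool]:
    """Return (capacity, capped) for a queue-driven fleet.

    capacity: instances needed to keep the backlog per instance at or
              below the target, clamped to [min_size, max_size].
    capped:   True when the load needs more instances than max_size allows,
              i.e. the "hit max capacity" symptom from the table.
    """
    if target_backlog_per_instance <= 0:
        raise ValueError("target_backlog_per_instance must be positive")

    # Instances required if there were no min/max limits.
    needed = math.ceil(queue_depth / target_backlog_per_instance)

    # Clamp to the group's configured bounds.
    capacity = max(min_size, min(needed, max_size))
    return capacity, needed > max_size
```

For example, `desired_capacity(5000, 100, 2, 20)` returns `(20, True)`: the backlog calls for 50 instances, so the group is pinned at its maximum and the fix is a higher ASG max or a redesign.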

Scaling warmup and cooldown:

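Instance warmup: The time a new instance needs before it can serve traffic. During warmup, the ASG leaves the instance's metrics out of the group's aggregated metrics, so a new, still-idle instance cannot drag the average down and trigger a premature scale-in.
Cooldown period: After a scaling action, the ASG waits for this interval before starting another scaling action. This prevents thrashing.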
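
To make the two timers concrete, here is a simplified model. It is a sketch under stated assumptions, not how the service is implemented: `Instance`, `aggregate_cpu`, and `CooldownGate` are hypothetical names, and time is plain seconds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    launched_at: float  # launch time in seconds (illustrative clock)
    cpu: float          # current CPU utilization in percent


def aggregate_cpu(instances: List[Instance], now: float, warmup_s: float) -> Optional[float]:
    """Average CPU over instances that have finished warming up.

    Instances younger than warmup_s are skipped, mirroring how warmup
    keeps a fresh instance's metrics out of the group average.
    """
    warm = [i.cpu for i in instances if now - i.launched_at >= warmup_s]
    if not warm:
        return None  # nothing has finished warming up yet
    return sum(warm) / len(warm)


class CooldownGate:
    """Allow a scaling action only if the cooldown has elapsed."""

    def __init__(self, cooldown_s: float) -> None:
        self.cooldown_s = cooldown_s
        self.last_action_at: Optional[float] = None

    def try_scale(self, now: float) -> bool:
        if self.last_action_at is not None and now - self.last_action_at < self.cooldown_s:
            return False  # still cooling down: ignore this trigger
        self.last_action_at = now
        return True
```

With two busy instances at 80% and 60% and a third launched 10 seconds ago at 5%, a 120-second warmup yields an average of 70%. Without warmup the average drops to about 48%, which could look like a reason to scale in.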

Launch template optimization: Use launch templates (not launch configurations — deprecated) with:

Pre-baked AMIs (faster bootstrap)
Instance type diversity (Spot + On-Demand mix)
Placement groups for low-latency communication
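
As an illustration of the launch-template options above, this sketch builds the keyword arguments for boto3's `create_auto_scaling_group` with a mixed Spot and On-Demand policy. The helper name `build_asg_request`, the default values, and every resource name in the usage note are assumptions for illustration. The pre-baked AMI is set in the launch template, not here.

```python
from typing import Any, Dict, List


def build_asg_request(
    name: str,
    launch_template_name: str,
    instance_types: List[str],
    subnet_ids: List[str],
    min_size: int,
    max_size: int,
    on_demand_base: int = 1,
    on_demand_pct_above_base: int = 50,
    warmup_s: int = 120,
) -> Dict[str, Any]:
    """Build kwargs for autoscaling.create_auto_scaling_group (assumed defaults)."""
    return {
        "AutoScalingGroupName": name,
        "MinSize": min_size,
        "MaxSize": max_size,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "DefaultInstanceWarmup": warmup_s,
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": launch_template_name,
                    "Version": "$Latest",
                },
                # Instance type diversity: one override per acceptable type.
                "Overrides": [{"InstanceType": t} for t in instance_types],
            },
            "InstancesDistribution": {
                "OnDemandBaseCapacity": on_demand_base,
                "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
                "SpotAllocationStrategy": "price-capacity-optimized",
            },
        },
    }
```

A call might look like `boto3.client("autoscaling").create_auto_scaling_group(**build_asg_request("web-asg", "web-lt", ["m5.large", "m5a.large"], ["subnet-a", "subnet-b"], 2, 10))`, where all names are placeholders.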

Exam Trap: ASG target tracking scaling already handles cooldown intelligently: it blocks scale-in while a scale-out is still in progress. Simple scaling, by contrast, relies on a fixed cooldown timer (in EC2 Auto Scaling, step scaling uses instance warmup instead of a cooldown). If the exam describes "instances scale out and immediately scale back in" with a simple scaling policy, increase the cooldown period. With target tracking, this problem rarely occurs.
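
For reference, a target tracking policy can be expressed as the keyword arguments to boto3's `put_scaling_policy`. This is a minimal sketch: the helper name `target_tracking_policy`, the policy naming scheme, and the 50% CPU target are assumptions, not recommendations from the text above.

```python
from typing import Any, Dict


def target_tracking_policy(
    asg_name: str,
    target_cpu: float = 50.0,
    warmup_s: int = 120,
) -> Dict[str, Any]:
    """Build kwargs for autoscaling.put_scaling_policy (target tracking)."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        # Seconds before a new instance's metrics count toward the average.
        "EstimatedInstanceWarmup": warmup_s,
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
            "DisableScaleIn": False,  # let the policy scale in as well as out
        },
    }
```

It would be applied with `boto3.client("autoscaling").put_scaling_policy(**target_tracking_policy("web-asg"))`, where `web-asg` is a placeholder group name.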

Written by Alvin Varughese • Founder • 15 professional certifications