Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.
3.1.2.1. Appropriate Metrics for Scaling Services
Scaling on the wrong metric wastes money (premature scale-out) or causes outages (delayed scale-out). The best metric directly reflects user experience.
Metric selection by workload type:
| Workload | Best Scaling Metric | Why |
|---|---|---|
| Web servers | ALBRequestCountPerTarget | Directly measures user load |
| API servers | Target response time (p95 latency) | Detects degradation before failures |
| Queue processors | ApproximateNumberOfMessagesVisible (SQS) | Matches processing capacity to queue depth |
| Batch processing | CPU utilization | Compute-bound workloads |
| In-memory cache | Memory utilization | Prevents evictions |
| GPU workloads | GPU utilization (custom metric) | Matches GPU demand |
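For queue processors, raw queue depth says little until it is related to the number of instances draining the queue. The following is a minimal sketch, not a definitive implementation. It assumes a hypothetical `desired_instances` helper and a target backlog per instance that you derive yourself (for example, acceptable latency divided by average per-message processing time).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not an AWS API): how many instances are needed so that
# each instance's share of the queue stays at or below a target backlog.
desired_instances() {
  local visible_messages=$1     # e.g., ApproximateNumberOfMessagesVisible
  local target_backlog=$2       # acceptable messages per instance (assumption)
  local min_instances=${3:-1}   # never scale below this floor

  # Ceiling division using integer arithmetic
  local needed=$(( (visible_messages + target_backlog - 1) / target_backlog ))
  if (( needed < min_instances )); then
    needed=$min_instances
  fi
  echo "$needed"
}

# 1500 visible messages, 100 messages per instance is acceptable
desired_instances 1500 100   # prints 15
```

Scaling on backlog per instance rather than on total queue depth keeps the target meaningful as the fleet grows or shrinks.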
Target tracking vs. step scaling vs. scheduled scaling:
- Target tracking: Set a target value (e.g., CPU = 50%). The Auto Scaling group (ASG) adds or removes capacity automatically to keep the metric near the target. Simplest and recommended for most cases.
- Step scaling: Define thresholds with specific capacity adjustments (e.g., CPU > 70% → add 2, CPU > 90% → add 4). More control, more complexity.
- Scheduled scaling: Set capacity based on known traffic patterns (e.g., scale up at 8 AM, down at 8 PM). Use alongside dynamic scaling.
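The difference between step scaling and target tracking can be sketched with two small functions. Both are illustrative and the function names are hypothetical. `step_adjustment` encodes the example thresholds from the list above, and `target_tracking_capacity` uses a simple proportional formula as an approximation for intuition, not the service's documented algorithm.

```bash
#!/usr/bin/env bash
# Step scaling: fixed adjustments per threshold band.
# Expects CPU as an integer percentage.
step_adjustment() {
  local cpu=$1
  if (( cpu > 90 )); then
    echo 4          # CPU > 90% -> add 4 instances
  elif (( cpu > 70 )); then
    echo 2          # CPU > 70% -> add 2 instances
  else
    echo 0          # below all thresholds -> no change
  fi
}

# Target tracking (approximation): scale capacity in proportion to how far
# the metric is from the target, rounding up.
target_tracking_capacity() {
  local current=$1 metric=$2 target=$3
  echo $(( (current * metric + target - 1) / target ))
}

step_adjustment 95                 # prints 4
target_tracking_capacity 4 75 50   # prints 6
```

With step scaling you choose every band and adjustment yourself. With target tracking you choose only the target and let the service compute the adjustment.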
Custom metrics published to CloudWatch enable scaling on application-specific indicators:
```bash
# Publish custom metric: active WebSocket connections
aws cloudwatch put-metric-data \
  --namespace "MyApp" \
  --metric-name "ActiveConnections" \
  --value 1250 \
  --dimensions InstanceId=i-1234567890abcdef0
```
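Once a custom metric exists, a target tracking policy can reference it through a customized metric specification. The sketch below generates that configuration. It assumes the metric is also published with an `AutoScalingGroupName` dimension so it can be averaged across the group (the per-instance `InstanceId` dimension above would not match). The helper name, the group name `my-asg`, the policy name, and the target of 1000 connections are all placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emit a target tracking configuration for a custom metric.
# Assumes the metric is published with an AutoScalingGroupName dimension.
target_tracking_config() {
  local namespace=$1 metric=$2 asg=$3 target=$4
  cat <<EOF
{
  "TargetValue": ${target},
  "CustomizedMetricSpecification": {
    "MetricName": "${metric}",
    "Namespace": "${namespace}",
    "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "${asg}"}],
    "Statistic": "Average"
  }
}
EOF
}

# Usage (requires AWS credentials; all names are placeholders):
# aws autoscaling put-scaling-policy \
#   --auto-scaling-group-name my-asg \
#   --policy-name connections-target-tracking \
#   --policy-type TargetTrackingScaling \
#   --target-tracking-configuration \
#     "$(target_tracking_config MyApp ActiveConnections my-asg 1000)"
```

Generating the JSON in a function keeps the namespace and metric name in one place, so they stay consistent with what the application publishes.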
Exam Trap: CPU utilization is a poor scaling metric for I/O-bound workloads (such as APIs waiting on database queries). CPU stays low while requests queue up and latency spikes. Use ALBRequestCountPerTarget or a custom latency metric instead. If the exam describes "low CPU but high latency," the answer is to change the scaling metric, not to adjust the CPU threshold.
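A custom latency metric needs a percentile computed somewhere. As a rough sketch, the hypothetical `p95` helper below applies the nearest-rank method to latency samples in milliseconds read from standard input. The metric name `P95LatencyMs` and the log file in the usage comment are assumptions for illustration only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: nearest-rank 95th percentile of numeric samples
# (one per line on stdin). Exits non-zero if there is no input.
p95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 95 + 99) / 100)   # ceil(0.95 * NR)
      print v[rank]
    }'
}

# Usage (requires AWS credentials; names and file are placeholders):
# aws cloudwatch put-metric-data \
#   --namespace "MyApp" \
#   --metric-name "P95LatencyMs" \
#   --unit Milliseconds \
#   --value "$(p95 < latencies.txt)"
```

A percentile is used rather than an average because a few slow requests barely move the mean but are exactly what users notice.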

Written by Alvin Varughese • Founder • 15 professional certifications