3.2.1. Elastic Load Balancing: ALB, NLB, and Target Groups
💡 First Principle: A load balancer does two things: it distributes traffic and it detects failures. The distribution part is obvious: spread requests across multiple instances so no single instance is overwhelmed. The failure detection part is equally critical: the load balancer continuously health-checks its targets and automatically stops sending traffic to unhealthy ones, without human intervention.
Load Balancer Types:
| Feature | ALB (Application) | NLB (Network) | GWLB (Gateway) |
|---|---|---|---|
| Layer | 7 (HTTP/HTTPS) | 4 (TCP/UDP/TLS) | 3 (IP packets) |
| Routing | Path, host, header, HTTP method, query string, source IP | Port only | N/A |
| Use case | Web apps, microservices, gRPC | Ultra-low latency, TCP, static IP | Third-party security appliances |
| WebSocket | ✅ | ✅ | N/A |
| Static IP | ❌ (DNS only) | ✅ (per AZ) | ❌ |
| Latency | ~ms | ~μs | N/A |
ALB Routing Rules: An ALB listener evaluates its rules in priority order and applies the first rule whose conditions all match. Rules can use the following condition types:
| Condition Type | Example |
|---|---|
| Path | /api/* → API target group; /static/* → static-content target group |
| Host header | api.example.com → API group; app.example.com → App group |
| HTTP method | GET → read replicas; POST → primary |
| Query string | ?version=v2 → new target group |
| Source IP | Internal IPs â admin group |
| HTTP header | Custom headers for A/B testing |
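
The priority-ordered evaluation described above can be sketched in a few lines of Python. This is an illustrative model only, not the ALB API: the `Rule` class, the `route` function, and the target group names are hypothetical, and `fnmatchcase` stands in for ALB's wildcard matching (it also accepts `[...]` character classes, which is a simplification).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical model of one listener rule (not an AWS data type)."""
    priority: int
    target_group: str
    path_pattern: Optional[str] = None
    host: Optional[str] = None
    method: Optional[str] = None

    def matches(self, host: str, path: str, method: str) -> bool:
        # Every condition that is set must match (conditions are ANDed).
        if self.path_pattern is not None and not fnmatchcase(path, self.path_pattern):
            return False
        if self.host is not None and self.host != host.lower():
            return False
        if self.method is not None and self.method != method.upper():
            return False
        return True


def route(rules: List[Rule], host: str, path: str, method: str,
          default: str = "default-tg") -> str:
    """Return the target group of the lowest-priority-number matching rule."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, path, method):
            return rule.target_group
    # No rule matched: fall through to the listener's default action.
    return default


RULES = [
    Rule(priority=10, target_group="api-tg", path_pattern="/api/*"),
    Rule(priority=20, target_group="app-tg", host="app.example.com"),
]

print(route(RULES, "app.example.com", "/api/users", "GET"))
print(route(RULES, "app.example.com", "/home", "GET"))
print(route(RULES, "other.example.com", "/home", "GET"))
```

The first request matches both rules, but the path rule wins because it has the lower priority number. The last request matches nothing and falls through to the default.
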
Target Groups: A target group is the set of registered targets that receive traffic from a rule. Targets can be:
- EC2 instances (by instance ID)
- IP addresses (on-premises servers, containers with specific IPs)
- Lambda functions (ALB can invoke Lambda directly)
- ALB (an NLB forwarding to an ALB, for hybrid architectures)
Health Checks: Each target group has independent health check configuration:
- Protocol, port, path: Where to check (e.g., HTTP on port 8080 at /health)
- Healthy threshold: Consecutive successes before marking healthy (default: 5)
- Unhealthy threshold: Consecutive failures before marking unhealthy (default: 2)
- Interval: Time between checks (default: 30s)
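
The threshold logic above behaves like a small state machine: a target changes state only after an unbroken run of results that contradict its current state. The sketch below models that idea in Python. The `HealthTracker` class is hypothetical, and the assumption that a newly registered target starts out not yet healthy is a simplification made for the example.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Hypothetical model of per-target health state (not an AWS API)."""
    healthy_threshold: int = 5    # default from the list above
    unhealthy_threshold: int = 2  # default from the list above
    healthy: bool = False         # assumption: new targets start not yet healthy
    _streak: int = 0              # consecutive results that contradict the state

    def record(self, check_passed: bool) -> bool:
        """Feed in one health check result and return the resulting state."""
        if check_passed == self.healthy:
            # The result agrees with the current state, so any streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = check_passed
            self._streak = 0
        return self.healthy


tracker = HealthTracker()
print([tracker.record(True) for _ in range(5)])   # five passes in a row
print([tracker.record(False) for _ in range(2)])  # two failures in a row
```

With the default 30-second interval, this model implies roughly 60 seconds of consecutive failures before a target is pulled from rotation and roughly 150 seconds of consecutive successes before it returns, ignoring check timeouts.
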
Cross-Zone Load Balancing: When enabled, each load balancer node distributes traffic equally across all registered targets in all AZs. When disabled, each node distributes traffic only among targets in its AZ.
| Setting | ALB | NLB |
|---|---|---|
| Cross-zone enabled by default | ✅ | ❌ (disabled by default; enabling it incurs inter-AZ data transfer charges) |
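
To see why the setting matters when AZs hold different numbers of targets, the following sketch computes the share of total traffic that a single target receives in each AZ. It is a simplified model under stated assumptions: one load balancer node per AZ, DNS splitting client traffic evenly across nodes, and at least one target in every AZ. The function name `per_target_share` and the AZ labels are hypothetical.

```python
from typing import Dict


def per_target_share(targets_per_az: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Fraction of total traffic received by ONE target in each AZ."""
    if not targets_per_az or any(n <= 0 for n in targets_per_az.values()):
        raise ValueError("every AZ must have at least one target in this model")

    total_targets = sum(targets_per_az.values())
    # Assumption: DNS sends an equal share of traffic to each AZ's node.
    node_share = 1.0 / len(targets_per_az)

    shares = {}
    for az, count in targets_per_az.items():
        if cross_zone:
            # Every node spreads its traffic over all targets in all AZs.
            shares[az] = 1.0 / total_targets
        else:
            # Each node spreads its traffic only over targets in its own AZ.
            shares[az] = node_share / count
    return shares


layout = {"az-a": 2, "az-b": 8}
print(per_target_share(layout, cross_zone=True))
print(per_target_share(layout, cross_zone=False))
```

In this layout, disabling cross-zone load balancing means each of the two targets in `az-a` handles four times the traffic of a target in `az-b`, while enabling it gives every target an equal tenth.
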
Connection Draining (Deregistration Delay): When a target is deregistered (scale-in, deployment), the load balancer stops sending new requests to it but allows in-flight requests to complete. The deregistration delay (default: 300 seconds) gives in-flight requests time to finish before the instance is terminated.
Sticky Sessions: The load balancer uses a cookie to bind a user's session to a specific target. Subsequent requests from that user go to the same instance. Used for stateful applications that store session data locally. Best practice is to avoid this: use ElastiCache for session storage instead and make instances stateless.
⚠️ Exam Trap: NLB preserves the source IP address of the client, so the target sees the actual client IP. ALB does not: the target sees the ALB's IP. The original client IP is available in the X-Forwarded-For HTTP header. This matters for security groups: NLB targets must allow traffic from client IP ranges; ALB targets only need to allow traffic from the ALB's security group.
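
As an illustration of how an application behind an ALB might recover the client IP, here is a minimal Python sketch. It assumes the ALB is the only proxy in front of the target, that it appends the connecting client's address as the last entry of X-Forwarded-For, and that entries are bare IP addresses without ports. The function name `client_ip_from_xff` is hypothetical.

```python
import ipaddress
from typing import Mapping, Optional


def client_ip_from_xff(headers: Mapping[str, str]) -> Optional[str]:
    """Return the client IP recorded in X-Forwarded-For, or None if unusable."""
    value = None
    for name, header_value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-forwarded-for":
            value = header_value
            break
    if not value:
        return None

    # Earlier entries may have been supplied by the client and can be spoofed.
    # Under the stated assumption, only the last entry was added by the ALB.
    last_entry = value.split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(last_entry))
    except ValueError:
        return None


print(client_ip_from_xff({"X-Forwarded-For": "10.0.0.1, 203.0.113.7"}))
```

If more trusted proxies sit between the client and the target, the correct entry is further from the end of the list, so the index would need to match the actual proxy chain.
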
Reflection Question: A financial services application requires a static IP address for whitelisting by enterprise clients. The application is HTTP-based and needs path-based routing. Which load balancer combination do you use, and why?