Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

3.2.1. Elastic Load Balancing: ALB, NLB, and Target Groups

💡 First Principle: A load balancer does two things: distributes traffic and detects failures. The distribution part is obvious: spread requests across multiple instances so no single instance is overwhelmed. The failure detection part is equally critical: the load balancer continuously health-checks its targets and stops sending traffic to unhealthy ones, automatically, without human intervention.

Load Balancer Types:
| Feature | ALB (Application) | NLB (Network) | GWLB (Gateway) |
| --- | --- | --- | --- |
| Layer | 7 (HTTP/HTTPS) | 4 (TCP/UDP/TLS) | 3 (IP packets) |
| Routing | Path, host, header, query string, source IP | Port only | - |
| Use case | Web apps, microservices, gRPC | Ultra-low latency, TCP, static IP | Third-party security appliances |
| WebSocket | ✅ | ✅ | - |
| Static IP | ❌ (DNS only) | ✅ (per AZ) | - |
| Latency | ~ms | ~μs | - |

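ALB Routing Rules: An ALB listener evaluates its rules in priority order and forwards each request to the target group of the first rule whose conditions all match. A rule can combine several condition types: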

| Condition Type | Example |
| --- | --- |
| Path | /api/* → API target group; /static/* → static-content target group |
| Host header | api.example.com → API group; app.example.com → App group |
| HTTP method | GET → read-only target group; POST → primary (write) target group |
| Query string | ?version=v2 → new target group |
| Source IP | Internal IPs → admin group |
| HTTP header | Custom headers for A/B testing |
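
The priority-ordered matching described above can be sketched in a few lines of Python. This is a simplified model, not the ALB implementation: the `Rule` class, the `route` function, and the target group names are hypothetical, and only path and host conditions are modeled, using shell-style wildcards.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Rule:
    priority: int        # lower number is evaluated first
    target_group: str    # hypothetical target group name
    path: str = "*"      # path pattern, e.g. "/api/*"
    host: str = "*"      # host-header pattern, e.g. "api.example.com"


def route(rules, host, path, default="default-group"):
    """Return the target group of the first rule whose conditions all match."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if fnmatchcase(host, rule.host) and fnmatchcase(path, rule.path):
            return rule.target_group
    return default  # stands in for the listener's default action


rules = [
    Rule(priority=20, target_group="app-group", host="app.example.com"),
    Rule(priority=10, target_group="api-group", path="/api/*"),
]

print(route(rules, "app.example.com", "/api/orders"))  # api-group
print(route(rules, "app.example.com", "/home"))        # app-group
print(route(rules, "other.example.com", "/home"))      # default-group
```

The first call matches both rules, but the path rule wins because it has the lower priority number. Rule order, not rule specificity, decides the outcome.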

Target Groups: A target group is the set of registered targets that receive traffic from a rule. Targets can be:

  • EC2 instances (by instance ID)
  • IP addresses (on-premises servers, containers with specific IPs)
  • Lambda functions (ALB can invoke Lambda directly)
  • ALB (registered as a target of an NLB, which combines the NLB's static IPs with the ALB's Layer 7 routing)

Health Checks: Each target group has independent health check configuration:

  • Protocol, port, path: Where to check (e.g., HTTP on port 8080 at /health)
  • Healthy threshold: Consecutive successes before marking healthy (default: 5)
  • Unhealthy threshold: Consecutive failures before marking unhealthy (default: 2)
  • Interval: Time between checks (default: 30s)
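
To see how the two thresholds interact, here is a minimal sketch of the state tracking for a single target. The `TargetHealth` class is hypothetical and ignores timeouts and the initial registration state. It only illustrates that a state change requires an unbroken run of contradicting results.

```python
class TargetHealth:
    """Tracks one target's state from a stream of health check results."""

    def __init__(self, healthy_threshold=5, unhealthy_threshold=2, healthy=False):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed):
        """Apply one health check result and return the resulting state."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with the current state
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = check_passed
            self._streak = 0
        return self.healthy


target = TargetHealth(healthy=True)
results = [False, False, True, True, True, True, True]
print([target.record(r) for r in results])
# [True, False, False, False, False, False, True]
```

With the defaults listed above and a 30-second interval, a failing target is taken out of rotation after roughly 2 × 30 = 60 seconds, while a recovered target needs roughly 5 × 30 = 150 seconds of passing checks before it receives traffic again.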

Cross-Zone Load Balancing: When enabled, each load balancer node distributes traffic equally across all registered targets in all AZs. When disabled, each node distributes traffic only among targets in its AZ.

| Setting | ALB | NLB |
| --- | --- | --- |
| Cross-zone enabled by default | ✅ | ❌ (disabled by default; enabling it incurs inter-AZ data transfer charges) |
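
The effect of cross-zone load balancing is easiest to see with uneven AZs. The sketch below is a worked calculation, not an AWS API. The function name and AZ labels are hypothetical, and it assumes DNS splits client traffic evenly across one load balancer node per AZ and that every AZ has at least one target.

```python
def per_target_share(targets_per_az, cross_zone):
    """Fraction of total traffic that each target in each AZ receives."""
    node_share = 1 / len(targets_per_az)           # each node gets an equal slice
    total_targets = sum(targets_per_az.values())
    shares = {}
    for az, count in targets_per_az.items():
        if cross_zone:
            shares[az] = 1 / total_targets         # spread over every target
        else:
            shares[az] = node_share / count        # spread over local targets only
    return shares


fleet = {"az-a": 2, "az-b": 8}
print(per_target_share(fleet, cross_zone=True))   # {'az-a': 0.1, 'az-b': 0.1}
print(per_target_share(fleet, cross_zone=False))  # {'az-a': 0.25, 'az-b': 0.0625}
```

With cross-zone disabled, each of the two targets in az-a handles 25% of all traffic, four times the load of a target in az-b. This is why unevenly sized AZs and disabled cross-zone balancing are a risky combination.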

Connection Draining (Deregistration Delay): When a target is deregistered (scale-in, deployment), the load balancer stops sending new requests to it but allows in-flight requests to complete. The deregistration delay (default: 300 seconds) gives in-flight requests time to finish before the instance is terminated.
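
As a rough illustration of the draining behavior, the sketch below models one target with explicit timestamps in seconds. The `DrainingTarget` class and its method names are hypothetical. The model assumes a target can be terminated once its in-flight requests finish or once the delay has elapsed, whichever comes first.

```python
class DrainingTarget:
    """Simplified model of one target during deregistration."""

    def __init__(self, deregistration_delay=300):
        self.deregistration_delay = deregistration_delay
        self.draining_since = None  # timestamp when deregistration began
        self.in_flight = 0          # requests still being served

    def deregister(self, now):
        self.draining_since = now

    def accepts_new_requests(self):
        # A draining target keeps serving in-flight work but gets nothing new.
        return self.draining_since is None

    def safe_to_terminate(self, now):
        if self.draining_since is None:
            return False  # still registered and in service
        delay_elapsed = now - self.draining_since >= self.deregistration_delay
        return self.in_flight == 0 or delay_elapsed


target = DrainingTarget()
target.in_flight = 3
target.deregister(now=0)
print(target.accepts_new_requests())      # False
print(target.safe_to_terminate(now=60))   # False: requests still in flight
print(target.safe_to_terminate(now=300))  # True: delay has elapsed
```

If typical requests finish in a few seconds, lowering the delay from 300 seconds shortens deployments and scale-in. Long-lived connections argue for keeping it high.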

Sticky Sessions: The load balancer uses a cookie to bind a user's session to a specific target. Subsequent requests from that user go to the same instance. Used for stateful applications that store session data locally. Best practice is to avoid this — use ElastiCache for session storage instead and make instances stateless.

⚠ïļ Exam Trap: NLB preserves the source IP address of the client — the target sees the actual client IP. ALB does not — the target sees the ALB's IP. The original client IP is available in the X-Forwarded-For HTTP header. This matters for security groups: NLB targets must allow traffic from client IP ranges; ALB targets only need to allow traffic from the ALB's security group.

Reflection Question: A financial services application requires a static IP address for whitelisting by enterprise clients. The application is HTTP-based and needs path-based routing. Which load balancer combination do you use, and why?

Written by Alvin Varughese
Founder • 15 professional certifications