Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.
3.1.1.6. Identifying & Remediating Single Points of Failure
A single point of failure (SPOF) is any component whose failure takes down the entire system. Finding them requires tracing every request path and asking "what if this fails?"
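The "what if this fails?" question can be asked mechanically once the request path is written down as a graph. The sketch below is a minimal illustration, not a production tool: it removes one component at a time and checks whether a request can still get from the entry point to the backend. The topology and all component names (`load-balancer`, `auth-service`, and so on) are hypothetical, and each node is treated as a single unit of failure.

```python
from collections import deque


def reachable(graph, start, goal, removed=None):
    """Breadth-first search from start to goal, skipping the `removed` component."""
    if removed in (start, goal):
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def find_spofs(graph, entry, backend):
    """Return components whose individual failure cuts every path from entry to backend.

    The entry and backend nodes themselves are not evaluated.
    """
    components = set(graph) | {n for targets in graph.values() for n in targets}
    candidates = components - {entry, backend}
    return sorted(c for c in candidates if not reachable(graph, entry, backend, removed=c))


# Hypothetical topology: two web servers in different AZs share one auth service.
topology = {
    "load-balancer": ["web-az-a", "web-az-b"],
    "web-az-a": ["auth-service"],
    "web-az-b": ["auth-service"],
    "auth-service": ["database"],
}

print(find_spofs(topology, "load-balancer", "database"))  # ['auth-service']
```

Losing either web server leaves a working path, so neither is flagged. The shared `auth-service` is flagged because every path runs through it. Because the endpoints are excluded from the check, they need to be reviewed separately.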
Common SPOFs and remediations:
| SPOF | Remediation |
|---|---|
| Single EC2 instance (no ASG) | Place behind ASG with min=2 across AZs |
| Single-AZ RDS | Enable Multi-AZ deployment |
| Single NAT Gateway | Deploy one NAT GW per AZ |
| Hardcoded IP addresses | Use DNS names, ELB, or Elastic IPs with failover |
| Single region deployment | Add standby region with Route 53 failover |
| Application storing state locally | Move state to DynamoDB, ElastiCache, or EFS |
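Two rows of the table (the lone EC2 instance and Single-AZ RDS) translate directly into checks over a resource inventory. The following is a rough sketch under stated assumptions: the `inventory` dictionary, its field names, and the resource names are invented for illustration and do not correspond to any AWS API response format.

```python
def audit(inventory):
    """Return (resource_name, finding) pairs for two SPOF patterns from the table."""
    findings = []
    for asg in inventory.get("auto_scaling_groups", []):
        # A group with fewer than two instances, or confined to one AZ, is still a SPOF.
        if asg["min_size"] < 2 or len(set(asg["azs"])) < 2:
            findings.append((asg["name"], "ASG needs min>=2 across at least two AZs"))
    for db in inventory.get("rds_instances", []):
        if not db["multi_az"]:
            findings.append((db["name"], "Enable Multi-AZ deployment"))
    return findings


# Hypothetical inventory, hand-written for the example.
inventory = {
    "auto_scaling_groups": [
        {"name": "web-asg", "min_size": 1, "azs": ["us-east-1a", "us-east-1b"]},
        {"name": "api-asg", "min_size": 2, "azs": ["us-east-1a", "us-east-1b"]},
    ],
    "rds_instances": [{"name": "orders-db", "multi_az": False}],
}

for name, finding in audit(inventory):
    print(f"{name}: {finding}")
```

Here `web-asg` is flagged because its minimum size is 1 even though it spans two AZs, and `orders-db` is flagged for running in a single AZ. `api-asg` passes both checks.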
SPOF detection methods:
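- AWS Well-Architected Tool: Guided, question-based review of a workload against Reliability Pillar best practices. It flags risks based on your answers; it does not scan your resources.
- Chaos engineering: Inject failures (terminate instances, block network traffic) and observe how the system behaves. AWS Fault Injection Service (FIS, formerly Fault Injection Simulator) provides managed chaos experiments.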
- Architecture reviews: Trace every request through every component. If any single component's failure causes user impact, it's a SPOF.
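# AWS FIS: terminate one running EC2 instance in an ASG to test self-healing.
# Placeholders to replace: <account-id>, <fis-role-name>, <asg-name>. Source "none" sets no alarm-based stop condition.
aws fis create-experiment-template \
  --description "Test ASG self-healing" \
  --role-arn "arn:aws:iam::<account-id>:role/<fis-role-name>" \
  --stop-conditions '[{"source":"none"}]' \
  --targets '{"instances":{"resourceType":"aws:ec2:instance","resourceTags":{"aws:autoscaling:groupName":"<asg-name>"},"selectionMode":"COUNT(1)","filters":[{"path":"State.Name","values":["running"]}]}}' \
  --actions '{"terminateInstance":{"actionId":"aws:ec2:terminate-instances","targets":{"Instances":"instances"}}}'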
Exam Trap: A NAT Gateway created in a subnet is redundant only within that subnet's AZ; it does not span AZs. If you have private subnets in 3 AZs routing through a single NAT GW in AZ-A, and AZ-A fails, all three subnets lose outbound internet access. Deploy one NAT GW per AZ and point each private subnet's route table at the NAT GW in its own AZ.
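The trap comes down to a simple rule: a private subnet's default route should point at a NAT Gateway in the same AZ. The sketch below models that rule over plain dictionaries. The subnet, NAT, and AZ names are hypothetical, and the mappings are assumed to have been collected by hand or from your own tooling.

```python
def cross_az_nat_routes(subnet_az, nat_az, default_route):
    """List (subnet, subnet AZ, NAT, NAT AZ) where the default route crosses AZs."""
    return [
        (subnet, subnet_az[subnet], nat, nat_az[nat])
        for subnet, nat in sorted(default_route.items())
        if subnet_az[subnet] != nat_az[nat]
    ]


def subnets_losing_egress(failed_az, subnet_az, nat_az, default_route):
    """Subnets in healthy AZs that lose outbound access when failed_az goes down."""
    return sorted(
        subnet
        for subnet, nat in default_route.items()
        if nat_az[nat] == failed_az and subnet_az[subnet] != failed_az
    )


# Hypothetical layout matching the trap: three private subnets, one NAT GW in az-a.
subnet_az = {"private-a": "az-a", "private-b": "az-b", "private-c": "az-c"}
nat_az = {"nat-a": "az-a"}
default_route = {"private-a": "nat-a", "private-b": "nat-a", "private-c": "nat-a"}

print(cross_az_nat_routes(subnet_az, nat_az, default_route))
print(subnets_losing_egress("az-a", subnet_az, nat_az, default_route))  # ['private-b', 'private-c']
```

With one NAT GW per AZ and each subnet routed to its local gateway, both functions return empty lists: an AZ failure then affects only the subnets inside that AZ.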

Written by Alvin Varughese • Founder • 15 professional certifications