AWS-SAP-C02 & AWS CERTIFICATION | Designing for Relational Database Workloads (Scalability, HA, DR) - AWS Certified Solutions Architect

2.4.1.1. Designing for Relational Database Workloads (Scalability, HA, DR)

💡 First Principle: Relational databases must be architected to maintain transactional integrity ("ACID") while providing high availability, effective read scaling, and robust disaster recovery to support critical "OLTP" workloads.

Scenario: A critical online banking application uses "Amazon RDS for PostgreSQL". The application experiences frequent read spikes during business hours, and there's a strict requirement for high availability with minimal downtime in case of database instance failure. Additionally, a disaster recovery plan is needed in a separate "AWS Region" with minimal data loss.

Relational databases remain central for many applications, especially those requiring strong transactional consistency ("ACID" properties).

Scalability:
- Vertical Scaling: Increasing compute/memory (e.g., moving to a larger "RDS instance type"). Simple, but capped by the largest available instance class, and changing the class interrupts the instance briefly.
- Read Replicas ("RDS"/"Aurora"): Separate instances that serve read-heavy workloads and offload the primary. "RDS" replicas are kept up to date by asynchronous replication, so reads can be slightly stale. Replicas can sit in another "AZ" or another "Region"; a cross-"Region" replica also serves as a "DR" target.
- Aurora Scaling: "Aurora" storage grows automatically with the data, a cluster supports up to 15 read replicas, and "Aurora Serverless" adjusts compute capacity on demand.
- Sharding (Application-level): Partitioning data across multiple database instances, managed by the application. Complex to implement but offers extreme horizontal scaling.
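
The sharding bullet above can be made concrete with a minimal routing sketch. It assumes the application shards by customer ID and that each shard is a separate database instance; the endpoint names and the `customer_id` key are hypothetical and not part of the scenario.

```python
import hashlib

# Hypothetical shard endpoints; real hostnames would come from configuration.
SHARD_ENDPOINTS = [
    "bank-shard-0.example.internal",
    "bank-shard-1.example.internal",
    "bank-shard-2.example.internal",
    "bank-shard-3.example.internal",
]


def shard_index(customer_id: str, shard_count: int) -> int:
    """Map a shard key to a shard number using a stable hash.

    Python's built-in hash() is randomized per process for strings, so a
    SHA-256 digest is used to get the same answer on every host.
    """
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def endpoint_for(customer_id: str) -> str:
    """Return the endpoint of the shard that owns this customer's rows."""
    return SHARD_ENDPOINTS[shard_index(customer_id, len(SHARD_ENDPOINTS))]
```

Plain modulo placement keeps the sketch short, but changing the shard count remaps most keys. That resharding cost is part of what makes application-level sharding complex to operate.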
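
High Availability ("HA"):
- "Multi-AZ Deployment (RDS/Aurora)": "RDS" synchronously replicates data to a standby instance in a different "AZ"; "Aurora" stores copies of the data across multiple "AZs" and fails over to a replica in the cluster. Failover is automatic in case of primary instance failure or "AZ" outage and typically completes within one to two minutes for "RDS". No committed data is lost ("RPO=0").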
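
As a rough sketch of how the Multi-AZ option is requested for the scenario's PostgreSQL database, the snippet below builds the arguments for the boto3 `create_db_instance` call. The identifier, instance class, storage size, user name, and Region are illustrative assumptions; the actual API call is kept in a separate function so the request can be inspected without AWS credentials.

```python
def multi_az_instance_request(identifier: str,
                              instance_class: str = "db.m6g.large") -> dict:
    """Build keyword arguments for rds.create_db_instance (values are assumed)."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",
        "DBInstanceClass": instance_class,   # assumed size, not from the scenario
        "AllocatedStorage": 100,             # GiB, assumed
        "MasterUsername": "bankadmin",       # hypothetical user name
        "ManageMasterUserPassword": True,    # let RDS keep the password in Secrets Manager
        "MultiAZ": True,                     # synchronous standby in a second AZ
        "BackupRetentionPeriod": 7,          # automated backups must be on to add replicas later
    }


def create_instance(request: dict, region: str = "us-east-1"):
    """Send the request to RDS. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without the SDK

    rds = boto3.client("rds", region_name=region)
    return rds.create_db_instance(**request)
```

A call such as `create_instance(multi_az_instance_request("bank-primary"))` would provision the instance; applications keep using the same endpoint after a failover because RDS repoints it to the new primary.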
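
Disaster Recovery ("DR"):
- Cross-Region Read Replicas ("RDS"/"Aurora"): Asynchronously replicate data to a read replica in a different "AWS Region". In a regional disaster, this replica can be promoted to a standalone primary, providing a robust "DR" strategy. "RPO" > 0 (bounded by the replication lag at the moment of failure), "RTO" in minutes, because promotion and application cutover must be triggered.
- Snapshots: Automated and manual snapshots of "RDS"/"Aurora" instances are stored in "S3" and can be copied cross-"Region". Higher "RTO"/"RPO" than active replication, because a new instance must be restored from the snapshot and changes made after the latest copied snapshot are lost.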
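
The cross-Region replica approach could be scripted along these lines with boto3. This is a sketch under assumptions: the account ID, identifiers, Regions, and instance class are placeholders, and an encrypted source would additionally need a `KmsKeyId` that is valid in the DR Region.

```python
def cross_region_replica_request(source_arn: str, replica_id: str,
                                 instance_class: str = "db.m6g.large") -> dict:
    """Build keyword arguments for rds.create_db_instance_read_replica.

    For a cross-Region replica the source is given as a full ARN and the
    request is sent by a client created in the destination (DR) Region.
    """
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_arn,
        "DBInstanceClass": instance_class,  # assumed size
    }


def create_replica(request: dict, dr_region: str):
    """Create the replica in the DR Region. Requires boto3 and credentials."""
    import boto3

    rds = boto3.client("rds", region_name=dr_region)
    return rds.create_db_instance_read_replica(**request)


def promote_replica(replica_id: str, dr_region: str):
    """Promote the replica to a standalone primary during a regional disaster."""
    import boto3

    rds = boto3.client("rds", region_name=dr_region)
    return rds.promote_read_replica(DBInstanceIdentifier=replica_id)


# Placeholder values for illustration only.
request = cross_region_replica_request(
    "arn:aws:rds:us-east-1:123456789012:db:bank-primary", "bank-dr-replica"
)
```

Promotion is one-way: the promoted instance stops replicating from the old primary, so failing back means building a new replica in the opposite direction.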

Visual: RDS Scalability, HA, DR Design

⚠️ Common Pitfall: Using a "Read Replica" for high availability. A "Read Replica" is for read scaling. If the primary database fails, you must manually promote the replica, which causes downtime and potential data loss due to asynchronous replication lag. A "Multi-AZ" deployment is the correct solution for automatic, synchronous failover.
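
To see why replication lag matters in this pitfall, a back-of-the-envelope estimate helps. The lag and write-rate figures below are invented for illustration, not measurements from any real system.

```python
def writes_at_risk(replica_lag_seconds: float, writes_per_second: float) -> float:
    """Estimate committed writes not yet on the replica when the primary fails."""
    return replica_lag_seconds * writes_per_second


# Illustrative numbers: 5 seconds of lag at 200 committed writes per second.
async_replica_loss = writes_at_risk(replica_lag_seconds=5, writes_per_second=200)

# A synchronous Multi-AZ standby acknowledges each commit, so the lag term is zero.
multi_az_loss = writes_at_risk(replica_lag_seconds=0, writes_per_second=200)
```

Under these assumed numbers, promoting the asynchronous replica could lose about 1,000 committed transactions, while the synchronous standby loses none.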

Key Trade-Offs:

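High Availability ("Multi-AZ") vs. Read Scaling ("Read Replicas"): "Multi-AZ" is for failover and doesn't improve read performance, because the standby in a classic single-standby "Multi-AZ" deployment is not readable (the newer "Multi-AZ DB cluster" option is the exception, since its standbys can serve reads). "Read Replicas" are for offloading read traffic and are not a primary "HA" solution.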
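
In application code, this trade-off usually shows up as read/write splitting: writes go to the primary endpoint (protected by "Multi-AZ"), and reads that tolerate slight staleness go to replicas. The following is a minimal sketch with hypothetical endpoint names and a simple round-robin policy.

```python
import itertools


class EndpointRouter:
    """Pick a database endpoint for each operation.

    Writes, and reads that must see the latest data, go to the writer.
    Other reads rotate across replica endpoints.
    """

    def __init__(self, writer: str, readers: list[str]):
        self._writer = writer
        # cycle() yields the replicas in order, forever; None means no replicas.
        self._readers = itertools.cycle(readers) if readers else None

    def endpoint_for(self, read_only: bool) -> str:
        if read_only and self._readers is not None:
            return next(self._readers)
        return self._writer


# Hypothetical endpoints for illustration.
router = EndpointRouter(
    writer="bank-primary.example.internal",
    readers=["bank-replica-1.example.internal", "bank-replica-2.example.internal"],
)
```

Because replicas lag the primary, a read that must reflect a just-committed write (for example, a balance shown right after a transfer) should be routed with `read_only=False`.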

Reflection Question: How would you combine "Amazon RDS Read Replicas" and "Multi-AZ" deployment (including cross-region) to meet the scalability, high availability, and disaster recovery requirements for a critical online banking application that experiences frequent read spikes and demands minimal downtime?