2.2.1.1. Designing for Data Durability and Availability (S3, Cross-Region Replication)
💡 First Principle: Data must be architected to persist through component failures and remain accessible during widespread disasters to ensure business continuity and prevent data loss.
Scenario: A critical e-commerce application stores its product images in "Amazon S3". The company has a strict disaster recovery plan that requires all data to be available in a separate "AWS Region" within minutes in case of a regional disaster.
Data durability (protection against loss) and availability (accessibility) are critical architectural pillars. AWS services are designed with high inherent durability and availability, but architects must configure them correctly.
- "Amazon S3": An object storage service designed for 99.999999999% (11 nines) of data durability over a year. This is achieved by redundantly storing data across multiple devices in multiple "Availability Zones".
  - "Cross-Region Replication (CRR)": Automatically and asynchronously copies objects between "S3 buckets" in different "AWS Regions".
    - Practical Relevance: Crucial for disaster recovery (if a Region fails), compliance requirements (data residency), and reducing latency for global users accessing local copies.
- "Amazon EBS (Elastic Block Store)": Volumes are replicated within their "Availability Zone", protecting against single component failures.
  - Snapshots: Point-in-time backups of "EBS volumes" stored in "S3". A snapshot can be used to restore a volume in any "AZ" of its "Region" and can be copied to other "Regions" for disaster recovery.
- "Amazon RDS Multi-AZ": A deployment option for "Amazon RDS" that synchronously replicates database instances to a standby in a different "AZ", providing automatic failover and high availability.
- "Amazon DynamoDB Global Tables": A feature of "Amazon DynamoDB" that provides multi-Region, active-active replication for "NoSQL databases", ensuring continuous availability and low-latency access globally.
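To make the 11 nines figure for "Amazon S3" concrete, the sketch below turns it into expected object loss. It treats the durability design target as an average annual loss probability of 1e-11 per object, which is a simplifying assumption, and the helper names `expected_annual_loss` and `years_per_lost_object` are hypothetical.

```shell
# Simplifying assumption: 11 nines of durability is read as an average
# annual loss probability of 1e-11 per stored object.
ANNUAL_LOSS_RATE="1e-11"

# Expected number of objects lost per year for a given object count ($1).
expected_annual_loss() {
  awk -v n="$1" -v p="$ANNUAL_LOSS_RATE" 'BEGIN { printf "%.6f\n", n * p }'
}

# Average number of years between single-object losses for a given object count ($1).
years_per_lost_object() {
  awk -v n="$1" -v p="$ANNUAL_LOSS_RATE" 'BEGIN { printf "%.0f\n", 1 / (n * p) }'
}

# Example: years_per_lost_object 10000000   # prints 10000
```

Under this reading, storing 10,000,000 objects corresponds to losing one object roughly every 10,000 years on average, which matches the illustration AWS itself gives for S3 durability. The figure describes loss within a Region and says nothing about availability during a regional outage.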
Practical Implementation: Enabling S3 Cross-Region Replication via AWS CLI
# This command applies a replication configuration to a source bucket,
# telling it to replicate all objects to a destination bucket in a different region.
# Note: Both source and destination buckets must have versioning enabled.
aws s3api put-bucket-replication \
  --bucket my-source-bucket-us-east-1 \
  --replication-configuration '{
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [
      {
        "ID": "ReplicateAll",
        "Status": "Enabled",
        "Priority": 1,
        "DeleteMarkerReplication": { "Status": "Disabled" },
        "Filter": { "Prefix": "" },
        "Destination": {
          "Bucket": "arn:aws:s3:::my-destination-bucket-us-west-2"
        }
      }
    ]
  }'
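The versioning prerequisite mentioned in the command's comments can be met with `aws s3api put-bucket-versioning`. The following is a minimal sketch, assuming the same placeholder bucket names and credentials that are allowed to change bucket versioning. The `AWS_CLI` variable and the two helper functions are illustrative conveniences, not part of the AWS CLI.

```shell
# Placeholder bucket names, matching the replication example.
SRC_BUCKET="my-source-bucket-us-east-1"
DST_BUCKET="my-destination-bucket-us-west-2"

# Override with AWS_CLI=echo to preview the commands without calling AWS.
AWS_CLI="${AWS_CLI:-aws}"

# Turn on versioning for the bucket named in $1.
enable_versioning() {
  $AWS_CLI s3api put-bucket-versioning \
    --bucket "$1" \
    --versioning-configuration Status=Enabled
}

# Print the current versioning status ("Enabled" once it is active).
versioning_status() {
  $AWS_CLI s3api get-bucket-versioning \
    --bucket "$1" \
    --query Status \
    --output text
}

# Typical order: run enable_versioning for "$SRC_BUCKET" and "$DST_BUCKET",
# confirm with versioning_status, then apply the replication configuration.
```

Versioning has to be active on both buckets before `put-bucket-replication` is called; otherwise the replication configuration is rejected.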
Visual: Data Durability & Availability (S3 CRR, RDS Multi-AZ)
⚠️ Common Pitfall: Assuming "S3 Standard's" 11 nines of durability protects against a regional disaster. That durability applies to data stored within a single region; surviving a regional disaster requires a copy of the data in another region, which "Cross-Region Replication (CRR)" provides for "S3".
Key Trade-Offs:
- Synchronous Replication (HA) vs. Asynchronous Replication (DR): Synchronous replication (like "RDS Multi-AZ") provides zero data loss ("RPO=0") for high availability within a region but can add latency. Asynchronous replication (like "S3 CRR") is used for disaster recovery across regions and has a non-zero "RPO", but doesn't impact write performance.
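Because "S3 CRR" is asynchronous, its "RPO" is not bounded by default. One option for a "within minutes" requirement is S3 Replication Time Control (RTC), which AWS documents as designed to replicate 99.99% of new objects within 15 minutes. The sketch below generates such a configuration; the function name `rtc_replication_config`, the rule ID, and the ARNs are placeholders, and current RTC pricing and SLA terms should be checked before relying on it.

```shell
# Print a replication configuration with Replication Time Control enabled.
# $1 = IAM role ARN, $2 = destination bucket ARN (both placeholders).
# RTC requires replication metrics to be enabled alongside ReplicationTime.
rtc_replication_config() {
  cat <<EOF
{
  "Role": "$1",
  "Rules": [
    {
      "ID": "ReplicateAllWithRTC",
      "Status": "Enabled",
      "Priority": 1,
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Filter": { "Prefix": "" },
      "Destination": {
        "Bucket": "$2",
        "ReplicationTime": { "Status": "Enabled", "Time": { "Minutes": 15 } },
        "Metrics": { "Status": "Enabled", "EventThreshold": { "Minutes": 15 } }
      }
    }
  ]
}
EOF
}

# Usage (requires AWS credentials, so shown as comments):
#   rtc_replication_config \
#     "arn:aws:iam::123456789012:role/s3-replication-role" \
#     "arn:aws:s3:::my-destination-bucket-us-west-2" > replication.json
#   aws s3api put-bucket-replication \
#     --bucket my-source-bucket-us-east-1 \
#     --replication-configuration file://replication.json
```

Even with RTC, replication stays asynchronous: objects written just before a regional failure may not yet exist in the destination bucket, so the "RPO" is small but not zero.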
Reflection Question: How would you configure "Amazon S3" for this scenario to meet the cross-region disaster recovery requirement, ensuring both high durability and rapid availability of product images even in case of a regional disaster, and what are the associated "RPO" implications?
