Copyright (c) 2025 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

2.2.1.1. Designing for Data Durability and Availability (S3, Cross-Region Replication)

💡 First Principle: Data must be architected to persist through component failures and remain accessible during widespread disasters to ensure business continuity and prevent data loss.

Scenario: A critical e-commerce application stores its product images in "Amazon S3". The company has a strict disaster recovery plan that requires all data to be available in a separate "AWS Region" within minutes in case of a regional disaster.

Data durability (protection against loss) and availability (accessibility) are critical architectural pillars. AWS services are designed with high inherent durability and availability, but architects must configure them correctly.

  • "Amazon S3": An object storage service designed for 99.999999999% (11 nines) of data durability over a year. "S3 Standard" achieves this by redundantly storing data across multiple devices in a minimum of three "Availability Zones" within a single Region.
    • "Cross-Region Replication (CRR)": Automatically and asynchronously copies objects between "S3 buckets" in different "AWS Regions".
      • Practical Relevance: Crucial for disaster recovery (if a Region fails), compliance requirements (for example, rules that mandate geographically separated copies), and reducing latency for global users who read from a nearby copy.
  • "Amazon EBS (Elastic Block Store)": Volumes are replicated within their "Availability Zone", protecting against single component failures.
    • Snapshots: Point-in-time backups of "EBS volumes" stored in "S3". A snapshot can be used to create volumes in any "AZ" of its "Region" and can be copied to other "Regions" for disaster recovery.
  • "Amazon RDS Multi-AZ": A deployment option for "Amazon RDS" that synchronously replicates data from the primary database instance to a standby instance in a different "AZ", providing automatic failover and high availability.
  • "Amazon DynamoDB Global Tables": A feature of "Amazon DynamoDB" that provides multi-Region, active-active replication for "NoSQL databases", ensuring continuous availability and low-latency access globally.
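
The EBS, RDS, and DynamoDB options in the list above each map to a single AWS CLI call. The sketch below is illustrative rather than a definitive runbook: every identifier (snapshot ID, DB instance name, table name, Regions) is a hypothetical placeholder, and the `run` helper is an assumption added here so that commands are only printed unless `DRY_RUN=0` is set. Each service has prerequisites and costs that this sketch does not check.

```shell
# Hypothetical placeholders; replace with your own resources.
SRC_REGION="us-east-1"
DR_REGION="us-west-2"
SNAPSHOT_ID="snap-0123456789abcdef0"
DB_INSTANCE="my-db-instance"
TABLE_NAME="my-table"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# EBS: copy a snapshot into the DR Region.
# The call is issued against the destination Region.
copy_snapshot_to_dr() {
  run aws ec2 copy-snapshot \
    --region "$DR_REGION" \
    --source-region "$SRC_REGION" \
    --source-snapshot-id "$SNAPSHOT_ID" \
    --description "DR copy of $SNAPSHOT_ID"
}

# RDS: convert an existing instance to a Multi-AZ deployment.
enable_multi_az() {
  run aws rds modify-db-instance \
    --region "$SRC_REGION" \
    --db-instance-identifier "$DB_INSTANCE" \
    --multi-az \
    --apply-immediately
}

# DynamoDB: add a replica in the DR Region, turning the table into a global table.
add_global_table_replica() {
  run aws dynamodb update-table \
    --region "$SRC_REGION" \
    --table-name "$TABLE_NAME" \
    --replica-updates "[{\"Create\": {\"RegionName\": \"$DR_REGION\"}}]"
}

copy_snapshot_to_dr
enable_multi_az
add_global_table_replica
```

Run as written, the script only prints the three commands, which makes it safe to review before setting `DRY_RUN=0`.
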
Practical Implementation: Enabling S3 Cross-Region Replication via AWS CLI
# This command applies a replication configuration to a source bucket,
# telling it to replicate all objects to a destination bucket in a different region.
# Note: Both source and destination buckets must have versioning enabled.
aws s3api put-bucket-replication \
  --bucket my-source-bucket-us-east-1 \
  --replication-configuration '{
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [
      {
        "ID": "ReplicateAll",
        "Status": "Enabled",
        "Priority": 1,
        "DeleteMarkerReplication": { "Status": "Disabled" },
        "Filter" : { "Prefix": ""},
        "Destination": {
          "Bucket": "arn:aws:s3:::my-destination-bucket-us-west-2"
        }
      }
    ]
  }'
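
The command above assumes that versioning is already enabled on both buckets. A minimal sketch of that prerequisite, followed by two checks that replication is in place, might look like the following. It reuses the bucket names from the example; the object key `images/example.jpg` and the `run` helper (which only prints commands unless `DRY_RUN=0` is set) are assumptions added for illustration.

```shell
SRC_BUCKET="my-source-bucket-us-east-1"
DST_BUCKET="my-destination-bucket-us-west-2"
SAMPLE_KEY="images/example.jpg"   # hypothetical object key

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# Enable versioning on a bucket.
# $1 = bucket name, $2 = Region the bucket lives in
enable_versioning() {
  run aws s3api put-bucket-versioning \
    --region "$2" \
    --bucket "$1" \
    --versioning-configuration Status=Enabled
}

# Show the replication configuration attached to the source bucket.
show_replication_config() {
  run aws s3api get-bucket-replication \
    --region us-east-1 \
    --bucket "$SRC_BUCKET"
}

# Ask S3 for the replication status of one object in the source bucket.
check_replication_status() {
  run aws s3api head-object \
    --region us-east-1 \
    --bucket "$SRC_BUCKET" \
    --key "$SAMPLE_KEY" \
    --query ReplicationStatus \
    --output text
}

enable_versioning "$SRC_BUCKET" us-east-1
enable_versioning "$DST_BUCKET" us-west-2
show_replication_config
check_replication_status
```

Versioning should be enabled before the replication configuration is applied. Because replication is asynchronous, the status check on a freshly uploaded object may report that replication is still pending.
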
Visual: Data Durability & Availability (S3 CRR, RDS Multi-AZ)

⚠️ Common Pitfall: Assuming that the 11 nines of durability of "S3 Standard" protect against a regional disaster. That durability applies within a single Region; to keep "S3" data available when a Region fails, a copy must exist in another Region, which "Cross-Region Replication (CRR)" provides.

Key Trade-Offs:
  • Synchronous Replication (HA) vs. Asynchronous Replication (DR): Synchronous replication (like "RDS Multi-AZ") provides zero data loss ("RPO=0") for high availability within a region but can add latency. Asynchronous replication (like "S3 CRR") is used for disaster recovery across regions and has a non-zero "RPO", but doesn't impact write performance.
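
To make the non-zero "RPO" concrete: the data at risk when the source Region fails is roughly the write rate multiplied by the replication lag at that moment. The sketch below is a back-of-the-envelope calculation only. The function name and both input numbers are hypothetical, not measurements of S3 behavior.

```shell
# Estimate how many objects have not yet been replicated when the source Region fails.
# $1 = uploads per minute, $2 = replication lag in seconds
# Uses integer arithmetic, so the result is rounded down.
objects_at_risk() {
  echo $(( $1 * $2 / 60 ))
}

UPLOADS_PER_MINUTE=120   # hypothetical write rate
LAG_SECONDS=90           # hypothetical replication lag

echo "Objects at risk (asynchronous): $(objects_at_risk "$UPLOADS_PER_MINUTE" "$LAG_SECONDS")"
echo "Objects at risk (synchronous):  $(objects_at_risk "$UPLOADS_PER_MINUTE" 0)"
```

With a lag of zero, as in synchronous replication, the estimate is zero, which is what "RPO=0" expresses. Any positive lag leaves a window of writes that exist only in the source Region.
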

Reflection Question: How would you configure "Amazon S3" for this scenario to meet the cross-region disaster recovery requirement, ensuring both high durability and rapid availability of product images even in case of a regional disaster, and what are the associated "RPO" implications?