
2.4.1. Evaluating Database Options (RDS, Aurora, DynamoDB, Redshift, DocumentDB, ElastiCache, Neptune, QLDB, Timestream)

💡 First Principle: Optimal database selection is achieved by matching the data model, query patterns, and consistency requirements of a workload to the specialized capabilities of a purpose-built database service.

Scenario: A new social media application needs to store user profiles (flexible schema, high read/write volume), friend connections (complex relationships), and a history of all user actions (immutable and verifiable).

AWS provides a highly diversified portfolio of managed database services, each optimized for specific workloads.

"Amazon RDS (Relational Database Service)": A managed service that simplifies setting up, operating, and scaling relational databases in the cloud. Managed relational databases ("MySQL", "PostgreSQL", "SQL Server", "Oracle", "MariaDB").
- Why: Traditional "OLTP" workloads, strong "ACID" compliance. Reduces operational overhead of self-managed DBs.
"Amazon Aurora": "MySQL" and "PostgreSQL"-compatible relational database built for the cloud.
- Why: AWS cites up to 5x the throughput of standard "MySQL" and up to 3x the throughput of standard "PostgreSQL". Highly available and durable, with storage that scales automatically. Ideal for high-performance relational needs.
"Amazon DynamoDB": A fully managed "NoSQL" key-value and document database.
- Why: High-performance, highly scalable, single-digit millisecond latency at any scale. Ideal for mobile, web, gaming, "IoT", microservices, and applications needing flexible schema with high throughput.
"Amazon Redshift": A fast, fully managed, petabyte-scale data warehouse.
- Why: "OLAP" workloads, large-scale analytics, business intelligence.
"Amazon DocumentDB": A managed "MongoDB"-compatible document database.
- Why: For existing "MongoDB"-compatible workloads that need managed scaling, replication, and backups without operating the cluster yourself.
"Amazon ElastiCache": An in-memory caching service. Supports "Redis" and "Memcached".
- Why: Reduces database load, improves application response times for frequently accessed data.
"Amazon Neptune": A fully managed graph database.
- Why: For highly connected datasets (social networks, recommendation engines, fraud detection).
"Amazon QLDB (Quantum Ledger Database)": A fully managed ledger database for immutable, cryptographically verifiable transaction logs.
- Why: For systems requiring auditable and traceable data changes (financial records, supply chain).
"Amazon Timestream": A fully managed, serverless time series database.
- Why: For "IoT" and operational applications that need to store and analyze time-series data at scale.
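
The list above can be read as a lookup from workload characteristics to a service. The following is a minimal sketch of that reading, not an official AWS decision procedure: the `Workload` fields, the category strings, and the `suggest_service` function are hypothetical names chosen for illustration, and real selection also weighs cost, latency targets, and operational constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description; field names are illustrative only."""
    data_model: str                   # e.g. "relational", "key-value", "graph"
    analytics: bool = False           # True for OLAP / warehouse-style queries
    mongodb_compatible: bool = False  # True if MongoDB compatibility is required
    high_performance: bool = False    # True if relational throughput is critical


# Direct mappings taken from the service list above.
_BY_DATA_MODEL = {
    "key-value": "Amazon DynamoDB",
    "document": "Amazon DynamoDB",
    "graph": "Amazon Neptune",
    "ledger": "Amazon QLDB",
    "time-series": "Amazon Timestream",
    "cache": "Amazon ElastiCache",
}


def suggest_service(w: Workload) -> str:
    """Return a first-guess service for a workload (a simplification)."""
    if w.data_model == "relational":
        if w.analytics:
            return "Amazon Redshift"
        return "Amazon Aurora" if w.high_performance else "Amazon RDS"
    if w.data_model == "document" and w.mongodb_compatible:
        return "Amazon DocumentDB"
    if w.data_model not in _BY_DATA_MODEL:
        raise ValueError(f"unknown data model: {w.data_model!r}")
    return _BY_DATA_MODEL[w.data_model]


# The three parts of the social media scenario.
profiles = suggest_service(Workload("key-value"))
connections = suggest_service(Workload("graph"))
actions = suggest_service(Workload("ledger"))
```

Applied to the scenario, this yields one service per data model rather than a single database for everything, which is the point of the first principle.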

Visual: AWS Database Service Options & Use Cases


⚠️ Common Pitfall: Choosing a database based on familiarity instead of the data model. For example, modeling a highly connected social graph in a relational database leads to complex, slow, and expensive recursive joins, when a graph database like "Neptune" is the purpose-built solution.
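
To make the pitfall concrete, here is a small sketch of what "friends within N hops" looks like when a social graph is stored relationally. It uses Python's built-in `sqlite3` module as a stand-in for any relational engine; the `friendship` table, its columns, the sample names, and the `reachable_within` helper are all assumptions made for illustration.

```python
import sqlite3

# In-memory relational store with a hypothetical edge table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friendship (user_id TEXT, friend_id TEXT)")
conn.executemany(
    "INSERT INTO friendship VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "erin")],
)


def reachable_within(conn, start, max_hops):
    """Return (user, minimum hops) for users reachable from `start`.

    Each additional hop is another self-join on the edge table,
    expressed here as a recursive common table expression.
    """
    query = """
    WITH RECURSIVE reach(user_id, hops) AS (
        SELECT friend_id, 1 FROM friendship WHERE user_id = ?
        UNION
        SELECT f.friend_id, r.hops + 1
        FROM reach AS r
        JOIN friendship AS f ON f.user_id = r.user_id
        WHERE r.hops < ?
    )
    SELECT user_id, MIN(hops) FROM reach GROUP BY user_id ORDER BY user_id
    """
    return conn.execute(query, (start, max_hops)).fetchall()


two_hops = reachable_within(conn, "alice", 2)
```

Even on four rows the query needs a recursive CTE, and on a large table every extra hop multiplies the join work. A graph database stores relationships as first-class edges, so the same question is usually expressed as a short traversal rather than repeated joins.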

Key Trade-Offs:

Schema Flexibility vs. Data Integrity: "NoSQL" databases (like "DynamoDB") offer flexible schemas, which suits agile development and evolving data, but they leave most validation to the application. Relational databases (like "RDS"/"Aurora") enforce a fixed schema and constraints, which makes data integrity easier to guarantee at the cost of slower schema changes.
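
The trade-off can be seen in a few lines. This is an illustrative sketch only: `sqlite3` stands in for a relational engine, a plain Python dictionary stands in for a schemaless item store, and the `profile` table, `insert_relational`, and `put_item` names are hypothetical rather than any AWS API.

```python
import sqlite3

# Relational side: the schema itself rejects incomplete or duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile (user_id TEXT PRIMARY KEY, email TEXT NOT NULL)"
)


def insert_relational(user_id, email):
    """Insert a row; return False if a schema constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO profile VALUES (?, ?)", (user_id, email))
        return True
    except sqlite3.IntegrityError:
        return False


# Schemaless side: any attributes are accepted; only the key is required.
flexible_store = {}


def put_item(item):
    """Store an item keyed by user_id with whatever attributes it carries."""
    flexible_store[item["user_id"]] = item
    return True


ok_row = insert_relational("u1", "a@example.com")
missing_email = insert_relational("u2", None)          # violates NOT NULL
duplicate_key = insert_relational("u1", "b@example.com")  # violates PRIMARY KEY
sparse_item = put_item({"user_id": "u2"})              # accepted without email
rich_item = put_item({"user_id": "u3", "badges": ["early-adopter"]})
```

The relational table refuses the bad rows without any application code, while the flexible store accepts both a sparse and an extended item, so any required-field checks would have to live in the application.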

Reflection Question: How would you design the database solution for this social media application using a combination of "Amazon DynamoDB", "Amazon Neptune", and "Amazon QLDB" to meet its diverse data model (user profiles, friend connections, user actions) and consistency requirements?
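
As one input to this question, the "immutable and verifiable" requirement for user actions can be illustrated with a simple hash chain. This is a conceptual sketch under stated assumptions, not the "QLDB" API or its internal journal format: the `append` and `verify` functions and the entry layout are hypothetical, and only Python's standard `hashlib` and `json` modules are used.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger, payload):
    """Append a user action; each entry commits to the entire prior history."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append(
        {"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)}
    )


def verify(ledger):
    """Recompute every hash; any edit to an earlier entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


ledger = []
append(ledger, {"user": "u1", "action": "follow", "target": "u2"})
append(ledger, {"user": "u1", "action": "post", "target": "p9"})
valid_before = verify(ledger)

ledger[0]["payload"]["action"] = "unfollow"  # simulate tampering with history
valid_after = verify(ledger)
```

Because each hash covers the previous one, changing an old action invalidates verification, which is the property a ledger database provides as a managed feature.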