3.1.2. Design for Non-Relational Data
💡 First Principle: Non-relational (NoSQL) databases provide the flexibility, massive scalability, and high performance required by modern applications to handle diverse, evolving, or unstructured data where a rigid schema is a constraint.
Scenario: You are designing a data solution for a new mobile application that will store user-generated content (photos, text posts) and user profiles. The user profiles have a flexible schema that may evolve, and the application needs to operate globally with low latency.
Non-relational (NoSQL) data solutions are designed to address the need for flexibility, scalability, and rapid adaptation in modern cloud applications. Unlike relational databases, they do not require a fixed schema, making them ideal for handling diverse, evolving, or unstructured data—such as documents, key-value pairs, graphs, or large binary objects.
Azure’s managed non-relational data services:
- Azure Cosmos DB: A globally distributed, multi-model database supporting document, key-value, graph, and column-family data.
- Azure Blob Storage: Optimized for storing massive amounts of unstructured data (e.g., images, videos, backups).
- Azure Table Storage: A key-value store for semi-structured data, suitable for large-scale, low-cost storage.
- Azure Queue Storage: Provides reliable message queuing for decoupling and scaling distributed application components.
Key design considerations:
- Consistency models: Cosmos DB offers five levels, from strongest to weakest: Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual. Weaker levels give lower latency and higher availability at the cost of potentially stale reads.
- Partitioning: Essential for scaling; data is distributed across partitions using a partition key. A good partition key has many distinct values and spreads both storage and request load evenly, which avoids hot partitions and sustains high throughput.
- Data modeling: NoSQL favors denormalization and embedding related data to optimize for read performance and scalability.
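To make the partitioning point concrete, the following is a minimal sketch of how the choice of partition key affects data distribution. It is an illustration only: the MD5-based hash, the eight partitions, and the `userId`/`country` field names are assumptions made for this example, not how Cosmos DB assigns partitions internally.

```python
import hashlib
from collections import Counter

def partition_for(key: str, partition_count: int) -> int:
    """Map a partition key value to a partition index using a stable hash.

    Assumption: MD5 stands in for whatever hash the real service uses.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

def distribution(keys, partition_count):
    """Count how many items land on each partition."""
    counts = Counter(partition_for(k, partition_count) for k in keys)
    return [counts.get(i, 0) for i in range(partition_count)]

# Hypothetical workload: 10,000 user profiles spread over 3 countries.
profiles = [
    {"userId": f"user-{i}", "country": ["US", "DE", "JP"][i % 3]}
    for i in range(10_000)
]

# High-cardinality key: every profile has its own key value.
by_user = distribution((p["userId"] for p in profiles), 8)
# Low-cardinality key: only three distinct key values exist.
by_country = distribution((p["country"] for p in profiles), 8)

print("items per partition, keyed by userId: ", by_user)
print("items per partition, keyed by country:", by_country)
```

With only three distinct `country` values, at most three of the eight partitions can ever receive data, so a few partitions carry the whole load. Keying by `userId` lets the hash spread items across all partitions.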
⚠️ Common Pitfall: Applying relational database design principles (like normalization) to a NoSQL database. This often leads to inefficient queries and negates the performance benefits of the NoSQL model.
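The pitfall can be sketched with plain in-memory data. The example below is a simplified illustration, not a Cosmos DB client: the collections, field names, and the lookup counter are all hypothetical. It contrasts a normalized layout, where the application must combine several collections at read time, with an embedded document that returns the whole profile in one read.

```python
# Normalized, relational-style layout: one collection per entity.
users = {"u1": {"id": "u1", "displayName": "Ada"}}
addresses = [{"userId": "u1", "city": "London"}]
preferences = [{"userId": "u1", "theme": "dark"}]

def load_profile_normalized(user_id):
    """Assemble a profile from three collections; returns (profile, lookups)."""
    user = users[user_id]                                              # lookup 1
    user_addresses = [a for a in addresses if a["userId"] == user_id]  # lookup 2
    user_prefs = [p for p in preferences if p["userId"] == user_id]    # lookup 3
    profile = {
        "id": user["id"],
        "displayName": user["displayName"],
        "addresses": [{"city": a["city"]} for a in user_addresses],
        "preferences": {"theme": user_prefs[0]["theme"]},
    }
    return profile, 3

# Denormalized layout: related data is embedded in a single document.
profiles = {
    "u1": {
        "id": "u1",
        "displayName": "Ada",
        "addresses": [{"city": "London"}],
        "preferences": {"theme": "dark"},
    }
}

def load_profile_embedded(user_id):
    """Read the whole profile with one point lookup; returns (profile, lookups)."""
    return profiles[user_id], 1

normalized_profile, normalized_lookups = load_profile_normalized("u1")
embedded_profile, embedded_lookups = load_profile_embedded("u1")
print(normalized_lookups, embedded_lookups)
```

Both functions return the same profile, but the embedded form needs a single read. The cost of embedding is that data duplicated across documents must be updated in every copy, so it suits data that is read together and changes rarely.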
Key Trade-Offs:
- Schema Flexibility vs. Data Integrity: NoSQL databases offer flexible schemas, which is great for agile development, but this flexibility means data integrity must be enforced at the application layer rather than by the database itself.
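As an illustration of enforcing integrity at the application layer, here is a minimal validation sketch. The required fields (`id`, `userId`, `displayName`) and the `validate_profile` helper are hypothetical names chosen for this example. The idea is to check the few fields the application depends on before writing, while still accepting extra fields as the schema evolves.

```python
from typing import List

# Assumption: these are the only fields the application strictly depends on.
REQUIRED_FIELDS = {"id": str, "userId": str, "displayName": str}

def validate_profile(doc: dict) -> List[str]:
    """Return a list of integrity errors; an empty list means the document is valid.

    Unknown extra fields are allowed on purpose, to keep the schema flexible.
    """
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing required field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"{field} must be {expected_type.__name__}")
    return errors

# A document with an extra, previously unseen field still passes.
ok_doc = {"id": "1", "userId": "u1", "displayName": "Ada", "favoriteColor": "teal"}
# A document with a wrong type and a missing field is rejected before any write.
bad_doc = {"id": "1", "userId": 42}

print(validate_profile(ok_doc))
print(validate_profile(bad_doc))
```

In practice the application would call a check like this before every write and reject or repair documents that fail, since the database will not do so on its behalf.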
Reflection Question: How does aligning your data model and service choice (Azure Cosmos DB for global profiles, Azure Blob Storage for images) with the application's flexibility, scale, and consistency requirements enable a high-performing, scalable non-relational data solution in Azure?