3.1.2.2. Design for Azure Blob Storage
💡 First Principle: A massively scalable, highly durable, and cost-effective object storage service is the foundational platform for storing and managing vast amounts of unstructured data in the cloud.
Scenario: You are designing a data storage solution for a content management system. It will store user-uploaded images and videos that are frequently accessed initially but become less popular over time. You also need to retain these files for 10 years for compliance reasons at the lowest possible cost, and ensure they are resilient to regional disasters.
Azure Blob Storage is Microsoft’s object storage solution for storing large amounts of unstructured data.
Key Design Considerations:
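- Access Tiers:
  - Hot: For frequently accessed data. Highest storage cost, lowest access cost.
  - Cool: For infrequently accessed data kept for at least 30 days. Lower storage cost, higher access cost.
  - Archive: For rarely accessed data kept for at least 180 days. Lowest storage cost, highest retrieval cost. Blobs are offline and must be rehydrated before they can be read, which can take hours.
- Data Redundancy:
  - LRS (Locally Redundant Storage): Keeps three copies of the data within a single datacenter.
  - ZRS (Zone-Redundant Storage): Replicates data synchronously across three availability zones in the primary region.
  - GRS (Geo-Redundant Storage): Adds asynchronous replication to a secondary region for disaster recovery.
  - RA-GRS (Read-Access GRS): GRS plus read access to the copy in the secondary region.
- Security:
  - Use Shared Access Signatures (SAS) for scoped, time-limited access.
  - Integrate with Microsoft Entra ID (formerly Azure AD) for identity-based access control.
  - Data is encrypted at rest by default.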
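For the scenario above, tiering by age is usually expressed as a lifecycle management policy rather than done by hand. The following is a minimal sketch that builds such a policy document in Python. The rule name, the `uploads/media` prefix, and the 30-day and 180-day thresholds are assumptions chosen for illustration, not values from the scenario.

```python
import json

# Hypothetical thresholds; tune them to the real access pattern.
DAYS_TO_COOL = 30
DAYS_TO_ARCHIVE = 180


def build_lifecycle_policy(prefix: str) -> dict:
    """Return a lifecycle management policy document as a plain dict."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "age-out-user-media",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    # Only block blobs under the given container/prefix are affected.
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": DAYS_TO_COOL
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": DAYS_TO_ARCHIVE
                            },
                        }
                    },
                },
            }
        ]
    }


# "uploads/media" is an assumed container/prefix, not one named in the scenario.
policy = build_lifecycle_policy("uploads/media")
print(json.dumps(policy, indent=2))
```

The printed JSON could be saved to a file and applied to a storage account, for example with `az storage account management-policy create`. The policy only moves data between tiers; the regional-disaster requirement is covered separately by choosing GRS or RA-GRS for the account.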
⚠️ Common Pitfall: Ignoring retrieval costs. Moving data to a cheaper tier such as Archive lowers the storage bill, but if the data is read often, the higher per-GB retrieval fees (plus early-deletion charges when data leaves the tier before its minimum retention period) can make it more expensive overall than keeping it in the Cool tier.
Key Trade-Offs:
- Storage Cost vs. Retrieval Cost/Time: Lower storage costs (e.g., Archive tier) come with higher retrieval costs and longer retrieval times (hours).
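The storage-versus-retrieval trade-off can be made concrete with a small break-even calculation. The sketch below uses made-up per-GB prices purely for illustration. It also ignores per-operation charges, early-deletion fees, and rehydration priority, so treat it as a way to reason about the shape of the trade-off, not as a pricing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrice:
    """Per-GB prices for one access tier (illustrative values only)."""
    storage_per_gb_month: float
    retrieval_per_gb: float


# HYPOTHETICAL prices, not real Azure rates; check the current pricing page.
COOL = TierPrice(storage_per_gb_month=0.010, retrieval_per_gb=0.01)
ARCHIVE = TierPrice(storage_per_gb_month=0.002, retrieval_per_gb=0.02)


def monthly_cost(price: TierPrice, stored_gb: float, retrieved_gb: float) -> float:
    """Storage charge plus retrieval charge for one month."""
    return (price.storage_per_gb_month * stored_gb
            + price.retrieval_per_gb * retrieved_gb)


def break_even_retrieval_gb(cheap: TierPrice, costly: TierPrice,
                            stored_gb: float) -> float:
    """Monthly retrieval volume above which the cheaper-to-store tier costs more."""
    storage_saving = (costly.storage_per_gb_month
                      - cheap.storage_per_gb_month) * stored_gb
    extra_retrieval_fee = cheap.retrieval_per_gb - costly.retrieval_per_gb
    if extra_retrieval_fee <= 0:
        raise ValueError("cheap tier has no retrieval penalty; no break-even point")
    return storage_saving / extra_retrieval_fee


stored = 1000.0  # GB held in the tier
limit = break_even_retrieval_gb(ARCHIVE, COOL, stored)
print(f"Archive stops being cheaper above about {limit:.0f} GB retrieved per month")
```

With these assumed prices, Archive wins only while monthly retrievals stay below the break-even volume; past that point the retrieval fees outweigh the storage savings.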
Reflection Question: How do Blob Storage access tiers (Hot, Cool, Archive) and redundancy options (e.g., GRS) work together to deliver durable, highly available, and massively scalable storage for unstructured data while keeping cost and compliance requirements in balance across diverse workloads?