2.4.4. Shared Storage: EFS and FSx Selection and Tuning
💡 First Principle: EBS gives you a dedicated disk attached to one instance. But what if multiple instances need to share the same files? Shared file systems solve the concurrent access problem: multiple servers read and write to the same storage, and each sees the same data in real time.
This matters for applications that weren't designed for distributed storage: content management systems storing user uploads, application servers sharing configuration files, or batch processing jobs reading from the same input dataset.
Amazon EFS (Elastic File System):
- NFS protocol (Linux only)
- Scales automatically to petabytes, with no capacity planning
- Multi-AZ by default (data replicated across AZs)
- Thousands of concurrent connections from EC2, Lambda, ECS, EKS
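Because EFS is exposed over NFS, a Linux client reaches it with an ordinary NFS mount. The sketch below only builds the mount command as a list of arguments; it does not run it. The file system ID, region, and mount point are placeholders, and the mount options are the ones commonly recommended for EFS over NFSv4.1 as best I recall, so treat them as assumptions to verify against current AWS guidance.

```python
from typing import List

# Mount options commonly recommended for EFS over NFSv4.1 (assumed here;
# confirm against current AWS guidance before relying on them).
EFS_NFS_OPTIONS = (
    "nfsvers=4.1",
    "rsize=1048576",
    "wsize=1048576",
    "hard",
    "timeo=600",
    "retrans=2",
    "noresvport",
)


def efs_dns_name(file_system_id: str, region: str) -> str:
    """Return the regional DNS name of an EFS file system."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


def efs_mount_command(file_system_id: str, region: str, mount_point: str) -> List[str]:
    """Build the argv for mounting the file system root at mount_point."""
    source = f"{efs_dns_name(file_system_id, region)}:/"
    options = ",".join(EFS_NFS_OPTIONS)
    return ["sudo", "mount", "-t", "nfs4", "-o", options, source, mount_point]


if __name__ == "__main__":
    # "fs-12345678" and "/mnt/efs" are placeholder values.
    print(" ".join(efs_mount_command("fs-12345678", "us-east-1", "/mnt/efs")))
```

Every instance that runs the same command against the same file system ID sees the same directory tree, which is the shared-access property described above.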
EFS Performance Modes:
| Mode | Use Case | Tradeoff |
|---|---|---|
| General Purpose | Default; web serving, content management, home directories | Lowest per-operation latency; IOPS are capped per file system |
| Max I/O | Big data, media processing, >500 clients | Higher aggregate throughput; higher per-operation latency |
EFS Throughput Modes:
| Mode | How Throughput Works |
|---|---|
| Elastic (Recommended) | Automatically scales up and down; pay per GB transferred |
| Bursting | Throughput tied to storage size; earns credits when idle, spends during bursts |
| Provisioned | You specify throughput regardless of storage size; use when you need consistent high throughput with small storage |
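To see why Bursting mode struggles with small file systems, it helps to work through the arithmetic. The sketch below assumes a baseline of 50 MiB/s per TiB stored and a burst rate of 100 MiB/s per TiB with a 100 MiB/s floor. These figures are assumptions based on published EFS numbers at the time of writing, not something stated in this section, so check the current documentation before sizing anything. The function names are hypothetical.

```python
# Assumed Bursting-mode rates (verify against current EFS documentation).
BASELINE_MIBPS_PER_TIB = 50.0
BURST_MIBPS_PER_TIB = 100.0
MIN_BURST_MIBPS = 100.0


def bursting_baseline_mibps(stored_tib: float) -> float:
    """Sustained throughput that the stored data earns in Bursting mode."""
    return stored_tib * BASELINE_MIBPS_PER_TIB


def bursting_peak_mibps(stored_tib: float) -> float:
    """Peak throughput available while burst credits last."""
    return max(MIN_BURST_MIBPS, stored_tib * BURST_MIBPS_PER_TIB)


def bursting_is_sufficient(stored_tib: float, sustained_mibps_needed: float) -> bool:
    """True if the baseline alone covers the sustained requirement."""
    return bursting_baseline_mibps(stored_tib) >= sustained_mibps_needed


if __name__ == "__main__":
    # A 0.5 TiB file system that must sustain 200 MiB/s.
    print(bursting_baseline_mibps(0.5))      # baseline in MiB/s
    print(bursting_is_sufficient(0.5, 200))  # whether Bursting is enough
```

Under these assumed rates, a 0.5 TiB file system earns only 25 MiB/s of sustained throughput, so a workload that needs 200 MiB/s continuously would exhaust its credits. That is the "small storage, high throughput" case where Provisioned or Elastic mode fits better.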
EFS Lifecycle Policies move infrequently accessed files to the cheaper EFS Infrequent Access (IA) storage class. You choose the threshold (for example 7, 14, 30, 60, or 90 days without access), and files that cross it are moved automatically. Files in IA are moved back to Standard storage on their next access only when the transition-back policy is enabled; otherwise they stay in IA.
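A lifecycle policy is ultimately a small piece of configuration. The sketch below builds the policy list in the shape accepted by the EFS `PutLifecycleConfiguration` API (`put_lifecycle_configuration` in boto3). The helper name is hypothetical, the allowed thresholds are limited to the ones listed above (the service may accept others), and the move-back rule is modeled as a separate, optional policy entry.

```python
from typing import Dict, List

# Thresholds listed in this section; the service may accept other values.
ALLOWED_IA_DAYS = (7, 14, 30, 60, 90)


def build_lifecycle_policies(days_to_ia: int, move_back_on_access: bool = True) -> List[Dict[str, str]]:
    """Build the LifecyclePolicies list for an EFS file system."""
    if days_to_ia not in ALLOWED_IA_DAYS:
        raise ValueError(f"days_to_ia must be one of {ALLOWED_IA_DAYS}")
    policies = [{"TransitionToIA": f"AFTER_{days_to_ia}_DAYS"}]
    if move_back_on_access:
        # Return a file to Standard storage the first time it is read again.
        policies.append({"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"})
    return policies


if __name__ == "__main__":
    policies = build_lifecycle_policies(30)
    print(policies)
    # Applying it needs boto3 and AWS credentials; "fs-12345678" is a placeholder:
    # import boto3
    # boto3.client("efs").put_lifecycle_configuration(
    #     FileSystemId="fs-12345678", LifecyclePolicies=policies
    # )
```

Keeping the payload builder separate from the API call makes the policy easy to review and test without touching a real file system.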
Amazon FSx (When EFS Isn't Enough):
| FSx Variant | Protocol | Best For |
|---|---|---|
| FSx for Windows File Server | SMB | Windows applications, Active Directory integration, NTFS |
| FSx for Lustre | Lustre (custom) | High-performance computing, ML training, financial modeling; maximum throughput |
| FSx for NetApp ONTAP | NFS, SMB, iSCSI | Hybrid cloud; existing NetApp workloads |
| FSx for OpenZFS | NFS | ZFS features, data compression, deduplication |
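Key Decision Rule:
Linux instances that need shared NFS storage use EFS. Windows or SMB workloads use FSx for Windows File Server. HPC and ML workloads that need maximum throughput use FSx for Lustre. Existing NetApp or multi-protocol workloads use FSx for NetApp ONTAP. Workloads that depend on ZFS features use FSx for OpenZFS.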
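The FSx table and the EFS description can be read together as a short chain of checks. The sketch below is one way to write that chain down. The function name, its parameters, and the order of the checks are illustrative assumptions, not an official AWS decision tree.

```python
def choose_shared_storage(
    protocol: str = "nfs",
    os_family: str = "linux",
    needs_hpc_throughput: bool = False,
    existing_netapp: bool = False,
    needs_zfs_features: bool = False,
) -> str:
    """Pick a shared file service from coarse workload traits."""
    protocol = protocol.lower()
    os_family = os_family.lower()
    if needs_hpc_throughput:
        # HPC, ML training, financial modeling: maximum throughput.
        return "FSx for Lustre"
    if existing_netapp or protocol == "iscsi":
        # Hybrid cloud, existing NetApp workloads, or block access over iSCSI.
        return "FSx for NetApp ONTAP"
    if os_family == "windows" or protocol == "smb":
        # Windows applications, Active Directory, NTFS.
        return "FSx for Windows File Server"
    if needs_zfs_features:
        # ZFS features such as data compression.
        return "FSx for OpenZFS"
    # Default: general-purpose shared NFS storage for Linux clients.
    return "Amazon EFS"


if __name__ == "__main__":
    print(choose_shared_storage())                            # Amazon EFS
    print(choose_shared_storage(protocol="smb"))              # FSx for Windows File Server
    print(choose_shared_storage(needs_hpc_throughput=True))   # FSx for Lustre
```

The throughput check comes first because it is the requirement the other services cannot meet. The remaining checks follow protocol and feature needs, and EFS is the fallback.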
⚠️ Exam Trap: FSx for Lustre integrates natively with S3: you can configure it to automatically load data from an S3 bucket and write results back to S3. This is the recommended pattern for HPC workloads processing data from S3. EFS does not have this native S3 integration.
Reflection Question: A media company runs Linux-based video transcoding on a cluster of 200 EC2 instances. Source videos are in S3. The cluster needs shared access to partially processed files with throughput exceeding 50 GB/s. Which storage service do you choose and why?