2.3.3. Throttling, Rate Limits, and Fan-Out Patterns
💡 First Principle: Every AWS service has limits (API call rates, throughput caps, concurrent execution limits), and hitting them silently degrades or halts your pipeline. Data engineers who ignore limits build pipelines that work in testing and break in production. Understanding throttling is about building pipelines that degrade gracefully rather than fail catastrophically.
Common throttling scenarios tested on the exam:
DynamoDB. With provisioned capacity, writes exceeding the table's WCU (Write Capacity Unit) allocation are throttled. With on-demand capacity, DynamoDB scales automatically but can still throttle if traffic spikes to more than double the previous peak within 30 minutes. The fix: use exponential backoff, enable DynamoDB auto scaling, or switch to on-demand mode for unpredictable workloads.
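The exponential-backoff fix can be sketched as plain Python. The "full jitter" delay formula follows AWS's published backoff guidance; the `retry_with_backoff` wrapper is illustrative, not a boto3 API. In real code you would catch botocore's `ClientError` and check for `ProvisionedThroughputExceededException` instead of a generic exception.

```python
import random
import time

def backoff_delay(attempt: int, base: float = 0.1, cap: float = 20.0) -> float:
    # "Full jitter" backoff: a random delay between 0 and the capped
    # exponential ceiling, so retrying clients don't stampede in sync.
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def retry_with_backoff(operation, max_attempts: int = 5,
                       is_throttle=lambda exc: True):
    # Retry `operation` on throttling errors, sleeping a jittered
    # exponential delay between attempts; re-raise anything else or
    # the final failure once attempts are exhausted.
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            if not is_throttle(exc) or attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

With boto3, `operation` would be something like `lambda: table.put_item(Item=item)`, and `is_throttle` would inspect `exc.response["Error"]["Code"]`.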
Kinesis Data Streams. Each shard has fixed limits: 1 MB/s or 1,000 records/s for writes and 2 MB/s for reads. Exceeding these raises ProvisionedThroughputExceededException. The fix: add shards, improve partition key distribution to avoid hot shards, or switch to on-demand mode.
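Hot shards usually come from low-cardinality partition keys. The sketch below models how Kinesis assigns a record to a shard (an MD5 hash of the partition key over a 128-bit keyspace, assumed here to be split into equal slices) to contrast a skewed key choice with a well-distributed one; the key names are hypothetical.

```python
import hashlib
from collections import Counter

def shard_for(partition_key: str, num_shards: int) -> int:
    # Kinesis hashes the partition key with MD5 into a 128-bit keyspace;
    # here each shard owns an equal slice of that range.
    h = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    return (h * num_shards) >> 128  # which equal slice h falls into

# Low-cardinality key (e.g. an AWS Region name): every record hashes to
# the same shard, creating a hot shard.
hot = Counter(shard_for("us-east-1", 4) for _ in range(1000))

# High-cardinality key (e.g. a device ID): records spread evenly.
even = Counter(shard_for(f"device-{i}", 4) for i in range(1000))
```

With one key value, `hot` concentrates all 1,000 records on a single shard; with 1,000 distinct keys, `even` lands roughly 250 records on each of the four shards.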
Lambda concurrency. The default account limit is 1,000 concurrent executions per Region. If a burst of S3 events triggers 2,000 Lambda invocations simultaneously, roughly half will be throttled. The fix: reserve concurrency for critical functions, put SQS as a buffer between the event source and Lambda, or request a limit increase.
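The SQS-buffer fix can be illustrated with an in-memory queue and a small worker pool standing in for Lambda's event-source pollers: the burst lands in the queue instead of invoking 2,000 functions at once, and a bounded number of workers drains it. All names here are hypothetical; a real deployment would configure an SQS event source mapping with a maximum concurrency setting.

```python
import queue
import threading

# A burst of 2,000 "S3 events" lands in a buffer queue rather than
# triggering 2,000 simultaneous invocations.
events = queue.Queue()
for i in range(2000):
    events.put(f"s3://bucket/object-{i}")  # hypothetical object keys

processed = []
lock = threading.Lock()

def worker():
    # Each worker models one unit of Lambda concurrency, pulling events
    # until the buffer is empty.
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return
        with lock:
            processed.append(event)  # stand-in for the real handler logic

# Concurrency is capped at 10 workers, no matter how large the burst.
threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Nothing is dropped: every event is eventually processed, just at a rate the downstream concurrency cap can sustain.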
Fan-out patterns distribute data from one source to multiple consumers. SNS supports this natively: a single SNS topic can deliver messages to SQS queues, Lambda functions, HTTP endpoints, and email simultaneously. For Kinesis, enhanced fan-out gives each consumer dedicated throughput. For S3, EventBridge routes a single object-created event to multiple targets.
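A minimal in-memory sketch of the fan-out pattern, using a hypothetical `Topic` class in place of SNS: one publish is delivered to every subscriber, regardless of what kind of endpoint the subscriber models.

```python
from typing import Callable, List

class Topic:
    # Minimal stand-in for an SNS topic: every published message is
    # delivered to all subscribers.
    def __init__(self):
        self.subscribers: List[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        self.subscribers.append(handler)

    def publish(self, message: str) -> None:
        for handler in self.subscribers:
            handler(message)

sqs_queue = []   # models an SQS queue subscription
lambda_log = []  # models a Lambda subscription

topic = Topic()
topic.subscribe(sqs_queue.append)
topic.subscribe(lambda_log.append)
topic.publish("order-created")  # one publish, two deliveries
```

The producer publishes once and never knows how many consumers exist, which is exactly what makes fan-out easy to extend.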
Fan-in patterns aggregate data from multiple sources into one target. SQS queues naturally handle fan-in: multiple producers write messages, and a single consumer (or consumer group) processes them. This is useful when dozens of microservices each generate events that need to be consolidated into a single data pipeline.
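Fan-in, sketched the same way: several producer threads (standing in for microservices) write to one shared queue, and a single consumer consolidates everything. The service names are illustrative.

```python
import queue
import threading

# One shared queue models the SQS queue that all producers write to.
pipeline = queue.Queue()

def producer(service: str, count: int):
    # Each microservice emits its own events into the shared queue.
    for i in range(count):
        pipeline.put({"service": service, "event": i})

producers = [threading.Thread(target=producer, args=(name, 100))
             for name in ("billing", "auth", "orders")]
for t in producers:
    t.start()
for t in producers:
    t.join()

# A single consumer drains the queue into one consolidated batch.
consolidated = []
while not pipeline.empty():
    consolidated.append(pipeline.get())
```

The consumer never needs to know which services exist or how many there are; adding a fourth producer requires no change on the consuming side.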
⚠️ Exam Trap: When a question describes intermittent pipeline failures with "throughput exceeded" errors, the answer is rarely "increase the service limit." The exam prefers architectural solutions: add a buffer (SQS between producer and consumer), improve data distribution (better partition keys), or use on-demand/auto-scaling modes. Throwing more capacity at the problem is usually the wrong answer.
Reflection Question: A Lambda function processes S3 event notifications and writes to DynamoDB. During peak hours, some writes fail with throttling errors. The DynamoDB table uses provisioned capacity. What are three approaches to fix this, and which does the exam prefer?