Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

3.4.2. Data Retention, Archiving, and Deletion Strategies

💡 First Principle: Retention policies define how long data must be kept (for compliance, business, or operational needs), and deletion policies define when and how data is removed. Getting this wrong in either direction (deleting too early or keeping too long) creates legal risk. Automation ensures policies are enforced consistently regardless of whether a human remembers to clean up.

S3 Versioning and Object Lock. Versioning preserves every version of every object, protecting against accidental deletion and overwrites. Object Lock, which requires versioning to be enabled, adds WORM (Write Once Read Many) protection: a locked object version cannot be deleted or overwritten until its retention period ends. It has two retention modes: Governance (users with special permissions can override the lock) and Compliance (no one can delete the version or shorten its retention, not even the root account). Legal Hold is an additional flag that prevents deletion indefinitely, independent of retention periods, until it is explicitly removed.
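The interaction between the two retention modes and Legal Hold can be easier to see as a decision function. The sketch below is a simplified in-memory model of the rules described above, not the S3 API: `LockState`, `can_delete_version`, and `has_bypass_permission` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LockState:
    """Hypothetical model of the lock settings on one object version."""
    mode: Optional[str] = None              # "GOVERNANCE", "COMPLIANCE", or None
    retain_until: Optional[datetime] = None
    legal_hold: bool = False


def can_delete_version(state: LockState, now: datetime,
                       has_bypass_permission: bool = False) -> bool:
    """Return True if this object version could be permanently deleted."""
    # A legal hold blocks deletion regardless of any retention period.
    if state.legal_hold:
        return False
    # No retention configured, or the retention period has already ended.
    if state.mode is None or state.retain_until is None or now >= state.retain_until:
        return True
    # Governance mode can be overridden by specially permitted users.
    if state.mode == "GOVERNANCE":
        return has_bypass_permission
    # Compliance mode cannot be overridden by anyone until retention ends.
    return False


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
retain_until = datetime(2033, 1, 1, tzinfo=timezone.utc)
print(can_delete_version(LockState("COMPLIANCE", retain_until), now,
                         has_bypass_permission=True))
```

In this model, the only difference between the two modes is whether a bypass permission is honored, and a legal hold is checked first, so it wins even after the retention date has passed.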

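DynamoDB TTL (Time to Live). Automatically deletes items from a DynamoDB table after the timestamp in a designated TTL attribute (a Number holding Unix epoch time in seconds) has passed. DynamoDB handles the deletion in the background at no cost for the delete operations and without consuming write throughput. Use cases: expiring session data, removing stale cache entries, and cleaning up temporary records. Important: TTL deletion is not immediate. Expired items are removed by a background process and may persist past the TTL timestamp; the commonly cited window is up to 48 hours, and current AWS documentation says "within a few days".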
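As a minimal sketch of how an application might stamp items for TTL, the helper below adds an expiry timestamp in Unix epoch seconds before the item is written. The attribute name `expires_at` and the helper `with_ttl` are assumptions for illustration; the TTL attribute can have any name, as long as it is the one configured on the table.

```python
import time
from typing import Optional


def with_ttl(item: dict, ttl_seconds: int, now: Optional[int] = None) -> dict:
    """Return a copy of item with an 'expires_at' attribute in epoch seconds."""
    # Allow the caller to inject the current time so the result is testable.
    now = int(time.time()) if now is None else now
    stamped = dict(item)  # copy so the caller's dict is not mutated
    stamped["expires_at"] = now + ttl_seconds  # hypothetical TTL attribute name
    return stamped


# A session record that should expire one hour after it is created.
session = with_ttl({"session_id": "abc123"}, ttl_seconds=3600)
print(session["expires_at"] - int(time.time()) <= 3600)
```

Using whole seconds matters here: a value written in milliseconds would be read as a date far in the future, so the item would not expire when intended.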

Redshift data management. VACUUM reclaims space from deleted or updated rows and re-sorts the table (Redshift marks rows for deletion rather than removing them immediately, and performs an UPDATE as a delete followed by an insert). ANALYZE updates table statistics for the query optimizer. Both are essential maintenance operations that affect query performance. Redshift also supports UNLOAD to export data to S3, enabling archival of historical data that no longer needs warehouse-level query performance.
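One way these commands might fit together in an archival job is sketched below: export old rows with UNLOAD, delete them, then run VACUUM and ANALYZE. The function `archive_statements`, the table and column names, the bucket, and the IAM role ARN are all hypothetical, and the DELETE step is an assumption about the workflow rather than something the commands require.

```python
from typing import List


def archive_statements(table: str, date_column: str, cutoff: str,
                       s3_prefix: str, iam_role: str) -> List[str]:
    """Build the SQL for a hypothetical archive-then-clean-up job.

    Inputs are interpolated directly, so they must come from trusted
    configuration, never from user input.
    """
    predicate = f"{date_column} < '{cutoff}'"
    # Inside the quoted UNLOAD query, single quotes must be doubled.
    unload_query = f"SELECT * FROM {table} WHERE {predicate}".replace("'", "''")
    return [
        # 1. Export the historical rows to S3 as Parquet files.
        f"UNLOAD ('{unload_query}') TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS PARQUET;",
        # 2. Remove the archived rows from the warehouse (assumed step).
        f"DELETE FROM {table} WHERE {predicate};",
        # 3. Reclaim the space left behind by the deleted rows.
        f"VACUUM DELETE ONLY {table};",
        # 4. Refresh optimizer statistics after the large change.
        f"ANALYZE {table};",
    ]


for statement in archive_statements(
    "sales", "sold_at", "2019-01-01",
    "s3://example-archive/sales/",
    "arn:aws:iam::123456789012:role/ExampleUnloadRole",
):
    print(statement)
```

The order is deliberate: the delete only runs after the export, and VACUUM and ANALYZE come last because they act on the table's state after the rows are gone.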

āš ļø Exam Trap: DynamoDB TTL does not guarantee immediate deletion — expired items may still appear in queries for up to 48 hours. If a question requires items to be removed immediately at expiration, TTL alone is insufficient. You'd need application-level filtering (exclude items where TTL < current_time) combined with TTL for eventual cleanup.

Reflection Question: A financial services company must retain transaction records for 7 years with the ability to restore any record within 4 hours. After 7 years, records must be permanently and irrecoverably deleted. What S3 features implement each requirement?

Written by Alvin Varughese
Founder • 15 professional certifications