3.3. Backup, Restore, and Disaster Recovery
First Principle: What happens when a production database is corrupted and the restore fails? That moment reveals whether your disaster recovery strategy is real or imagined. A backup you've never tested is a backup you don't have. Think of it like a fire drill: the drill itself matters because systems and people that have never run a restore under pressure will fail when it counts. Whereas snapshot creation is automated, restore testing requires deliberate effort. Disaster recovery is not just about creating backups; it's about ensuring you can restore from them, within an acceptable time window, when everything is going wrong simultaneously. The exam tests not just which backup tools exist, but whether you know how to select the right strategy for a given RTO and RPO.
RTO (Recovery Time Objective): the maximum acceptable time to restore service after a failure. Lower RTO = more expensive (because you need more pre-deployed resources).
RPO (Recovery Point Objective): the maximum acceptable data loss, measured in time. "RPO of 1 hour" means you can tolerate losing up to 1 hour of data. Lower RPO = more frequent backups = higher cost and complexity.
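To make the two definitions concrete, here is a minimal Python sketch of how a backup schedule and a measured restore time could be checked against stated objectives. The names RecoveryObjectives, meets_rpo, and meets_rto are hypothetical and used only for illustration, as are the sample targets. The sketch assumes periodic backups with no continuous replication, so the worst-case data loss is roughly one full backup interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Hypothetical container for the two targets defined above."""
    rto: timedelta  # maximum acceptable time to restore service
    rpo: timedelta  # maximum acceptable data loss, measured in time


def meets_rpo(backup_interval: timedelta, objectives: RecoveryObjectives) -> bool:
    # Worst case: the failure happens just before the next backup runs,
    # so everything written since the previous backup is lost.
    return backup_interval <= objectives.rpo


def meets_rto(measured_restore_time: timedelta, objectives: RecoveryObjectives) -> bool:
    # The restore time should come from an actual restore test,
    # not from an estimate.
    return measured_restore_time <= objectives.rto


# Illustrative targets only: RTO of 4 hours, RPO of 1 hour.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))

nightly_ok = meets_rpo(timedelta(hours=24), objectives)       # nightly backups
frequent_ok = meets_rpo(timedelta(minutes=15), objectives)    # 15-minute backups
restore_ok = meets_rto(timedelta(hours=2), objectives)        # tested 2-hour restore

print(nightly_ok, frequent_ok, restore_ok)
```

With these sample numbers, nightly backups miss a 1-hour RPO while 15-minute backups meet it. A restore that was actually timed at 2 hours fits within a 4-hour RTO. Note that the RTO check is only meaningful if the restore time was measured in a real test.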
The DR strategy you choose is always a cost/risk trade-off. The four strategies form a spectrum: