3.3.2. Automated Remediation & Fleet Management
Manual remediation doesn't scale. When you manage 5 servers, SSH-ing in to fix configuration drift is feasible. When you manage 500, or 5,000, you need systems that detect problems and fix them automatically. Think of fleet management as a thermostat for your entire infrastructure: it continuously measures the actual state, compares it to the desired state, and corrects any drift. The cost of manual fleet management isn't just time; it's inconsistency. Server 47 ends up with a slightly different configuration than server 48 because someone patched one and forgot the other. These invisible inconsistencies surface as intermittent, maddening production issues.
This section covers the AWS fleet management services (AWS Systems Manager and AWS Config) used to maintain consistent configurations, detect drift, and automatically remediate non-compliant resources. How quickly would your team notice if 3 out of 500 instances silently drifted from their desired configuration?
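To make the thermostat analogy concrete, here is a minimal sketch of one measure-compare-correct pass over a simulated fleet. It is plain Python and makes no AWS calls. The `Instance` class, the `DESIRED` settings, and the instance IDs are hypothetical names invented for illustration. In a real AWS setup, detection would typically come from AWS Config rules and the correction from a Systems Manager action, not from an in-memory dictionary update.

```python
from dataclasses import dataclass, field

# Hypothetical desired state; the keys and values are illustrative only.
DESIRED = {"ntp_server": "time.example.internal", "agent_version": "2.4.1"}


@dataclass
class Instance:
    instance_id: str
    config: dict = field(default_factory=dict)


def detect_drift(instance, desired):
    """Return {setting: (actual, expected)} for every setting that differs."""
    return {
        key: (instance.config.get(key), expected)
        for key, expected in desired.items()
        if instance.config.get(key) != expected
    }


def reconcile(fleet, desired):
    """Run one pass of the loop: measure, compare, correct.

    Returns a report of the drift found, keyed by instance ID.
    """
    report = {}
    for inst in fleet:
        drift = detect_drift(inst, desired)
        if drift:
            report[inst.instance_id] = drift
            # Stand-in for a real remediation action (assumption).
            inst.config.update(desired)
    return report


# Simulate 500 compliant instances, then drift two of them by hand.
fleet = [Instance(f"i-{n:04d}", dict(DESIRED)) for n in range(1, 501)]
fleet[46].config["agent_version"] = "2.3.9"  # "server 47" got an old agent
del fleet[47].config["ntp_server"]           # "server 48" lost a setting

first_pass = reconcile(fleet, DESIRED)
second_pass = reconcile(fleet, DESIRED)

print(sorted(first_pass))  # IDs of the instances that had drifted
print(second_pass)         # empty once the fleet has converged
```

The second pass returns an empty report because the first pass already corrected the drift. That is the property to look for in any remediation loop: running it again on a compliant fleet should change nothing.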
