3.3.3.2. AWS Service Health Services (AWS Health, CloudWatch, Systems Manager OpsCenter)

Before troubleshooting your application, verify that the underlying AWS services are healthy. Some apparent application failures turn out to be AWS service disruptions.

AWS Health Dashboard:

Service health (formerly the Service Health Dashboard): Public status of all AWS services across all Regions
Your account health (formerly the Personal Health Dashboard, PHD): Events specific to your account (scheduled maintenance, service issues affecting your resources, abuse notifications)

Health event type categories (the eventTypeCategory field):

issue: Ongoing AWS service problem affecting your resources
accountNotification: Account-specific information (account administration or security notices)
scheduledChange: Upcoming change or maintenance that may require action
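As a minimal sketch of how these categories might drive triage, the helpers below group and order event dictionaries by their eventTypeCategory field. The function names and the ordering (issues first) are assumptions made for illustration, not part of the AWS Health API.

```python
from collections import defaultdict

# Assumed triage priority for illustration: active issues first.
TRIAGE_ORDER = ["issue", "scheduledChange", "accountNotification"]


def group_by_category(events):
    """Group Health event dicts by their 'eventTypeCategory' field."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.get("eventTypeCategory", "unknown")].append(event)
    return dict(grouped)


def sort_for_triage(events):
    """Return events ordered by TRIAGE_ORDER; unknown categories go last."""
    rank = {name: position for position, name in enumerate(TRIAGE_ORDER)}
    return sorted(
        events,
        key=lambda event: rank.get(event.get("eventTypeCategory"), len(rank)),
    )
```

Both helpers expect event dictionaries shaped like the items in the `events` list that `describe_events` returns.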

Using Health events for incident triage:

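import boto3

# Check if AWS is having issues before investigating your app
def check_aws_health(service, region):
    """Return open or upcoming AWS Health events for a service in a Region.

    Only the first page of results is returned.
    """
    # Use the us-east-1 endpoint regardless of where the events occur
    health = boto3.client('health', region_name='us-east-1')
    events = health.describe_events(filter={
        'services': [service],   # Health service code, e.g. 'EC2'
        'regions': [region],     # Region to check, e.g. 'eu-west-1'
        'eventStatusCodes': ['open', 'upcoming']
    })
    return events['events']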
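A single describe_events call returns one page of results. The sketch below is one possible extension that walks every page and then asks whether any open event is in the issue category. The function names are hypothetical, and the Health client is passed in as a parameter so the logic can be exercised without AWS credentials.

```python
def open_health_events(health_client, service, region):
    """Collect open or upcoming Health events across all result pages."""
    paginator = health_client.get_paginator("describe_events")
    pages = paginator.paginate(filter={
        "services": [service],
        "regions": [region],
        "eventStatusCodes": ["open", "upcoming"],
    })
    events = []
    for page in pages:
        events.extend(page.get("events", []))
    return events


def aws_issue_in_progress(health_client, service, region):
    """Return True if at least one open 'issue' event matches the filter."""
    return any(
        event.get("eventTypeCategory") == "issue"
        and event.get("statusCode") == "open"
        for event in open_health_events(health_client, service, region)
    )
```

In real use the client would typically be created with `boto3.client('health', region_name='us-east-1')` and passed as `health_client`, for example `aws_issue_in_progress(client, 'EC2', 'eu-west-1')`.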

Systems Manager (SSM) OpsCenter aggregates operational issues (OpsItems) from multiple sources into a single dashboard:

CloudWatch alarms can be configured to create OpsItems automatically
AWS Config noncompliance findings can create OpsItems
Custom OpsItems can be created by your own automation
Each OpsItem links to related resources, runbooks, and remediation actions
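A custom OpsItem from your own automation might be created as sketched below, using the Systems Manager create_ops_item call. The helper names, the `healthEventArn` operational-data key, and the default priority are assumptions chosen for illustration; the SSM client is passed in so the request-building logic can be tested on its own.

```python
def build_ops_item_request(title, description, source,
                           health_event_arn=None, priority=2):
    """Build keyword arguments for ssm.create_ops_item.

    Title, Description, and Source are required by the API.
    """
    request = {
        "Title": title,
        "Description": description,
        "Source": source,
        "Priority": priority,  # 1 (highest) to 5 (lowest)
    }
    if health_event_arn:
        # Hypothetical custom key linking the OpsItem to a Health event.
        request["OperationalData"] = {
            "healthEventArn": {
                "Value": health_event_arn,
                "Type": "SearchableString",
            }
        }
    return request


def create_ops_item(ssm_client, **kwargs):
    """Create the OpsItem and return its ID."""
    response = ssm_client.create_ops_item(**build_ops_item_request(**kwargs))
    return response["OpsItemId"]
```

With a real client this would be called as `create_ops_item(boto3.client('ssm'), title=..., description=..., source=...)`, where `source` is a label of your choosing that identifies the automation that raised the item.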

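Exam Trap: The AWS Health API is not served from every Region. Its active endpoint is in us-east-1 (with a passive endpoint in us-east-2 and a global endpoint, global.health.amazonaws.com), even for events in other Regions. If your automation runs in eu-west-1 and builds the Health client with its local Region, the call fails because no Health endpoint exists there. Configure the Health client with region_name='us-east-1'. The API also requires a Business, Enterprise On-Ramp, or Enterprise Support plan; without one, calls fail with SubscriptionRequiredException. For organization-level health events, you must call the API from the management account or a delegated administrator account.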
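For the organization-level case, a rough sketch using describe_events_for_organization could look like the following. It assumes organizational view is already enabled for AWS Health and that the caller runs in the management account or a delegated administrator account. The function name is hypothetical, and paging is done manually with nextToken.

```python
def open_org_events(health_client, service, region):
    """Collect open or upcoming events across the organization."""
    request = {
        "filter": {
            "services": [service],
            "regions": [region],
            "eventStatusCodes": ["open", "upcoming"],
        }
    }
    events = []
    while True:
        response = health_client.describe_events_for_organization(**request)
        events.extend(response.get("events", []))
        token = response.get("nextToken")
        if not token:
            return events
        # Ask for the next page on the following iteration.
        request["nextToken"] = token
```

As with the account-level call, the client passed in would normally be built with `region_name='us-east-1'`.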

Written by Alvin Varughese • Founder • 15 professional certifications