2.2.1. CloudWatch Logs, Log Groups, and Logs Insights
💡 First Principle: A log is only useful if you can find the relevant entry within seconds, not hours. CloudWatch Logs is more than a storage destination: it is a queryable analytics engine that can extract metrics from unstructured text in near real time.
Log Hierarchy:
Log Groups are the containers for logs from the same source. You set retention on the log group (1 day to 10 years, or never expire). The default is never expire, so if you don't set retention, logs accumulate indefinitely and you keep paying for their storage.
Log Streams are sequences of events from a single source within a group. Each EC2 instance typically writes to its own log stream; each Lambda execution environment writes to its own stream.
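As a minimal sketch of setting retention on a log group, the helper below rounds a desired period up to a value the service accepts and then calls `put_retention_policy`. The function names are hypothetical, `logs_client` is assumed to be a `boto3.client("logs")` instance, and the list of allowed values reflects the API at the time of writing and should be checked against the current reference.

```python
import bisect

# Retention periods (days) accepted by CloudWatch Logs at the time of writing.
# Assumption: verify this list against the current API reference before relying on it.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]


def nearest_allowed_retention(desired_days: int) -> int:
    """Return the smallest allowed retention period that covers desired_days."""
    if desired_days < 1:
        raise ValueError("retention must be at least 1 day")
    idx = bisect.bisect_left(ALLOWED_RETENTION_DAYS, desired_days)
    if idx == len(ALLOWED_RETENTION_DAYS):
        raise ValueError("beyond 10 years; leave retention unset (never expire)")
    return ALLOWED_RETENTION_DAYS[idx]


def set_retention(logs_client, log_group_name: str, desired_days: int) -> int:
    """Apply a retention policy to one log group and return the days applied.

    logs_client is assumed to be boto3.client("logs").
    """
    days = nearest_allowed_retention(desired_days)
    logs_client.put_retention_policy(
        logGroupName=log_group_name,
        retentionInDays=days,
    )
    return days
```

For example, `set_retention(client, "/my-app/prod", 45)` would apply a 60-day policy, because 45 is not an accepted value; the log group name here is a placeholder.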
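Metric Filters transform log content into CloudWatch metrics. For example: match every log event that contains the term ERROR, count the matches, and publish them as a custom metric named ErrorCount. (In filter pattern syntax, square brackets describe space-delimited fields, so a literal [ERROR] has to be written in double quotes.) This lets you alarm on log-level events without a separate log analysis tool.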
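The metric filter described above could be created along these lines. This is a sketch under assumptions: the filter name and the `MyApp/Logs` namespace are made up for illustration, and `logs_client` is assumed to be a `boto3.client("logs")` instance.

```python
def error_metric_filter_params(log_group_name: str) -> dict:
    """Build the keyword arguments for put_metric_filter."""
    return {
        "logGroupName": log_group_name,
        "filterName": "error-count",      # hypothetical filter name
        # Matches events containing the term ERROR. A literal "[ERROR]"
        # would need double quotes inside the pattern.
        "filterPattern": "ERROR",
        "metricTransformations": [
            {
                "metricName": "ErrorCount",
                "metricNamespace": "MyApp/Logs",  # hypothetical namespace
                "metricValue": "1",               # add 1 per matching event
                "defaultValue": 0.0,              # value reported when nothing matches
            }
        ],
    }


def create_error_metric_filter(logs_client, log_group_name: str) -> dict:
    """Create the filter; logs_client is assumed to be boto3.client("logs")."""
    params = error_metric_filter_params(log_group_name)
    logs_client.put_metric_filter(**params)
    return params
```

Setting `defaultValue` to zero means periods without matches report 0 rather than no data, which tends to make alarm behavior easier to reason about.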
CloudWatch Logs Insights is the query engine. It uses its own pipe-delimited query syntax, in which each command passes its output to the next:
fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) as error_count by bin(5m)
| sort error_count desc
| limit 20
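Outside the console, a query like this is started and then polled for results. The sketch below assumes `logs_client` is a `boto3.client("logs")` instance; the function name, polling interval, and poll limit are illustrative choices, and the query string sorts by the aggregated count.

```python
import time

ERROR_COUNT_QUERY = """
fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) as error_count by bin(5m)
| sort error_count desc
| limit 20
"""


def run_insights_query(logs_client, log_group_name, query, start_time, end_time,
                       poll_seconds=1.0, max_polls=60):
    """Start a Logs Insights query and poll until it finishes.

    start_time and end_time are epoch seconds. Returns one dict per result
    row, mapping field name to value.
    """
    query_id = logs_client.start_query(
        logGroupName=log_group_name,
        startTime=start_time,
        endTime=end_time,
        queryString=query,
    )["queryId"]

    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            return [
                {cell["field"]: cell["value"] for cell in row}
                for row in response["results"]
            ]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)  # still Scheduled or Running
    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

The start-then-poll shape reflects that each query is a one-off job over a fixed time range, not a standing subscription.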
Key Insights commands:
- fields: select specific log fields to display
- filter: filter events (like WHERE in SQL)
- stats: aggregate (count, sum, avg, min, max)
- sort: order results
- parse: extract structured data from unstructured text using regex or glob patterns
- limit: restrict result count
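To make the semantics of these commands concrete, here is a local Python analogue of a filter, stats-by-bin, sort, limit pipeline. It is only an illustration of what each stage does to the data, not how Logs Insights is implemented; the function name and the `(epoch_seconds, message)` input shape are assumptions.

```python
import re
from collections import Counter


def top_error_bins(events, bin_seconds=300, limit=20):
    """Emulate: filter /ERROR/ | stats count(*) by bin(5m) | sort count desc | limit N.

    events is an iterable of (epoch_seconds, message) pairs.
    Returns a list of (bin_start_epoch_seconds, count) tuples.
    """
    pattern = re.compile(r"ERROR")
    counts = Counter()
    for timestamp, message in events:
        # filter: keep only events whose message matches the pattern
        if pattern.search(message):
            # bin(5m): round the timestamp down to the start of its bucket
            bin_start = timestamp - (timestamp % bin_seconds)
            # stats count(*): one count per bucket
            counts[bin_start] += 1
    # sort: highest count first, ties broken by earliest bucket
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # limit: cap the number of rows returned
    return ordered[:limit]
```

After the stats stage only the bucket and the aggregate remain, so later stages can refer only to those two columns.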
Subscription Filters stream log data in near real-time to other services:
| Destination | Use Case |
|---|---|
| AWS Lambda | Real-time processing, alerts, enrichment |
| Amazon Kinesis Data Streams | High-volume streaming analytics |
| Amazon Data Firehose | Delivery to S3, Redshift, OpenSearch |
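For the Lambda destination, CloudWatch Logs delivers each batch as base64-encoded, gzip-compressed JSON under `event["awslogs"]["data"]`. The handler below is a minimal sketch of decoding that payload and picking out matching messages; the function names and the `"ERROR"` substring check are illustrative assumptions, and a real handler would forward matches to whatever alerting channel is in use.

```python
import base64
import gzip
import json


def decode_subscription_event(event: dict) -> dict:
    """Decode the payload that a subscription filter delivers to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Return the messages in this batch that contain the string ERROR."""
    payload = decode_subscription_event(event)
    # Skip anything that is not a batch of log events (for example,
    # control messages used to check that a destination is reachable).
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    matches = [
        log_event["message"]
        for log_event in payload["logEvents"]
        if "ERROR" in log_event["message"]
    ]
    for message in matches:
        # Placeholder for real alerting or enrichment logic.
        print(f"{payload['logGroup']}/{payload['logStream']}: {message}")
    return matches
```

The decoded payload also carries `logGroup`, `logStream`, and `subscriptionFilters`, so one function can serve several log groups and route on those fields.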
⚠️ Exam Trap: CloudWatch Logs Insights queries run on demand; they are not continuous. If you need real-time streaming analysis, use a subscription filter with a destination. Logs Insights is for investigation and ad-hoc queries, not continuous monitoring.
Reflection Question: A security team wants to be alerted within 60 seconds whenever a specific error string appears in application logs. What combination of CloudWatch features would you configure?