Amazon CloudWatch: Metrics, Logs, and Alarms

April 25, 2026

When you run systems on AWS, monitoring the state of your resources and detecting anomalies early is essential. Amazon CloudWatch is a managed service that centralizes monitoring and observability (the ability to understand a system's internal state from the outside) for AWS resources and applications. This article covers CloudWatch's core concepts, how metrics, logs, and alarms work, and basic operations using the AWS CLI.

What is Amazon CloudWatch

CloudWatch is a fully managed monitoring and observability service from AWS. It automatically collects metrics from AWS resources such as EC2 instances, RDS, and Lambda, and provides a wide range of monitoring capabilities including dashboards, notifications, and log management.

With CloudWatch, you can monitor your system's health in real time, receive alarm notifications when issues occur, and analyze logs to identify root causes.

Reference: What is Amazon CloudWatch? - Amazon CloudWatch (docs.aws.amazon.com)

Key Features

CloudWatch is made up of four main capabilities.

Feature	Overview
Metrics	Collects and displays performance data such as CPU usage and network throughput as a time series
Logs	Collects, stores, and searches logs output by applications and AWS services
Alarms	Triggers notifications or automated actions when a metric exceeds a threshold
Dashboards	Visualizes multiple metrics and logs on a single screen

By combining these features, you can build comprehensive monitoring across both infrastructure and application layers.

Metrics

Metrics are the core of CloudWatch. They store performance data from AWS services as time series and visualize it as graphs.

AWS services like EC2 and RDS automatically send standard metrics. If you want to send your own application-specific data, you can publish custom metrics with any value to CloudWatch.

Metrics are grouped by namespace. AWS service metrics fall under namespaces like AWS/EC2 or AWS/RDS, while custom metrics go into a namespace you choose. Within a namespace, each metric is identified by its name plus key-value pairs called dimensions. For example, EC2 metrics use the instance ID as a dimension.
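As a rough sketch of publishing a custom metric, the function below wraps aws cloudwatch put-metric-data. The namespace MyApp, the metric name QueueDepth, and the Environment dimension are placeholders invented for illustration; they do not come from this article.

```shell
# Hypothetical helper: publish one datapoint of a custom metric.
# "MyApp", "QueueDepth", and the Environment dimension are placeholder names.
publish_queue_depth() {
    depth="$1"              # numeric value to record
    env_name="${2:-prod}"   # value for the Environment dimension
    # put-metric-data takes dimensions in Key=Value shorthand.
    aws cloudwatch put-metric-data \
        --namespace MyApp \
        --metric-name QueueDepth \
        --dimensions "Environment=${env_name}" \
        --value "$depth" \
        --unit Count
}
```

Calling publish_queue_depth 42 staging would record the value 42 under MyApp/QueueDepth with Environment=staging. Custom namespaces should not start with AWS/, which is reserved for AWS services.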

Item	Details
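Resolution	The default interval varies by service; for example, EC2 basic monitoring publishes at 5-minute intervals (1-minute with detailed monitoring enabled). Custom metrics can be published at high resolution, down to 1 second
Retention	Depends on the period: datapoints with a period under 60 seconds are kept for 3 hours, 1-minute datapoints for 15 days, 5-minute datapoints for 63 days, and 1-hour datapoints for 455 days (15 months)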
Free tier	Up to 10 custom metrics and 1 million API requests per month at no charge

Reference: Metrics in Amazon CloudWatch - Amazon CloudWatch (docs.aws.amazon.com)

CloudWatch Logs

CloudWatch Logs is a feature for centralizing the management of logs output by applications and AWS services.

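Logs are organized into units called log groups, each containing multiple log streams. For example, Lambda creates one log group per function by default, and each execution environment (a running instance of the function) writes to its own log stream, so a single stream typically holds the logs of many invocations.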

To search and analyze logs, you use CloudWatch Logs Insights, a query feature whose language chains commands such as fields, filter, and stats with pipes to filter and aggregate log data. For instance, you can aggregate error log counts over time or filter logs by a specific request ID.
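The error-count aggregation mentioned above might look like the following sketch. The query text uses the Logs Insights commands fields, filter, and stats; the helper name and the assumption that error lines contain the literal string ERROR are illustrative only.

```shell
# Logs Insights query: count lines containing "ERROR" in 5-minute buckets.
ERROR_COUNT_QUERY='fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) as errorCount by bin(5m)'

# Hypothetical helper: start the query over the last hour for one log group
# and print the query ID. The log group name is supplied by the caller.
start_error_count_query() {
    log_group="$1"
    end_time=$(date +%s)               # now, in Unix epoch seconds
    start_time=$((end_time - 3600))    # one hour earlier
    aws logs start-query \
        --log-group-name "$log_group" \
        --start-time "$start_time" \
        --end-time "$end_time" \
        --query-string "$ERROR_COUNT_QUERY" \
        --query queryId \
        --output text
}
```

Queries run asynchronously, so the printed ID is then passed to aws logs get-query-results --query-id to fetch the rows once the query has completed.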

Reference: What is Amazon CloudWatch Logs? - Amazon CloudWatch Logs (docs.aws.amazon.com)

Alarms

Alarms continuously monitor metric values and automatically trigger notifications or actions when a configured condition is met.

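When configuring an alarm, the most important settings are the threshold, the period, and the number of evaluation periods. The threshold is the boundary value the metric is compared against. The period is the length of time over which each datapoint is aggregated, and the evaluation periods setting is how many of the most recent datapoints are checked. By default, all of those datapoints must breach the threshold before the alarm enters the ALARM state (the optional "datapoints to alarm" setting relaxes this to M out of N). If you don't want to fire on temporary spikes, set the evaluation periods to 2 or more.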
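To make the interaction between the threshold and the evaluation periods concrete, here is a deliberately simplified model written as a shell function. It is not CloudWatch's actual implementation: it assumes a greater-than-or-equal comparison, ignores CloudWatch's handling of missing data, and the function name is made up for this example.

```shell
# Simplified model of alarm evaluation (not CloudWatch's real algorithm).
# Usage: evaluate_alarm THRESHOLD DATAPOINTS_TO_ALARM VALUE...
# The VALUEs are the datapoints of the most recent evaluation periods.
evaluate_alarm() {
    threshold="$1"
    needed="$2"
    shift 2
    if [ "$#" -eq 0 ]; then
        echo INSUFFICIENT_DATA
        return
    fi
    # Count datapoints at or above the threshold (awk handles decimals).
    breaching=$(printf '%s\n' "$@" |
        awk -v t="$threshold" '$1 >= t { n++ } END { print n + 0 }')
    if [ "$breaching" -ge "$needed" ]; then
        echo ALARM
    else
        echo OK
    fi
}
```

With a threshold of 80 and two datapoints required, evaluate_alarm 80 2 95 40 prints OK because only one datapoint breaches (a temporary spike), while evaluate_alarm 80 2 95 82.5 prints ALARM.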

An alarm has three states and transitions from OK to ALARM when its condition is met.

State	Meaning
OK	The metric is within the configured threshold
ALARM	The metric has exceeded the threshold and the alarm condition is met
INSUFFICIENT_DATA	Not enough data to determine the alarm state

When an alarm enters the ALARM state, it can publish to an Amazon SNS topic, which in turn can deliver notifications to destinations such as email or Slack, or it can trigger actions such as stopping, rebooting, terminating, or recovering an EC2 instance.
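As one possible illustration of an EC2 action, the sketch below creates an alarm that stops an idle instance by using an action ARN of the form arn:aws:automate:REGION:ec2:stop. The helper name, the alarm name, the 5% threshold, and the one-hour window are all illustrative choices, and the account needs the permissions CloudWatch requires to perform EC2 actions.

```shell
# Hypothetical helper: stop an instance whose average CPU stays below 5%
# for 12 consecutive 5-minute periods (one hour).
# Usage: create_stop_when_idle_alarm INSTANCE_ID REGION
create_stop_when_idle_alarm() {
    instance_id="$1"
    region="$2"
    aws cloudwatch put-metric-alarm \
        --region "$region" \
        --alarm-name "stop-when-idle-${instance_id}" \
        --namespace AWS/EC2 \
        --metric-name CPUUtilization \
        --dimensions "Name=InstanceId,Value=${instance_id}" \
        --statistic Average \
        --period 300 \
        --evaluation-periods 12 \
        --threshold 5 \
        --comparison-operator LessThanThreshold \
        --alarm-actions "arn:aws:automate:${region}:ec2:stop"
}
```

Replacing the final path segment of the action ARN with reboot, terminate, or recover selects the other EC2 actions.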

Reference: Using Amazon CloudWatch alarms - Amazon CloudWatch (docs.aws.amazon.com)

Trying CloudWatch with the AWS CLI

Let's use the AWS CLI to explore metrics and practice creating and deleting alarms.

Listing Metrics

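List the CPUUtilization metrics in the AWS/EC2 namespace. Use --namespace to specify the target namespace and --metric-name to specify the metric name. Without a --dimensions filter, the result contains one entry per instance that has recently reported the metric.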

❯ aws cloudwatch list-metrics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization
{
    "Metrics": [
        {
            "Namespace": "AWS/EC2",
            "MetricName": "CPUUtilization",
            "Dimensions": [
                {
                    "Name": "InstanceId",
                    "Value": "i-0a1b2c3d4e5f67890"
                }
            ]
        }
    ]
}

Getting Metric Statistics

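Fetch the average CPU utilization for a one-hour window in 5-minute intervals. Use --start-time and --end-time to define the time range, and --period to set the aggregation interval in seconds. Datapoints are not guaranteed to be returned in chronological order. The sample output below shows timestamps with a +09:00 offset, so 11:00:00+09:00 corresponds to 02:00:00Z.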

❯ aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-0a1b2c3d4e5f67890 \
    --start-time 2026-04-25T01:30:00Z \
    --end-time 2026-04-25T02:30:00Z \
    --period 300 \
    --statistics Average
{
    "Label": "CPUUtilization",
    "Datapoints": [
        {
            "Timestamp": "2026-04-25T11:00:00+09:00",
            "Average": 0.2966666666666667,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2026-04-25T10:55:00+09:00",
            "Average": 0.2502405052421769,
            "Unit": "Percent"
        }
    ]
}
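Because the datapoints can come back unordered, a small wrapper like the following can sort them on the client side with a JMESPath --query expression. This is a sketch under two assumptions: the helper name is made up, and your AWS CLI version accepts Unix epoch seconds for --start-time and --end-time (ISO 8601 strings, as used above, also work).

```shell
# Hypothetical helper: print "timestamp<TAB>average" rows for the last hour,
# oldest first, for the given EC2 instance ID.
cpu_last_hour() {
    instance_id="$1"
    end_time=$(date +%s)               # now, in Unix epoch seconds
    start_time=$((end_time - 3600))    # one hour earlier
    aws cloudwatch get-metric-statistics \
        --namespace AWS/EC2 \
        --metric-name CPUUtilization \
        --dimensions "Name=InstanceId,Value=${instance_id}" \
        --start-time "$start_time" \
        --end-time "$end_time" \
        --period 300 \
        --statistics Average \
        --query 'sort_by(Datapoints, &Timestamp)[].[Timestamp, Average]' \
        --output text
}
```

For example, cpu_last_hour i-0a1b2c3d4e5f67890 would query the instance used throughout this article.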

Creating an Alarm

Configure an alarm that fires when the 5-minute average CPU utilization is at or above 80% for two consecutive periods. Specifying an SNS topic ARN with --alarm-actions sends a notification to that topic when the alarm enters the ALARM state.

❯ aws cloudwatch put-metric-alarm \
    --alarm-name high-cpu-alarm \
    --metric-name CPUUtilization \
    --namespace AWS/EC2 \
    --dimensions Name=InstanceId,Value=i-0a1b2c3d4e5f67890 \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold 80 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:exrecord-topic

Checking Alarm State

Check the current state of the alarm you created. A StateValue of OK means the metric is within the normal range; ALARM means the threshold condition has been met.

❯ aws cloudwatch describe-alarms --alarm-names high-cpu-alarm
{
    "MetricAlarms": [
        {
            "AlarmName": "high-cpu-alarm",
            "AlarmArn": "arn:aws:cloudwatch:us-east-1:123456789012:alarm:high-cpu-alarm",
            "AlarmConfigurationUpdatedTimestamp": "2026-04-25T11:13:07.691000+09:00",
            "ActionsEnabled": true,
            "OKActions": [],
            "AlarmActions": [
                "arn:aws:sns:us-east-1:123456789012:exrecord-topic"
            ],
            "InsufficientDataActions": [],
            "StateValue": "OK",
            "StateReason": "Threshold Crossed: 2 datapoints [0.3124812407599757 (25/04/26 02:09:00), 0.3033163083601058 (25/04/26 02:04:00)] were not greater than or equal to the threshold (80.0).",
            "StateReasonData": "{\"version\":\"1.0\",\"queryDate\":\"2026-04-25T02:14:13.262+0000\",\"startDate\":\"2026-04-25T02:04:00.000+0000\",\"statistic\":\"Average\",\"period\":300,\"recentDatapoints\":[0.3033163083601058,0.3124812407599757],\"threshold\":80.0,\"evaluatedDatapoints\":[{\"timestamp\":\"2026-04-25T02:09:00.000+0000\",\"sampleCount\":4.0,\"value\":0.3124812407599757}]}",
            "StateUpdatedTimestamp": "2026-04-25T11:14:13.263000+09:00",
            "MetricName": "CPUUtilization",
            "Namespace": "AWS/EC2",
            "Statistic": "Average",
            "Dimensions": [
                {
                    "Name": "InstanceId",
                    "Value": "i-0a1b2c3d4e5f67890"
                }
            ],
            "Period": 300,
            "EvaluationPeriods": 2,
            "Threshold": 80.0,
            "ComparisonOperator": "GreaterThanOrEqualToThreshold",
            "StateTransitionedTimestamp": "2026-04-25T11:14:13.263000+09:00"
        }
    ],
    "CompositeAlarms": []
}
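When only the state matters, the full JSON is more than you need. The helpers below are a sketch that trims the output with --query and --output text; the function names are hypothetical.

```shell
# Hypothetical helper: print "alarm-name<TAB>state" for one alarm.
alarm_state() {
    aws cloudwatch describe-alarms \
        --alarm-names "$1" \
        --query 'MetricAlarms[].[AlarmName, StateValue]' \
        --output text
}

# Hypothetical helper: list the names of all metric alarms currently in ALARM.
alarms_firing() {
    aws cloudwatch describe-alarms \
        --state-value ALARM \
        --query 'MetricAlarms[].AlarmName' \
        --output text
}
```

Running alarm_state high-cpu-alarm would print the alarm name followed by its current state on a single line.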

Deleting an Alarm

Delete an alarm that's no longer needed. Deleting an alarm does not affect the underlying metric data, which remains intact.

❯ aws cloudwatch delete-alarms --alarm-names high-cpu-alarm

Summary

Amazon CloudWatch is a managed service that handles monitoring and observability for both AWS resources and applications
Metrics let you collect and visualize performance data like CPU and network usage as time series
CloudWatch Logs centralizes log management for applications and AWS services, with Logs Insights for query-based analysis
Alarms monitor metric thresholds and automatically execute actions like SNS notifications or EC2 operations when conditions are met
