Monitoring and Observability
Overview
This guide shows you how to set up monitoring and observability for your Aztec node using OpenTelemetry, Prometheus, and Grafana. Monitoring helps you maintain healthy node operations, diagnose issues quickly, and track performance over time.
This setup is designed for Aztec nodes deployed with Docker Compose.
Architecture
The monitoring stack uses three components working together:
- OpenTelemetry Collector: Receives metrics from your Aztec node over OTLP (the OpenTelemetry Protocol)
- Prometheus: Stores and queries time-series metrics data
- Grafana: Visualizes metrics with dashboards and alerts
Your Aztec node exports metrics to the OpenTelemetry Collector, which processes and exposes them in a format Prometheus can scrape. Prometheus stores the metrics as time-series data, and Grafana queries Prometheus to create visualizations and alerts.
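To make the scrape step concrete, here is a minimal sketch of reading the kind of payload Prometheus pulls from a scrape endpoint. It assumes the collector exposes metrics in the Prometheus text exposition format. The metric names and label values below are placeholders for illustration, not real Aztec metrics, and the parser handles only the common `name{labels} value` line shape.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
# Any trailing timestamp is ignored.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces. Escape sequences are kept as written.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Parse text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a sample line this sketch understands.
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Hypothetical payload: these names are placeholders, not real Aztec metrics.
EXAMPLE = """\
# HELP example_block_height Placeholder gauge for illustration.
# TYPE example_block_height gauge
example_block_height{node="node-1"} 1024
example_peer_count 12
"""

if __name__ == "__main__":
    for name, labels, value in parse_exposition(EXAMPLE):
        print(name, labels, value)
```

Prometheus performs this parsing itself on every scrape, so you would not normally write it by hand. The sketch is mainly useful for sanity-checking what the collector's scrape endpoint returns when a metric does not show up in Prometheus.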
Getting Started
Follow these guides in order to set up your complete monitoring stack:
- OpenTelemetry Collector Setup - Configure the OpenTelemetry Collector to receive metrics from your node
- Prometheus Setup - Set up Prometheus to store and query metrics
- Grafana Setup - Configure Grafana for visualization and alerting
- Key Metrics Reference - Understand the metrics your node exposes and create custom dashboards
- Complete Example and Troubleshooting - Full Docker Compose configuration and troubleshooting help
Available Metrics Overview
Your Aztec node exposes metrics through OpenTelemetry to help you monitor performance and health. The metrics available depend on your node type (full node, sequencer, or prover) and version.
Metric Categories
Your node exposes metrics in these categories:
- Node Metrics: Block height, sync status, peer count, and transaction processing
- Sequencer Metrics: Attestation activity, block proposals, and committee participation (sequencer nodes only)
- Prover Metrics: Job queue, proof generation, and agent utilization (prover nodes only)
- System Metrics: CPU, memory, disk I/O, and network bandwidth
For detailed information about each metric, PromQL queries, and dashboard creation, see the Key Metrics Reference.
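As a rough illustration of how a PromQL query reaches Prometheus outside Grafana, the sketch below builds a request for Prometheus's instant-query HTTP endpoint (`/api/v1/query`) and extracts values from the JSON it returns. The base URL assumes Prometheus listens on its default port 9090 on the local machine, and the sample response is hand-written for illustration. The query uses `up`, a metric Prometheus generates for every scrape target, so no Aztec metric names are assumed.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for Prometheus's instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def vector_values(response_body):
    """Extract (labels, value) pairs from an instant-query JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each entry holds a label set and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


# Hand-written sample response; the job label is a placeholder.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "example"},
             "value": [1700000000, "1"]},
        ],
    },
})

if __name__ == "__main__":
    # Assumed address: Prometheus on its default port on this machine.
    print(instant_query_url("http://localhost:9090", "up"))
    print(vector_values(SAMPLE_RESPONSE))
```

Against a running Prometheus instance, you could fetch the URL with `urllib.request.urlopen` and pass the decoded body to `vector_values`. A value of 1 for `up` means the last scrape of that target succeeded, which makes it a quick check that Prometheus can reach the collector.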
Next Steps
Once your monitoring stack is running:
- Review the Key Metrics Reference to understand available metrics and PromQL queries
- Set up alerting rules in Prometheus for critical conditions
- Create custom dashboards tailored to your operational needs
- Configure notification channels (Slack, PagerDuty, email) in Grafana
- Join the Aztec Discord to share dashboards with the community
For troubleshooting common monitoring issues, see the Complete Example and Troubleshooting guide.