How to monitor a Docker Swarm cluster and service replicas (tasks) with Prometheus and Grafana?

Hi everyone,

I have a Docker Swarm cluster running multiple services, and I’d like to monitor both the cluster nodes and the individual service replicas (tasks) using Prometheus and visualize the metrics in Grafana.

Specifically, I want to track:

  • CPU, RAM, and disk usage for each Swarm node

  • CPU and memory usage for each Swarm service (including per-task/replica metrics)

I’ve already set up Prometheus and Grafana, but I’m not sure what the best approach is for collecting metrics from:

  1. The Swarm manager and worker nodes

  2. Each container/task belonging to a service

You will probably need to run Node exporter (host-level CPU, RAM, and disk metrics) and cAdvisor (per-container metrics) on every node, and have Prometheus scrape both.

Is there a dashboard for monitoring CPU and memory usage for each Swarm service (including per-task/replica metrics)?

Personally, I haven’t tested it. There are also metrics available directly from the Docker daemon (doc).
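
If no ready-made dashboard fits, the per-service and per-task panels can be built from cAdvisor's container metrics. Below is a rough sketch, not a tested setup, of Python helpers that build the PromQL expressions to paste into Grafana panels. It assumes cAdvisor is exporting the Swarm container labels (com.docker.swarm.service.name and com.docker.swarm.task.name, which show up as the label names used below) and that Node exporter is being scraped; the helper names and the example service name "web" are made up for illustration, so check the label names against your own /metrics output first.

```python
# Label names cAdvisor derives from the labels Swarm sets on task containers.
# Assumption: your cAdvisor is configured to export container labels.
SERVICE_LABEL = "container_label_com_docker_swarm_service_name"
TASK_LABEL = "container_label_com_docker_swarm_task_name"


def task_cpu_query(service: str, window: str = "5m") -> str:
    """PromQL: CPU cores used by each task (replica) of one Swarm service."""
    return (
        f"sum by ({TASK_LABEL}) ("
        f'rate(container_cpu_usage_seconds_total{{{SERVICE_LABEL}="{service}"}}[{window}])'
        ")"
    )


def task_memory_query(service: str) -> str:
    """PromQL: working-set memory in bytes for each task of one Swarm service."""
    return (
        f"sum by ({TASK_LABEL}) ("
        f'container_memory_working_set_bytes{{{SERVICE_LABEL}="{service}"}}'
        ")"
    )


def node_cpu_percent_query(window: str = "5m") -> str:
    """PromQL: CPU utilisation in percent per node, from Node exporter."""
    return (
        "100 * (1 - avg by (instance) ("
        f'rate(node_cpu_seconds_total{{mode="idle"}}[{window}])'
        "))"
    )


if __name__ == "__main__":
    # "web" is a hypothetical service name; replace it with one of yours.
    print(task_cpu_query("web"))
    print(task_memory_query("web"))
    print(node_cpu_percent_query())
```

Grouping by the task label gives one series per replica; swapping it for the service label in the `sum by (...)` clause would give a single total per service instead.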