I'm using cAdvisor to collect metrics from all containers. It is deployed as a global service, so there is one replica on each host.
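As a sketch, such a global cAdvisor service could be created like this (the published port and bind mounts are the usual ones from cAdvisor's own run instructions; adjust image and mounts to your environment):

```
docker service create --name cadvisor --mode global \
  --publish 8080:8080 \
  --mount type=bind,source=/,target=/rootfs,readonly \
  --mount type=bind,source=/var/run,target=/var/run \
  --mount type=bind,source=/sys,target=/sys,readonly \
  --mount type=bind,source=/var/lib/docker,target=/var/lib/docker,readonly \
  google/cadvisor
```

With `--mode global`, Swarm schedules exactly one task per node, which is what gives you per-host metrics.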
Prometheus polls this cAdvisor service on each node, and after that you can query Prometheus.
These are the basic configuration lines to scrape cAdvisor from Prometheus (the `dns_sd_configs` key is implied by the DNS-based discovery described below):

  - job_name: 'cadvisor'
    dns_sd_configs:
      - names: ['tasks.cadvisor']
Prometheus is also deployed as a service, and we use the internal Docker DNS resolver to find the tasks of the "cadvisor" service on the exposed port.
You can query Prometheus with a simple curl or whatever tool you want; this query gives the total CPU consumed by a service (all replicas of the service across the nodes in the cluster).
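The query itself did not survive formatting above; as a hedged illustration, a PromQL expression of this shape would sum CPU usage across all replicas of one service (`SERVICE_NAME` is a placeholder, and the `container_label_com_docker_swarm_service_name` label is the one cAdvisor derives from the Docker Swarm service-name label):

```
sum(rate(container_cpu_usage_seconds_total{container_label_com_docker_swarm_service_name="SERVICE_NAME"}[1m]))
```

`rate(...[1m])` turns the cumulative CPU counter into per-second usage, and `sum` aggregates the replicas scattered over different nodes.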
Or you can write your own custom query if you need to scale based on memory or any other metric.
After that, with this per-service CPU usage metric, you can decide whether the service needs to be scaled or not.
If you need to scale, you can get the number of running replicas:
# REPLICAS=$(docker service ps SERVICE_NAME | grep Running | wc -l)
And launch the update command (the increment needs shell arithmetic, and the service name must be given):

# docker service update --replicas $((REPLICAS + 1)) SERVICE_NAME
You can also deploy Prometheus as a Docker service, and it shouldn't take long to write a custom script that polls Prometheus and the Docker manager, gathers the data, and makes the scaling decision.
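As a minimal sketch of that decision logic (everything here is illustrative: the Prometheus endpoint, the 0.8 threshold, and SERVICE_NAME are assumptions, not part of the original setup):

```shell
#!/bin/sh
# Hypothetical autoscaler sketch: threshold, service name and Prometheus
# endpoint below are assumptions, not part of the original setup.

# decide USAGE THRESHOLD -> prints "scale" if usage exceeds the threshold.
# awk does the floating-point comparison that plain sh cannot.
decide() {
  awk -v u="$1" -v t="$2" 'BEGIN { if (u > t) print "scale"; else print "ok" }'
}

# In the real script you would fetch the usage from Prometheus, e.g.:
#   usage=$(curl -s "http://prometheus:9090/api/v1/query?query=$QUERY" \
#           | jq -r '.data.result[0].value[1]')
# and act on the decision:
#   if [ "$(decide "$usage" 0.8)" = "scale" ]; then
#     REPLICAS=$(docker service ps SERVICE_NAME | grep Running | wc -l)
#     docker service update --replicas $((REPLICAS + 1)) SERVICE_NAME
#   fi

decide 0.9 0.8   # prints "scale"
```

Keeping the decision in a small function like this makes the threshold logic easy to test without touching Docker or Prometheus.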