Docker Swarm scaling vs Gunicorn scaling

I have deployed an application in Docker Swarm with the following architecture:

[Image: architecture diagram (screenshot from 2021-11-26)]

It has 3 containers. Container #1 is the model inference container; it communicates with the other containers and provides inference.

All containers are Flask + Gunicorn applications.

Now my question is how we should scale the containers, because there are 2 types of scaling here:

  • One by scaling Gunicorn workers (more worker processes inside each container)
  • The other by scaling the Docker Swarm service (more replicas of each container)
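
To make the two options concrete, here is a rough sketch of how they combine. It is only an illustration under assumptions: the function names are made up for this example, the replica count of 3 is hypothetical, and the worker count uses the (2 x cores) + 1 rule of thumb from the Gunicorn documentation rather than anything measured on my setup.

```python
import multiprocessing


def suggested_gunicorn_workers(cpu_count: int) -> int:
    """Rule-of-thumb starting point from the Gunicorn docs: (2 x cores) + 1.

    This value would go into the `workers` setting (or the `--workers` flag)
    of each container's Gunicorn process.
    """
    return 2 * cpu_count + 1


def total_worker_processes(replicas: int, workers_per_container: int) -> int:
    """Worker processes available across all replicas of one Swarm service.

    `replicas` is the count set with `docker service scale <service>=<n>`.
    """
    return replicas * workers_per_container


if __name__ == "__main__":
    cores = multiprocessing.cpu_count()
    per_container = suggested_gunicorn_workers(cores)
    replicas = 3  # hypothetical replica count, for illustration only
    print("workers per container:", per_container)
    print("total workers for the service:", total_worker_processes(replicas, per_container))
```

So the two knobs multiply: Gunicorn workers scale a container up within the CPU and memory of the node it runs on, while Swarm replicas scale the service out, potentially across nodes. For the inference container, each extra worker or replica presumably loads its own copy of the model, so memory use would grow with both.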