I have deployed an application in Docker Swarm, which has the following architecture:
It has 3 containers. Container #1 is the model inference container; it communicates with the other containers and provides inference.
All containers are Flask + Gunicorn applications.
Now my question is how we should scale the containers, because there are 2 types of scaling here:
- One by using Gunicorn worker scaling
- Other by using Docker swarm service scaling
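
To make the two options concrete, here is a minimal sketch of a Gunicorn config file, assuming a hypothetical `CPU_LIMIT` environment variable that I would set per service to match its CPU reservation. The port and the service name in the comments are also placeholders, not my real setup. The worker count uses the `2 * cores + 1` starting point suggested in the Gunicorn docs.

```python
# gunicorn.conf.py (sketch; CPU_LIMIT, the port and the service name are assumptions)
import multiprocessing
import os


def workers_for(cpus: int) -> int:
    """Starting point from the Gunicorn docs: (2 x cores) + 1 workers."""
    return 2 * cpus + 1


# Inside a container, multiprocessing.cpu_count() may report the host's CPUs
# rather than the container's CPU limit, so an explicit override is allowed.
cpus = int(os.environ.get("CPU_LIMIT", multiprocessing.cpu_count()))

# Scaling type 1: more Gunicorn worker processes inside each container.
bind = "0.0.0.0:5000"
workers = workers_for(cpus)

# Scaling type 2: more containers (replicas) of the same Swarm service,
# done from the CLI rather than in this file, e.g.:
#   docker service scale inference=3
```

With this layout, the total number of worker processes for one service would be roughly `workers` multiplied by the number of replicas, so the two settings interact with each other.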