Docker CE version: usually the latest or close to it, currently 18.09.3.
I’ve been dreading moving to K8s, as it’s a big project and will take time we don’t have right now. Overall, Docker Swarm works well for me.
One issue: we have 6 workers and 3 managers, and over time things break down. Worker nodes end up stuck with about 100 MB of free memory (all nodes are 4 CPU / 16 GB), become borderline inaccessible, and eventually show as “down”.
The nodes have nothing running besides some corporate monitoring software and Docker CE (private / internal VMs, not public cloud).
The only way to recover is to restart the node. Is Docker Swarm managing memory badly when containers are stopped and started all the time? It’s a relatively high-load environment with ~13 dotnet core apps running in the cluster, lots of traffic, and currently lots of stops/starts as new versions are continuously deployed.
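To put numbers on this before the next restart, something like the sketch below could be run on each worker. It is not something we have deployed, just a minimal illustration: it parses the standard Linux /proc/meminfo file and flags a node when MemAvailable drops below a cutoff. The function names and the 200 MB threshold are made up for the example.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def is_low_memory(meminfo, threshold_mb=200):
    """Return True when available memory is below threshold_mb.

    threshold_mb is a hypothetical cutoff, not a recommended value.
    Falls back to MemFree on kernels that do not report MemAvailable.
    """
    available_kb = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    return available_kb < threshold_mb * 1024


if __name__ == "__main__":
    info = parse_meminfo(Path("/proc/meminfo").read_text())
    available_mb = info.get("MemAvailable", 0) // 1024
    print(f"MemAvailable: {available_mb} MB, low={is_low_memory(info)}")
```

Logging this every minute or so alongside deploy times would at least show whether memory drops gradually or in steps after each stop/start.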
I’ve read things about bad OOM handling, etc.
What is the recommended setup here, or am I doing something wrong?