Why is my swarm scheduler not working?

My docker-compose.yaml file is as follows:

version: '3.9'
services:
  portainer:
    image: portainer/portainer-ce:2.9.3
    ports:
      - "8999:9000"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /home/deploy/docker/portainer_data:/data
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == manager]
    networks:
      - ccm_net 
  com-gateway:
    image: "registry.cn-shenzhen.aliyuncs.com/cloud_com/com-gateway:2024020902"
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /home/deploy/docker/server/logs/com-gateway/:/logs/com-gateway
    networks:
      - ccm_net
    deploy:
      replicas: 2
      restart_policy:
        condition: on-failure
  com-auth:
    image: "registry.cn-shenzhen.aliyuncs.com/cloud_com/com-auth:2024020902"
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /home/deploy/docker/server/logs/com-auth/:/logs/com-auth/
    networks:
      - ccm_net
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
  com-core:
    image: "registry.cn-shenzhen.aliyuncs.com/cloud_com/com-core:2024020902"
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /home/deploy/docker/server/logs/com-core/:/logs/com-core/
    networks:
      - ccm_net
    deploy:
      replicas: 2     
  com-adx:
    image: "registry.cn-shenzhen.aliyuncs.com/cloud_com/com-adx:2024020902"
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /home/deploy/docker/server/logs/com-adx/:/logs/com-adx/
    networks:
      - ccm_net
    deploy:
      replicas: 3   
  com-adx-callback:
    image: "registry.cn-shenzhen.aliyuncs.com/cloud_com/com-adx-callback:2024020902"
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /home/deploy/docker/server/logs/com-adx-callback/:/logs/com-adx-callback/
    networks:
      - ccm_net
    deploy:
      replicas: 2
  com-job:
    image: "registry.cn-shenzhen.aliyuncs.com/cloud_com/com-job:2024020902"
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /home/deploy/docker/server/logs/com-job:/logs/
    networks:
      - ccm_net
    deploy:
      replicas: 1
networks:
  ccm_net:
    driver: overlay
    ipam:
      config:
        - subnet: 192.168.50.0/24

When I use the command docker deploy -c docker-compose.yaml ccm --with-registry-auth, all services end up deployed on the manager node only. Both nodes report status Ready.

Node information:

ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
2gsvns4qjj8t2v0y4m4spwhlo *   iZwz99zhkxxl5cgfthbm3zZ   Ready     Active         Leader           26.1.1
3s9ly4etmbmbuwpu7cavd17a5     iZwz99zhkxxl5cgfthbm40Z   Ready     Active                          26.1.1

This can’t be the correct command. Please share the exact command.

Furthermore, please also share the output of docker info.

I just made a mistake; the correct command should be:
docker stack deploy -c docker-compose.yml ccm --with-registry-auth
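
For reference, the placement of the tasks can then be checked with the standard Docker CLI, for example:

docker stack ps ccm
docker service ls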

docker info
PS: I have re-established the cluster, so the node IDs may be different.

Client: Docker Engine - Community
 Version:    26.1.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 12
 Server Version: 26.1.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: active
  NodeID: u2558lf20v2gbn8yejaz37uxi
  Is Manager: true
  ClusterID: ys9xydvguii19a23dr1z63cnr
  Managers: 1
  Nodes: 2
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 172.20.206.172
  Manager Addresses:
   172.20.206.172:2377
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e377cd56a71523140ca6ae87e30244719194a521
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 3.10.0-1160.108.1.el7.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15GiB
 Name: iZwz99zhkxxl5cgfthbm3zZ
 ID: 06af0d32-3a54-4a7b-a299-4771d4417303
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://ikk9azq1.mirror.aliyuncs.com/
 Live Restore Enabled: false

Your compose file and the output of docker info look good to me.

I can only assume that the problem is caused by the bind mounts. In order to run swarm tasks, the host folder must exist on every node the task could be scheduled on.

Usually, we use named volumes backed by a remote share (nfsv4 recommended!) so that every swarm task can access the same state.

What is the output of docker stack ps ccm --no-trunc?

I was right with my assumption.
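
The tasks that were scheduled on the worker node are presumably rejected with an error along the lines of

invalid mount config for type "bind": bind source path does not exist: /home/deploy/docker/server/logs/…

because those bind source folders only exist on the manager node.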

I am not sure what this means…

Do I need to create the mount path specified in the YAML file on the worker node?

If you don’t care that a scheduled task on one node will have a different state than when it’s scheduled on another node: yes.

If the scheduled task should work with the same state, regardless of which node it is scheduled on, then you need to use a remote share accessible by both nodes.

You can either mount the remote share on each node into the same host path and continue to use binds (not recommended at all), or you can start using named volumes backed by a remote share (recommended if used with NFSv4).
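
A minimal sketch of what a named volume backed by NFSv4 could look like in the stack file; the NFS server address (192.0.2.10) and the export path are placeholders you would need to adapt:

volumes:
  com_gateway_logs:
    driver: local
    driver_opts:
      type: nfs
      # placeholder NFS server and mount options
      o: "addr=192.0.2.10,rw,nfsvers=4"
      # placeholder export path on the NFS server
      device: ":/export/ccm/com-gateway-logs"

services:
  com-gateway:
    volumes:
      # named volume instead of a host path bind
      - com_gateway_logs:/logs/com-gateway

With this, every node mounts the same NFS export when a task of the service is scheduled on it, so the state is identical regardless of the node.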


Docker Swarm will not create directories on the host/node for you. Regular Docker does.

So when using bind mounts with Docker Swarm, you need to create the directories first on all nodes the tasks may run on.
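
For the stack in this thread, that would mean creating the log directories on the worker node too, for example (paths taken from the compose file above; the brace expansion assumes a bash shell):

sudo mkdir -p /home/deploy/docker/server/logs/{com-gateway,com-auth,com-core,com-adx,com-adx-callback,com-job}

Portainer is pinned to the manager via its placement constraint, so /home/deploy/docker/portainer_data only needs to exist on the manager node.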


Thank you, your response helped me solve the problem. It was just as you described, and I have fixed it.