Behavior of `up --wait` changed?

Hello,

My deploy script starts all our compose stacks with `--wait` and then checks the exit code upon completion.

A number of my stacks have an init service (which itself depends on the database) that runs any pending migrations and then exits 0.

So basically: the database starts, the init service starts, optionally runs migrations and then exits, all other services that depend on the init service start, and the stack is up.
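
The real files are bigger, but the pattern is roughly this (a simplified sketch; image names and the healthcheck are placeholders):

services:
  database:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 10

  migrate:
    image: application/migrate   # runs pending migrations, then exits 0
    depends_on:
      database:
        condition: service_healthy

  dashboard:
    image: application/dashboard
    depends_on:
      - migrate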

This was never an issue for `--wait` before, but now (I’m guessing since our quarterly server patching/update) it seems to result in an exit code of 1, which causes my pipeline to fail.

Is this behavior new, and is it on purpose?
If so, how can I exempt a service from it? (I would like to keep using `--wait`, as it gives my pipeline some assurance that everything is running correctly.)

Thank you.

Can you share the output of `docker info`? If it contains any private info, like private plugins or registry IPs, you can remove those parts from the output.

Also, do you mean that `docker compose up --wait` returns exit code 1? If and when it does, are your services all healthy? Is it possible that some services reached the wait timeout (`--wait-timeout`)?

(docker info at the bottom)

The stack starts in about 10 seconds. I use a timeout of 120s, but even with the default 60s it doesn’t come anywhere near that.

Regular `up` (I’ve replaced the internal name with “application” for privacy purposes):

$ docker compose up -d
[+] Running 5/5
 ✔ Network application-dashboard_internal       Created                                                                              0.1s 
 ✔ Container application-dashboard-database-1   Started                                                                              2.6s 
 ✔ Container application-dashboard-migrate-1    Started                                                                              5.8s 
 ✔ Container application-dashboard-pgadmin-1    Started                                                                              5.2s 
 ✔ Container application-dashboard-dashboard-1  Started                                                                              9.0s 
$ echo $?
0

With --wait:

$ docker compose up -d --wait --wait-timeout 120
[+] Running 4/5
 ✔ Network application-dashboard_internal       Created                                                                              0.1s 
 ✔ Container application-dashboard-database-1   Healthy                                                                             10.1s 
 ⠸ Container application-dashboard-migrate-1    Waiting                                                                             10.1s 
 ✔ Container application-dashboard-pgadmin-1    Healthy                                                                             10.1s 
 ✔ Container application-dashboard-dashboard-1  Healthy                                                                             10.0s 
container application-dashboard-migrate-1 exited (0)
$ echo $?
1

But the other 3 services are running healthy after that.
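
Checking the health state directly confirms it, e.g.:

$ docker inspect --format '{{.State.Health.Status}}' application-dashboard-dashboard-1
healthy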

docker info:

Client: Docker Engine - Community
 Version:    28.0.4
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.22.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.34.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose
  scan: Docker Scan (Docker Inc.)
    Version:  v0.23.0
    Path:     /usr/libexec/docker/cli-plugins/docker-scan

Server:
 Containers: 102
  Running: 81
  Paused: 0
  Stopped: 21
 Images: 433
 Server Version: 28.0.4
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: active
  NodeID: <snip>
  Is Manager: true
  ClusterID: <snip>
  Managers: 1
  Nodes: 1
  Default Address Pool: 10.0.0.0/8  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.123.14.29
  Manager Addresses:
   10.123.14.29:2377
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
 runc version: v1.2.5-0-g59923ef
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 4.18.0-553.46.1.el8_10.x86_64
 Operating System: Red Hat Enterprise Linux 8.10 (Ootpa)
 OSType: linux
 Architecture: x86_64
 CPUs: 192
 Total Memory: 1006GiB
 Name: <snip>
 ID: <snip>
 Docker Root Dir: /data/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 172.23.0.0/16, Size: 24

I have the same Compose version in Docker Desktop and could reproduce the issue, but only with a compose file that contains a service that exits after running, even when the exit code is 0. I would not expect that, so I would call it a bug that could be reported on GitHub, but it turns out it was already reported there in 2023.

You can join the discussion there.

As a workaround, someone there recommended using `sleep infinity`, but that would make the container run practically forever. I would instead sleep for a finite number of minutes or seconds larger than the wait timeout, so the container only stops after `docker compose` has already returned with exit code 0.

Update: an example:

services:
  migrate:
    image: bash
    init: true
    command:
      - sh
      - -c
      # do some work, then keep the container alive past the 120s
      # --wait-timeout so `up --wait` returns 0 before the exit
      - 'ls -la && sleep 130'

The service does indeed exit after the migrations are complete.

Thank you for helping to look into it and for discovering the bug report!
I guess that highlights one of the many “advantages” of having to update your servers from a company-provided mirror instead of the internet: you arrive at the party late, very late.
I didn’t even consider looking back that far.

Anyway, the last reply there provides a working workaround: using the explicit `service_completed_successfully` condition on the depending service to indicate that the dependency is allowed to terminate. We’ll be using that from now on.
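
In case it helps anyone else, the change amounts to this (a sketch, using the service names from my example above):

services:
  migrate:
    depends_on:
      database:
        condition: service_healthy
  dashboard:
    depends_on:
      migrate:
        condition: service_completed_successfully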


You are right, I thought it was about the init container service itself and overlooked the condition 🙂