Skip to content

Fix rollout failing when existing containers are in exited state#62

Open
Fikralaksana wants to merge 3 commits into
wowu:mainfrom
Fikralaksana:patch-1
Open

Fix rollout failing when existing containers are in exited state#62
Fikralaksana wants to merge 3 commits into
wowu:mainfrom
Fikralaksana:patch-1

Conversation

@Fikralaksana

Copy link
Copy Markdown

Summary
Fixes #20

When a service has containers in an exited state, docker rollout would attempt to restart the old (failed) container due to the --no-recreate flag, instead of creating a new container with the updated compose configuration.

This PR adds container state detection before starting services:

  • Fetches container state via docker compose ps -a --format json
  • If no containers exist at all, behaves as before (starts with --no-recreate)
  • If existing containers are in exited state, runs up --detach without --no-recreate, allowing Docker to recreate the container with the updated config
  • If containers are running normally, continues with the original --no-recreate behavior

Problem

  1. Deploy v1 of a service — it crashes and enters Exited state
  2. Fix the issue in docker-compose.yml (e.g., remove bad entrypoint)
  3. Run docker rollout — it restarts the old broken container instead of creating a new one with the fix

@wowu wowu left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

All changes to docker rollout must work in POSIX sh, work with docker compose v1 and v2, and don't expect any special tools like jq to be installed on the machine.

Comment thread docker-rollout Outdated
fikrapdso and others added 2 commits May 19, 2026 17:13
Use `docker compose ps --quiet` (running) vs `ps -a --quiet` (all) to
detect exited containers without parsing JSON. Drops the jq dependency
to keep the script portable to plain POSIX sh, per maintainer feedback.

Also removes the `==` bashism and the `exit 0` inside the piped
`while read` (which only exited the subshell, never the script).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Diff `ps -a --quiet` against `ps --quiet` to identify the specific
stopped container IDs, then `docker rm` only those and refill with
`compose up --detach --no-recreate`. The running replicas are left
untouched, preserving the zero-downtime contract when config has
drifted (the common case when rollout is invoked).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Fikralaksana Fikralaksana requested a review from wowu June 3, 2026 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scaling not working when old instance have status Exited

3 participants