Docker container started from inside another container "disappears"

Not really sure if “General Discussions” is the right place for this, but I don’t really see a Help forum, so feel free to move this to a more fitting place.

This is my situation:
I am trying to create a Jenkins setup that will utilize docker images to build various repositories.
For that purpose, Jenkins connects via ssh to a running docker container (which is on AWS, but I don’t think that is relevant?).
That container is set up to use the host’s Docker daemon by mounting its socket (-v /var/run/docker.sock:/var/run/docker.sock). Using Docker this way works: I manually SSH’d into the container and was able to use the host’s Docker normally from within the container to build & run various images.
The repository I am testing has a build hook, which notifies Jenkins on commit.
It also has a Jenkinsfile with a completely bare-bones script (it’s really just for testing Jenkins at this point):

pipeline {
    agent {
        docker {
            label 'base_agent_1'
            image 'lalalala.dkr.ecr.us-east-2.amazonaws.com/base:latest'
            registryUrl 'https://lalalala.dkr.ecr.us-east-2.amazonaws.com'
        }
    }

    stages {
        stage('STAGE: Build') {
            steps {
                sh 'docker-credential-ecr-login version'
            }
        }
        stage('STAGE: Run') {
            steps {
                sh 'docker-credential-ecr-login version'
            }
        }
        stage('STAGE: Test') {
            steps {
                sh 'docker-credential-ecr-login version'
            }
        }
        stage('STAGE: Push') {
            steps {
                sh 'docker-credential-ecr-login version'
            }
        }
    }

    post {
        always {
            echo 'The jenkins-build repository run is finished.'
        }
        changed {
            echo 'There is a different result than the last run.'
        }
        fixed {
            echo 'This run fixed problems from the previous one.'
        }
        regression {
            echo 'This run had regressions from the previous run.'
        }
        aborted {
            echo 'This run was aborted.'
        }
        failure {
            echo 'This run failed.'
        }
        success {
            echo 'This run succeeded.'
        }
    }
}

So far, so good. Jenkins is notified, the agent is started up on the container and it begins with the pipeline.

However, when it gets to the point of running the image specified in the agent, this error happens (copied from Jenkins log):

[Pipeline] withDockerContainer
jenkins_base_1 seems to be running inside container 444496341bc30a935dd0e9003f372b3f6ece84913aded876766920603db75ca2
$ docker run -t -d -u 1000:1000 -w /home/jenkins/workspace/jenkins-builds_main --volumes-from 444496341bc30a935dd0e9003f372b3f6ece84913aded876766920603db75ca2 -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** lalalala.dkr.ecr.us-east-2.amazonaws.com/base:latest cat
$ docker top 7265e124d152e8a0612421ca94d562bcecd913ca904480f12d88d071f2788553 -eo pid,comm
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // withDockerRegistry
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] stage
[Pipeline] { (Declarative: Post Actions)
[Pipeline] echo
The jenkins-build repository run is finished.
[Pipeline] echo
This run failed.
[Pipeline] }
[Pipeline] // stage
[Pipeline] End of Pipeline

GitHub has been notified of this commit’s build result

java.io.IOException: Failed to run top ‘7265e124d152e8a0612421ca94d562bcecd913ca904480f12d88d071f2788553’. Error: Error response from daemon: Container 7265e124d152e8a0612421ca94d562bcecd913ca904480f12d88d071f2788553 is not running

I tried executing that docker run command manually from within the container - albeit without all those -e ****** as I have no clue what those even are supposed to do.
The command succeeds and returns the ID of the container so I assume it started successfully.
But when I then run docker container ls, there is no container with that ID, which would explain the error from the Jenkins log.

But… what is actually the problem here? Why does the container just disappear? The image has an entrypoint so it shouldn’t just quit.
It is in fact the very same image that the container running the command uses - the idea being that the container should be able to build & push newer versions of itself via Jenkins.

It turns out that the reason for this behavior was a line in the entrypoint script.
That line caused an error, which made the script exit before it reached the final command that would have kept the container running indefinitely.

And since the script quit, the container exited, so there was no running container left for docker top to inspect.
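To illustrate the failure mode, here is a minimal sketch that simulates such an entrypoint without needing Docker at all. The script contents are hypothetical: I am assuming an entrypoint that uses set -e and ends with exec "$@", and the false line is only a stand-in for whatever command actually failed. Your real entrypoint will look different.

```shell
#!/bin/sh
# Hypothetical entrypoint, written to a temp file so the sketch runs without Docker.
entrypoint="$(mktemp)"
cat > "$entrypoint" <<'EOF'
#!/bin/sh
set -e                      # assumption: abort on the first failing command
echo "preparing environment"
false                       # stand-in for the line that errored
echo "starting long-running command"
exec "$@"                   # never reached, so nothing keeps the container alive
EOF

# Run it roughly the way Docker would: entrypoint first, the image command ("cat") as arguments.
status=0
output="$(sh "$entrypoint" cat </dev/null 2>&1)" || status=$?
rm -f "$entrypoint"

echo "$output"
echo "entrypoint exit status: $status"
```

The script prints only the first message and exits with a non-zero status. Inside a container that means the main process is gone almost immediately, even though docker run -d has already printed a container ID.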

So if anyone has a similar problem:
Open a terminal in the container that is trying to run the image and manually execute the docker run command in the foreground (without the -d). This shows the output of your entrypoint and may reveal what is going wrong.
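The debugging steps can be sketched as two small shell helpers. The function names and the IMAGE variable are made up for illustration, and the image name is just the placeholder from the question. Note that docker container ls (like docker ps) lists only running containers by default, which is why an exited container seems to vanish; adding -a shows stopped ones as well.

```shell
#!/bin/sh
# Sketch only: IMAGE and both function names are placeholders, adjust them to your setup.
IMAGE="${IMAGE:-lalalala.dkr.ecr.us-east-2.amazonaws.com/base:latest}"

debug_entrypoint() {
    # Foreground run (no -d): entrypoint output and errors show up in the terminal.
    # If the entrypoint works, cat keeps waiting for input; Ctrl-D ends it.
    docker run --rm -it "$IMAGE" cat

    # List stopped containers too, since the default listing hides exited ones.
    docker ps -a --filter "ancestor=$IMAGE"
}

inspect_exited() {
    # $1: the container ID that "docker run -d" printed
    docker logs "$1"                                     # what the entrypoint wrote before exiting
    docker inspect --format '{{.State.ExitCode}}' "$1"   # exit code of the main process
}
```

After sourcing this, run debug_entrypoint to watch the entrypoint directly, or inspect_exited followed by a container ID to look at a container that Jenkins started and that has already stopped. This assumes the container was not started with --rm, otherwise it is removed on exit and there is nothing left to inspect.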