Dockerizing Multiple Interdependent Python Scripts

How should I dockerize interdependent Python scripts?
Say a function in script 1 calls script 2 and script 3 along the way and then produces an output.

Even though I have copied all the scripts into the Docker image, it seems to search for script 2 and script 3 outside the container.

Is there an example for this particular problem?

What is it doing exactly? Any process running inside a container will only be able to see things inside the container.

Do you have an example GitHub repo that someone could clone, and then run the same commands you are running, to reproduce your problem?

https://github.com/bharath-cchmc/edited-Super-Enhancer

This is the link to my GitHub repo.

The process is: first I call the Python script ROSE_main (the main program). This program calls another file, ROSE_bamToGFF.

I packaged all these scripts into a Docker image, published at https://hub.docker.com/r/bharath90/superenhancer.

Testing this image produces an error saying the file ROSE_bamToGFF is not found. I guess that is because Docker is searching for the file outside the container.

I’m not able to reproduce this error:

$ docker run --rm -it bharath90/superenhancer python ROSE_main.py
hi there
Usage: ROSE_main.py [options] -g [INPUT_GENOME] -i [INPUT_REGION_GFF] -r [RANKBY_BAM_FILE] -o [OUTPUT_FOLDER] [OPTIONAL_FLAGS]

Options:
  -h, --help            show this help message and exit
  -i INPUT, --i=INPUT   Enter a .gff or .bed file of binding sites used to
                        make enhancers
  -r RANKBY, --rankby=RANKBY
                        bamfile to rank enhancer by
  -o OUT, --out=OUT     Enter an output folder
  -g GENOME, --genome=GENOME
                        Reference genome file: example- hg18_refseq.ucsc
  -b BAMS, --bams=BAMS  Enter a comma separated list of additional bam files
                        to map to
  -c CONTROL, --control=CONTROL
                        bamfile to rank enhancer by
  -s STITCH, --stitch=STITCH
                        Enter a max linking distance for stitching
  -t TSS, --tss=TSS     Enter a distance from TSS to exclude. 0 = no TSS
                        exclusion

The Python program is certainly not reaching outside the container. Normally you can limit your scope to what's in the container and treat an application error just as you would if the program were running in any other non-Docker environment.
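One way to debug it like any other application error is to ask Python where it is actually looking. The following is only a minimal diagnostic sketch: `describe_lookup` is a hypothetical helper name, and `ROSE_bamToGFF.py` is simply the file name mentioned in this thread. Run inside the container, it reports the current working directory, whether the bare file name resolves against that directory, and whether it is found on `PATH`.

```python
import os
import shutil


def describe_lookup(name):
    """Report where a bare file name would (or would not) be found."""
    return {
        # Relative paths are resolved against this directory.
        "cwd": os.getcwd(),
        # True only if the file sits in (or relative to) the cwd.
        "exists_relative_to_cwd": os.path.exists(name),
        # Path of an executable with this name on PATH, or None.
        "found_on_path": shutil.which(name),
    }


if __name__ == "__main__":
    # File name taken from this thread; substitute the script that is reported missing.
    for key, value in describe_lookup("ROSE_bamToGFF.py").items():
        print(key, "=", value)
```

If the working directory printed here is not the directory the scripts were copied into, a relative path such as `ROSE_bamToGFF.py` will not resolve, even though the file is inside the container.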

That is the message printed when the script stops early. When I provide the correct arguments, it runs until it reaches the point where it calls ROSE_bamToGFF.py and then fails with a 'file not found' error. If you could provide an example with a similar scenario, it would be helpful for me.

Right. What are the correct arguments?

A file not found could simply mean that the script is looking in the wrong directory.
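To illustrate that failure mode, here is a minimal, self-contained sketch. It does not use the ROSE code: `main.py` and `helper.py` are hypothetical stand-ins written to a temporary directory, and the main script is then started from a different working directory, which is roughly what can happen when a workflow runner launches the container in its own directory. The first variant refers to the helper by a bare relative name; the second builds the path from the location of the main script itself.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical helper script, standing in for a script the main program calls.
HELPER = 'print("helper ran")\n'

# Fragile: "helper.py" is resolved against the current working directory.
MAIN_RELATIVE = textwrap.dedent('''
    import subprocess, sys
    sys.exit(subprocess.call([sys.executable, "helper.py"]))
''')

# More robust: build the path from the directory that contains main.py.
MAIN_ABSOLUTE = textwrap.dedent('''
    import os, subprocess, sys
    here = os.path.dirname(os.path.abspath(__file__))
    sys.exit(subprocess.call([sys.executable, os.path.join(here, "helper.py")]))
''')


def run_from_other_dir(main_source):
    """Write main.py and helper.py into one directory, then run main.py from another."""
    with tempfile.TemporaryDirectory() as app_dir, \
            tempfile.TemporaryDirectory() as work_dir:
        with open(os.path.join(app_dir, "helper.py"), "w") as f:
            f.write(HELPER)
        main_path = os.path.join(app_dir, "main.py")
        with open(main_path, "w") as f:
            f.write(main_source)
        result = subprocess.run(
            [sys.executable, main_path],
            cwd=work_dir,  # deliberately not the directory holding the scripts
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    print("relative:", run_from_other_dir(MAIN_RELATIVE))
    print("absolute:", run_from_other_dir(MAIN_ABSOLUTE))
```

The relative variant exits with a non-zero status because `helper.py` does not exist in the working directory, while the variant based on `__file__` finds the helper regardless of where it is started from. Setting `WORKDIR` in the Dockerfile to the directory holding the scripts is another option, provided nothing overrides the working directory at run time.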

http://younglab.wi.mit.edu/super_enhancer_code.html

The details and the example data that can be downloaded to verify the script are here.

That looks like general usage information. I’m specifically interested in the inputs that you are providing so I can try to reproduce the exact same error you are getting. What command are you running?


I am actually running Docker from within another language, Common Workflow Language. So it is possible you will tell me that the error comes from that tool.

Could you please provide some example code that I can look at and learn from?

For example, a script calling another script inside the Docker container? I apologize if I am being annoying.

Sure:

one.sh:

#!/bin/sh
echo one
/bin/sh ./two.sh

two.sh:

#!/bin/sh
echo two

Dockerfile:

FROM alpine
# Set the working directory first so the relative paths below resolve against /
WORKDIR /
COPY one.sh two.sh /
RUN chmod +x one.sh two.sh
CMD ["/one.sh"]

Build it this way:

docker build -t onetwo .

and run it this way:

docker run --rm -it onetwo

and the output should be:

one
two

This part helped me.
I had to list all the files here:

RUN chmod +x main program subscript1 subscript2 etc…

Thanks a lot for your help
