Docker Community Forums

Is it possible to run a Python package with multiple .py scripts in docker-compose.yml?

Summarize the problem:

The Python package opens the PDFs in a batch folder, reads the first page of each PDF, matches keywords, and places the matching PDFs in a source folder for the OCR scripts to pick up. The first script to receive all the PDFs is MainBankClass.py. I am trying to use a docker-compose file to put all these Python scripts on the same network and volume, so that each OCR script starts scanning bank statements once the pre-processing is done. This link is the closest I have found so far, but it seems that I missed some parts of it. The OCR scripts are invoked with runpy.run_path(path_name='ChaseOCR.py'), so these scripts sit in the same directory as __init__.py. Here is the filesystem structure:

BankStatements
 ┣ BankofAmericaOCR
 ┃ ┣ BancAmericaOCR.py
 ┃ ┗ Dockerfile.bankofamerica
 ┣ ChaseBankStatementOCR
 ┃ ┣ ChaseOCR.py
 ┃ ┗ Dockerfile.chase
 ┣ WellsFargoStatementOCR
 ┃ ┣ Dockerfile.wellsfargo
 ┃ ┗ WellsFargoOCR.py
 ┣ BancAmericaOCR.py
 ┣ ChaseOCR.py
 ┣ Dockerfile
 ┣ WellsFargoOCR.py
 ┣ __init__.py
 ┗ docker-compose.yml

What I’ve tried so far:

In docker-compose.yml:

version: '3'

services:
    mainbankclass_container:
        build: 
            context: '.'
            dockerfile: Dockerfile
        volumes: 
            - /Users:/Users
        #links:
        #    - "chase_container"
        #    - "wellsfargo_container"
        #    - "bankofamerica_container"
    chase_container:
        build: .
        working_dir: /app/ChaseBankStatementOCR
        command: ./ChaseOCR.py
        volumes: 
            - /Users:/Users
    bankofamerica_container:
        build: .
        working_dir: /app/BankofAmericaOCR
        command: ./BancAmericaOCR.py
        volumes: 
            - /Users:/Users
    wellsfargo_container:
        build: .
        working_dir: /app/WellsFargoStatementOCR
        command: ./WellsFargoOCR.py
        volumes: 
            - /Users:/Users

Each Dockerfile under a bank folder is similar, except that CMD changes accordingly. For example, in the ChaseBankStatementOCR folder:

FROM python:3.7-stretch
WORKDIR /app
COPY . /app
# CMD changes for the other two bank scripts
CMD ["python3", "ChaseOCR.py"]

The last piece is the Dockerfile outside the bank folders:

FROM python:3.7-stretch
WORKDIR /app
COPY ./requirements.txt ./ 
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
RUN pip3 install --upgrade PyMuPDF

COPY . /app

COPY ./ChaseOCR.py /app
COPY ./BancAmericaOCR.py /app
COPY ./WellsFargoOCR.py /app

EXPOSE 8080

CMD ["python3", "MainBankClass.py"]

After running docker-compose build, the containers and network are built successfully. The error occurs when I run docker run -v /Users:/Users: python3 python3 ~/BankStatementsDemoOCR/BankStatements/MainBankClass.py, and the error message is FileNotFoundError: [Errno 2] No such file or directory: 'BancAmericaOCR.py'

I assume that the container doesn't have BancAmericaOCR.py, but I have composed every .py file under the same network, and I don't think links is good practice since Docker recommends using networks here. What am I missing? Any help is much appreciated. Thanks in advance.
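
One possible cause, sketched under the assumption that MainBankClass.py passes a bare file name to runpy.run_path as described above: a relative path is resolved against the process's current working directory, not against the folder that holds the calling script. The helper name run_sibling_script below is hypothetical and not part of the original package; it builds an absolute path from a given base directory before handing it to runpy.

```python
import runpy
from pathlib import Path


def run_sibling_script(script_name, base_dir):
    """Run script_name from base_dir, independent of the current working directory.

    Hypothetical helper for illustration; not part of the original package.
    """
    script_path = Path(base_dir).resolve() / script_name
    if not script_path.is_file():
        # Report the full path so a missing file is easy to spot inside the container.
        raise FileNotFoundError(f"expected OCR script at {script_path}")
    # run_path executes the file and returns the resulting module globals.
    return runpy.run_path(str(script_path), run_name="__main__")


# Assumed usage inside MainBankClass.py, with the OCR scripts next to it:
# run_sibling_script("ChaseOCR.py", Path(__file__).resolve().parent)
```

With this shape, the lookup no longer depends on which directory docker run or WORKDIR happens to start the process in, although the script file still has to exist inside the container's filesystem.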

Sorry, I accidentally edited the response instead of creating a new one.

Found at least parts of the original post in my email notification:
Your objective is unclear to me.

Why several images? Why don’t you implement a strategy pattern based on the matched keywords and use the matching strategy to process an incoming PDF? Anyway, if you declare a CMD in your Dockerfile every argument in a docker run command that follows the image:ta…

Thanks for your time and response. Basically, what I want to do here is start MainBankClass.py in Docker together with the other .py files. In MainBankClass.py, once PDFs come in, the program opens the first page, classifies which bank it belongs to, and kicks off ChaseOCR.py if it is a Chase statement. Is it possible to package the entire process into Docker and run it the same way as locally?

People run whole enterprise grade python applications in containers…

You are aware that a Docker container is nothing more than an isolated process (which can potentially create subprocesses if needed) on your host’s kernel, aren’t you? A container (= such an isolated process) can only see whatever namespaces (for simplicity, think of partitions in the kernel) it is allowed to use (Docker handles this for your container), consume whatever resources (RAM/CPU/IO) it is allowed to use (if you set resource limits on your containers), and use whatever capabilities for privileged tasks it is allowed to use, along with some fancy network magic… So basically this is the well-guarded sibling of a local process that thinks it owns the world it is allowed to see.

Yes. I understand the concept of using containers, and I think you are pointing me in the right direction since you’ve mentioned networks in Docker. This is what I want to achieve, but it seems the Python scripts don’t communicate with each other properly in the docker-compose file. Are there any issues you could point out in what we have now to make it work? The purpose of packaging this locally running app in Docker is to use Azure Service Bus at a later production stage. We are just in development and trying to incorporate Docker.

This doesn’t really convince me that you understand the concept. You separated your logic into distinct containers. Thus each container can access whatever is in the container (files/processes) and files located on volumes (if used), and can communicate with remote systems (or other containers) via API calls (REST endpoints?) or whatever network transport mechanism you choose.

Just out of curiosity: shouldn’t your (solution) architect know how to properly design and implement this sort of thing?

What are you suggesting to do? I am all ears for your suggestions. Instead of separating different containers, should I put the entire program into 1 container under the same network?

FYI - we are just embracing DevOps under new leadership thus people here are wearing many hats.

I see, someone is eagerly practicing best practices from “how to successfully fail a project” ^^

What I recommended in my first post is what I would do: single application in a single container implementing a strategy pattern.
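
A minimal sketch of what such a strategy pattern could look like in a single application. The keywords, function names, and return values are all assumptions for illustration; the real per-bank logic would come from the existing OCR scripts.

```python
from typing import Callable, Dict, Optional


# Placeholder strategies; each stands in for the logic of one existing OCR script.
def process_chase(first_page_text: str) -> str:
    return "chase"


def process_bank_of_america(first_page_text: str) -> str:
    return "bankofamerica"


def process_wells_fargo(first_page_text: str) -> str:
    return "wellsfargo"


# Assumed keyword -> strategy table; the real keywords are not given in the thread.
STRATEGIES: Dict[str, Callable[[str], str]] = {
    "chase": process_chase,
    "bank of america": process_bank_of_america,
    "wells fargo": process_wells_fargo,
}


def select_strategy(first_page_text: str) -> Optional[Callable[[str], str]]:
    """Return the first strategy whose keyword occurs in the text, or None."""
    lowered = first_page_text.lower()
    # Naive substring matching: good enough for a sketch, too loose for production.
    for keyword, strategy in STRATEGIES.items():
        if keyword in lowered:
            return strategy
    return None


def process_statement(first_page_text: str) -> str:
    """Classify a statement by its first page and run the matching strategy."""
    strategy = select_strategy(first_page_text)
    if strategy is None:
        raise ValueError("no OCR strategy matches this statement")
    return strategy(first_page_text)
```

Because everything runs in one process, one image and one container are enough, and no script has to locate another script on disk.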

Generally, I would highly recommend taking this self-paced training to get a better understanding of Docker. It provides better information than any 3-day on-site Docker training I have participated in (I had two).

Good luck on your journey!

You could say that :) Anyway, thanks for your valuable input, and I will take a look at the site for sure.

Steps

  1.   Installing Docker
    

The first thing you need to do is install Docker. I am working on an Ubuntu system, and you can find instructions on how to install Docker on Ubuntu here.
Once you have Docker installed, you can continue with the following steps.

  2.   Installing Docker Compose
    

Now we need to install Docker Compose. To do so, run the following command to download the binary:

sudo curl -L "https://github.com/docker/compose/releases/download/1.23.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose

Now we need to apply executable permissions to the binary so that we can run Compose:

sudo chmod +x /usr/local/bin/docker-compose

Check the installation by running the following command:

docker-compose --version

If the command prints a docker-compose version string, you are good to go.