Currently I have two containers. The main container is an R Shiny application that sends a POST request to the second container, a Python Flask API. The Flask container loads some inputs, performs image preprocessing, runs an .onnx image classification model and writes a predictions .csv to a mounted volume folder.
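For context, the Flask side looks roughly like this. This is a simplified sketch, not my exact code: the endpoint name, paths, image size and payload field are placeholders.

```python
import csv
import glob

import numpy as np
import onnxruntime as ort
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)
# The .onnx model lives on the mounted volume (placeholder path)
session = ort.InferenceSession("/models/classifier.onnx")
INPUT_NAME = session.get_inputs()[0].name


def preprocess(path, size=(224, 224)):
    """Turn one image file into the model's NCHW float32 input (illustrative)."""
    img = Image.open(path).convert("RGB").resize(size)
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return np.transpose(arr, (2, 0, 1))[np.newaxis, ...]


@app.route("/predict", methods=["POST"])
def predict():
    # The Shiny container POSTs the folder of images to classify
    image_dir = request.json["image_dir"]
    n = 0
    # Predictions are written row by row to the shared volume
    with open("/output/predictions.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for path in glob.iglob(f"{image_dir}/*.jpg"):
            scores = session.run(None, {INPUT_NAME: preprocess(path)})[0]
            writer.writerow([path, int(np.argmax(scores))])
            n += 1
    return jsonify({"status": "done", "n_images": n})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```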
When running the process on ~2,000 images, it works fine and completes in under a minute. When running it on ~130,000 images, the container exits with code 247, indicating a memory issue in Docker. This happens during the image preprocessing stage of the process.
What I’ve done so far:
Increased the Docker Desktop memory limit to 14GB
Revamped the image preprocessing implementation multiple times, trying batching (with different batch sizes), generators, and mmap for accessing images.
Currently I'm just using glob to create a generator over the image file paths and then feeding those paths into the preprocessing function, roughly as in the sketch below.
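This is a simplified sketch of the current batching approach; the batch size, image size and function names are illustrative rather than my exact code.

```python
import glob
import itertools

import numpy as np
from PIL import Image


def image_paths(image_dir):
    # glob.iglob returns a lazy iterator, so the ~130k paths are never held as one big list
    return glob.iglob(f"{image_dir}/*.jpg")


def preprocess_batch(paths, size=(224, 224)):
    """Preprocess one small batch of images into a single NCHW float32 array."""
    batch = np.empty((len(paths), 3, *size), dtype=np.float32)
    for i, path in enumerate(paths):
        img = Image.open(path).convert("RGB").resize(size)
        batch[i] = np.transpose(np.asarray(img, dtype=np.float32) / 255.0, (2, 0, 1))
    return batch


def batches(iterable, batch_size=64):
    # Pull a fixed number of paths at a time from the generator
    it = iter(iterable)
    while chunk := list(itertools.islice(it, batch_size)):
        yield chunk


# Usage: in principle only one batch of decoded images should be in memory at a time
# for paths in batches(image_paths("/data/images"), batch_size=64):
#     inputs = preprocess_batch(paths)
#     ... run the ONNX session on `inputs`, write the predictions, discard the arrays
```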
When I run the process on the large image folder (~130,000 images), docker stats shows the container sitting at around 12GB, which then drops to around 0GB before it exits with code 247.
Are there any major changes I can try implementing that might help with memory usage? Docker settings, Python code, etc.
Any help would be appreciated, thanks.
Docker Desktop version: 4.25.2, Docker image Python version: 3.8, PC: macOS Sonoma 14.1.1