Before I explain my problem, I should mention that I am new to Docker and have only recently started working with it, so please bear with me if I sound a bit naive.
I currently have a Python script running inside a Docker container. The script queries data from a DB, loads the results into a pandas DataFrame, and then writes a CSV file to a folder on an NFS mount. The NFS share is mounted at runtime.
fc8tdtsr@fc8tdbitmapconvs08]$ uname -r
fc8tdtsr@fc8tdbitmapconvs08]$ docker --version
Docker version 18.03.1-ce, build 9ee9f40
fc8tdtsr@fc8tdbitmapconvs08]$ docker info
Server Version: 18.03.1-ce
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Kernel Version: 3.10.0-957.5.1.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.6 (Maipo)
Total Memory: 1.968TiB
Docker Root Dir: /local/docker-data-root
Debug Mode (client): false
Debug Mode (server): false
HTTPS Proxy: uswwwp1.gfoundries.com:74
Live Restore Enabled: false
Docker Command that I run:
docker run -d --restart=always --volume-driver=nfs -v /td-bmp:/td-bmp:rw \
  --memory=128g --memory-reservation=32g --cpu-shares=28 \
  --name=worker-1 daas-worker:latest --spark-name daas1
Generating the CSV takes much longer when I run the script inside the Docker container than when I run it directly on the host.
For example, to generate a 5 GB CSV file, the host takes 30 minutes on average (including querying the DB and writing the CSV file), whereas the same job inside the container takes almost 1.5 hours. That is an hour longer than on the host.
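To narrow down which stage is slower in the container, I could time the data fetch and the CSV write separately and run the same script on the host and in the container. Below is a minimal sketch, not my actual script: `fetch_rows` is a hypothetical stand-in for the DB query that just generates synthetic rows, `timed_csv_write` is a hypothetical helper, and it uses the standard-library `csv` module instead of pandas so that it runs on its own. The output directory is an assumption too; pointing it at the NFS mount and then at a local directory would show whether the write to NFS is the slow part.

```python
import csv
import os
import tempfile
import time


def fetch_rows(n_rows):
    """Hypothetical stand-in for the DB query: returns synthetic rows."""
    return [(i, f"name-{i}", i * 0.5) for i in range(n_rows)]


def timed_csv_write(rows, out_dir, filename="timing_test.csv"):
    """Write rows as CSV into out_dir; return (path, seconds, size_in_bytes)."""
    path = os.path.join(out_dir, filename)
    start = time.perf_counter()
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["id", "name", "value"])
        writer.writerows(rows)
        fh.flush()
        # Ask the OS to push the data to storage so the timing covers
        # the actual write and not only the page cache.
        os.fsync(fh.fileno())
    elapsed = time.perf_counter() - start
    return path, elapsed, os.path.getsize(path)


if __name__ == "__main__":
    # Time the "query" stage on its own.
    t0 = time.perf_counter()
    rows = fetch_rows(100_000)
    print(f"fetch: {time.perf_counter() - t0:.2f}s")

    # Assumption: add the NFS mount path to this tuple to compare it
    # against a local directory.
    for target in (tempfile.gettempdir(),):
        path, seconds, size = timed_csv_write(rows, target)
        print(f"write to {target}: {size} bytes in {seconds:.2f}s")
        os.remove(path)
```

If the fetch time is similar in both environments but the write time is not, that would point at the NFS volume setup rather than at CPU or memory limits.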
From what I understand, the difference shouldn't be that large. I understand there will be some trade-offs, but an extra hour seems excessive. Am I doing something wrong?
Please do let me know if you need anything else from me.