How to make Python subprocess.run work in a Docker container?

Docker version 19.03.2, build 6a30dfc
Windows 10
This works locally but not in a Docker container. I’m trying to run pdftotext in a Docker container and then unit-test it on a PDF file. I’m not sure whether I’m misunderstanding the arguments required for subprocess.run. Is it failing to find the directory of the PDF file, is the subprocess call itself not working, or could it be an issue with Docker?

I run docker-compose up myproject python3 -m unittest

My file structure:

myproject
├── extract.py
└── tests
    ├── testExtract.py
    └── testfiles
        └── sample.pdf

pdftotext method:

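import subprocess
from subprocess import PIPE, STDOUT

def extract(filepath):
    # Run pdftotext; the trailing '-' sends the extracted text to stdout
    result = subprocess.run(['pdftotext', filepath, '-'],
                            stdout=PIPE,
                            stderr=STDOUT)
    # str() on bytes yields the "b'...'" form the test compares against
    text = str(result.stdout)
    return text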

Test method inside testExtract.py:

def testGetText(self):
    expected = "b'Grab all text from this sentence.'"
    result = extract('./testfiles/sample.pdf')
    self.assertEqual(result, expected)

sample.pdf only contains the above sentence.

When I set stderr=STDOUT I get the I/O error shown below. If I set stderr=subprocess.PIPE instead, I just get an empty binary string.
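A minimal sketch of why the two stderr settings would behave differently. It uses a stand-in child process rather than pdftotext (an assumption, so it runs without poppler installed); the CHILD command and the function names are hypothetical. The idea is that a tool which writes its error message only to stderr leaves stdout empty unless stderr is redirected into it:

```python
import subprocess
import sys

# Hypothetical stand-in for pdftotext failing: writes a message to stderr only.
CHILD = [sys.executable, "-c", "import sys; sys.stderr.write('I/O Error: demo')"]

def run_merged():
    # stderr=STDOUT folds the error message into result.stdout
    return subprocess.run(CHILD, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

def run_separate():
    # stderr=PIPE keeps the error message in result.stderr; result.stdout stays empty
    return subprocess.run(CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

merged = run_merged()
separate = run_separate()
print("merged stdout:  ", merged.stdout)
print("separate stdout:", separate.stdout)
print("separate stderr:", separate.stderr)
```

If this matches what pdftotext is doing, the empty string with stderr=subprocess.PIPE would not mean the call succeeded; checking result.stderr and result.returncode should show whether the file was actually opened.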

One issue could be that . refers to a different directory than I expect. I tried ./tests/testFiles/1SentenceFile.pdf instead, but that returns an empty binary string (no I/O error).
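One way to take the working directory out of the picture is to build the path from the location of the test file instead of from `.`. This is only a sketch: the helper name resolve_test_file and the /app/... path are hypothetical, and in testExtract.py the second argument would be `__file__`.

```python
import os
from pathlib import Path

def resolve_test_file(relative_path, base_file):
    """Return an absolute path for relative_path, anchored at base_file's directory."""
    return Path(base_file).resolve().parent / relative_path

# Printing the working directory shows what '.' actually refers to at run time.
print("cwd:", os.getcwd())

# Hypothetical location of the test module; inside testExtract.py pass __file__ instead.
pdf_path = resolve_test_file("testfiles/sample.pdf", "/app/myproject/tests/testExtract.py")
print(pdf_path, "exists:", pdf_path.exists())
```

Path lookups inside a Linux container are case-sensitive, so testfiles and testFiles would be two different directories there even though Windows treats them as the same by default.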

traceback:

FAIL: testGetText 
----------------------------------------------------------------------
Traceback (most recent call last):

AssertionError: 'b"I/O Error: Couldn\'t open file \'./tes[51 chars]ry."' != "b'Grab all text from this sentence.'"
- b"I/O Error: Couldn't open file './testFiles/1SentenceFile.pdf': No such file or directory."
+ b'Grab all text from this sentence.'