Docker Community Forums

ADD unpacking and RUN wget from Google Drive URL


(Magrossi) #1

Hi,
I’m trying to configure my Dockerfile to fetch two tar.gz files stored in a Google Drive account, using a https://googledrive.com/host/<share_key> URL. I want to download each file and unpack it into a folder in my image so the files are available when my container runs.
I have tried to do this using both RUN wget and ADD. With RUN wget && unzip it works at the docker-compose build stage (the output says the file was extracted fine, with no errors), but when I start my container with docker-compose up the files are not there!
After I read that ADD would automagically unpack my file, I tried replacing my wget/unzip logic with the ADD command. Now I see the file when I run my container, but it is not unpacked! My guess is that this is because it is a Google Drive file, and for some reason the ADD command downloads the file under the share_key name (no extension).
EX: ADD https://googledrive.com/host/share_key /my/output/dir
will create the /my/output/dir/share_key file (without any extension). Even if I hardcode the file name with .tar.gz it will not be unpacked. I guess it uses the source name, not the destination name, to decide whether the file should be unpacked?

What am I missing here? I’ve looked at several Dockerfiles, and at least the wget approach should’ve worked…

Any help greatly appreciated!

Thanks.


(Muellermich) #2

Hi,

I haven’t tried downloading from Google Drive in a Dockerfile, but I have done it on Linux multiple times.
You have to create a public directory and access your files by relative reference, with something like:
wget https://googledrive.com/host/FOLDER/file.tar.gz
Alternatively, try the Linux command-line tool gdrive.

Michael


(Magrossi) #3

Hi Michael,
Thank you for your suggestion. Apparently this works for everyone except for me. The files are there but not unpacked.

ADD https://googledrive.com/host/$MY_DRIVE_ID/my_file.tar.gz $WORKDIR/

The my_file.tar.gz appears in my $WORKDIR untouched…

Sigh


(Muellermich) #4

What if you just add a RUN instruction afterwards that unpacks the file?
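
For readers following along: as far as the Dockerfile reference describes it, ADD only auto-extracts local tar archives, and files fetched from a remote URL are copied as-is, which matches what is reported above. A minimal sketch of the "download, then unpack in a separate step" idea is below. It assumes curl and tar are available in the image; the function names fetch_and_unpack and unpack_archive are hypothetical helpers made up for illustration, not anything from this thread or from Docker.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative only).

# Unpack a gzip-compressed tarball into a destination directory,
# creating the directory first if it does not exist.
unpack_archive() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Download a tarball to a temporary file, unpack it into the
# destination directory, then delete the downloaded archive.
# Assumes curl is installed; -f makes HTTP errors fail the command.
fetch_and_unpack() {
    url="$1"
    dest="$2"
    tmp="$(mktemp)"
    curl -fSL "$url" -o "$tmp" && unpack_archive "$tmp" "$dest"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example call, using the URL pattern from this thread:
# fetch_and_unpack "https://googledrive.com/host/$MY_DRIVE_ID/my_file.tar.gz" /usr/src/app
```

In a Dockerfile, the equivalent commands would sit in a single RUN instruction chained with &&, so that the archive is downloaded, extracted, and removed within one layer.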


(Magrossi) #5

Tried that and many many other alternatives. :frowning:


(Muellermich) #6

Could you please post your Dockerfile? Otherwise it’s hard to guess.


(Magrossi) #7

Hi Michael,

Here are the contents of my Dockerfile. Thanks a lot for taking an interest :smile:

FROM buildpack-deps:jessie

# Linux deps

RUN set -ex \
    && apt-get update \
    && apt-get install -y postgresql-client libblas-dev liblapack-dev libatlas-base-dev gfortran \
    && apt-get purge -y python.*

# Python deps

ENV LANG C.UTF-8
ENV GPG_KEY C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF
ENV PYTHON_VERSION 2.7.11
ENV PYTHON_PIP_VERSION 8.1.1

RUN set -ex \
    && curl -fSL "https://www.python.org/ftp/python/${PYTHON_VERSION%%[a-z]*}/Python-$PYTHON_VERSION.tar.xz" -o python.tar.xz \
    && curl -fSL "https://www.python.org/ftp/python/${PYTHON_VERSION%%[a-z]*}/Python-$PYTHON_VERSION.tar.xz.asc" -o python.tar.xz.asc \
    && export GNUPGHOME="$(mktemp -d)" \
    && gpg --keyserver ha.pool.sks-keyservers.net --recv-keys "$GPG_KEY" \
    && gpg --batch --verify python.tar.xz.asc python.tar.xz \
    && rm -r "$GNUPGHOME" python.tar.xz.asc \
    && mkdir -p /usr/src/python \
    && tar -xJC /usr/src/python --strip-components=1 -f python.tar.xz \
    && rm python.tar.xz \
    \
    && cd /usr/src/python \
    && ./configure --enable-shared --enable-unicode=ucs4 \
    && make -j$(nproc) \
    && make install \
    && ldconfig \
    && curl -fSL 'https://bootstrap.pypa.io/get-pip.py' | python2 \
    && pip install --no-cache-dir --upgrade pip==$PYTHON_PIP_VERSION \
    && find /usr/local \
        \( -type d -a -name test -o -name tests \) \
        -o \( -type f -a -name '*.pyc' -o -name '*.pyo' \) \
        -exec rm -rf '{}' + \
    && rm -rf /usr/src/python

RUN pip install --no-cache-dir virtualenv

# App

ENV WORKDIR /usr/src/app
ENV IMG_BASE_DIR $WORKDIR/static/images

RUN mkdir -p $WORKDIR
WORKDIR $WORKDIR
VOLUME $WORKDIR

COPY requirements.txt $WORKDIR
RUN pip install --no-cache-dir -r requirements.txt
COPY . $WORKDIR

# Fetch datasets from Google Drive

ENV FACESCRUB_DRIVE_ID 0Bzt4aP7vYnOTYXpodUFmMlY4WVU
ADD https://googledrive.com/host/$FACESCRUB_DRIVE_ID/facescrub_dataset.tar.gz $WORKDIR/

# This is a debug/dev dataset, it can be removed once the IMG_BASE_DIR is in S3

ADD https://googledrive.com/host/$FACESCRUB_DRIVE_ID/seed_images.tar.gz $IMG_BASE_DIR/

CMD /usr/local/bin/gunicorn web.wsgi:application -w 2 -b :8000 --reload

Cheers!

Marcelo