ADD unpacking and RUN wget from Google Drive URL

Hi,
I’m trying to configure my Dockerfile to fetch two tar.gz files located in a Google Drive account from the https://googledrive.com/host/<share_key> URL. I want to grab each file and unpack it to a folder in my image so the files are available when my container runs.
I have tried to do this using both RUN wget and ADD. With RUN wget && unzip, the docker-compose build stage works: it reports that the file was extracted, with no errors. But when I start my container with docker-compose up, the files are not there!
After I read that ADD would automagically unpack my file, I tried replacing my wget/unzip logic with the ADD command. Now I see the file when I run my container, but it is not unpacked! My guess is that this is because it is a Google Drive file and, for some reason, the ADD command downloads the file under the share_key name (no extension).
Example: ADD https://googledrive.com/host/share_key /my/output/dir
will create the file /my/output/dir/share_key (without any extension). Even if I hardcode the file name with .tar.gz, it is not unpacked. I guess it uses the source name, not the destination name, to decide whether to unpack?

What am I missing here? I’ve looked at several Dockerfiles, and at least the wget approach should’ve worked…

Any help greatly appreciated!

Thanks.

Hi,

I haven’t tried downloading from Google Drive in a Dockerfile, but I have done it on plain Linux multiple times.
You have to create a public directory and access your files by relative reference, with something like:
wget https://googledrive.com/host/FOLDER/file.tar.gz
Or try the Linux command-line tool gdrive.

Michael

Hi Michael,
Thank you for your suggestion. Apparently this works for everyone except for me. The files are there but not unpacked.

ADD https://googledrive.com/host/$MY_DRIVE_ID/my_file.tar.gz $WORKDIR/

The my_file.tar.gz appears in my $WORKDIR untouched…

Sigh

What if you just add a RUN step afterwards that unpacks the file?
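
For reference, the Dockerfile documentation says ADD only auto-extracts local tar archives; a file fetched from a remote URL is copied as-is, whatever its name. A minimal sketch of the "ADD, then unpack in a RUN step" idea follows. The paths /tmp/dataset.tar.gz and /opt/data are hypothetical, picked so they do not fall under any directory declared with VOLUME, and the URL is the placeholder from the first post.

```dockerfile
# Sketch only: /tmp/dataset.tar.gz and /opt/data are hypothetical paths.
FROM buildpack-deps:jessie

# ADD downloads a remote URL as a plain file; it does not unpack it.
# With no trailing slash on the destination, the file is saved under that name.
ADD https://googledrive.com/host/<share_key> /tmp/dataset.tar.gz

# Unpack explicitly, then delete the archive from the final filesystem.
RUN mkdir -p /opt/data \
    && tar -xzf /tmp/dataset.tar.gz -C /opt/data \
    && rm /tmp/dataset.tar.gz
```

The archive still occupies space in the ADD layer even after the rm, so downloading and unpacking inside a single RUN step may give a smaller image.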

Tried that and many many other alternatives. :frowning:

Could you please post your Dockerfile? Otherwise it’s hard to guess.

Hi Michael,

Here are the contents of my Dockerfile. Thanks a lot for taking an interest :smile:

FROM buildpack-deps:jessie

# Linux deps

RUN set -ex \
    && apt-get update \
    && apt-get install -y postgresql-client libblas-dev liblapack-dev libatlas-base-dev gfortran \
    && apt-get purge -y python.*

# Python deps

ENV LANG C.UTF-8
ENV GPG_KEY C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF
ENV PYTHON_VERSION 2.7.11
ENV PYTHON_PIP_VERSION 8.1.1

RUN set -ex \
    && curl -fSL "https://www.python.org/ftp/python/${PYTHON_VERSION%%[a-z]*}/Python-$PYTHON_VERSION.tar.xz" -o python.tar.xz \
    && curl -fSL "https://www.python.org/ftp/python/${PYTHON_VERSION%%[a-z]*}/Python-$PYTHON_VERSION.tar.xz.asc" -o python.tar.xz.asc \
    && export GNUPGHOME="$(mktemp -d)" \
    && gpg --keyserver ha.pool.sks-keyservers.net --recv-keys "$GPG_KEY" \
    && gpg --batch --verify python.tar.xz.asc python.tar.xz \
    && rm -r "$GNUPGHOME" python.tar.xz.asc \
    && mkdir -p /usr/src/python \
    && tar -xJC /usr/src/python --strip-components=1 -f python.tar.xz \
    && rm python.tar.xz \
    && cd /usr/src/python \
    && ./configure --enable-shared --enable-unicode=ucs4 \
    && make -j$(nproc) \
    && make install \
    && ldconfig \
    && curl -fSL 'https://bootstrap.pypa.io/get-pip.py' | python2 \
    && pip install --no-cache-dir --upgrade pip==$PYTHON_PIP_VERSION \
    && find /usr/local \
        \( -type d -a -name test -o -name tests \) \
        -o \( -type f -a -name '*.pyc' -o -name '*.pyo' \) \
        -exec rm -rf '{}' + \
    && rm -rf /usr/src/python

RUN pip install --no-cache-dir virtualenv

# App

ENV WORKDIR /usr/src/app
ENV IMG_BASE_DIR $WORKDIR/static/images

RUN mkdir -p $WORKDIR
WORKDIR $WORKDIR
VOLUME $WORKDIR

COPY requirements.txt $WORKDIR
RUN pip install --no-cache-dir -r requirements.txt
COPY . $WORKDIR

# Fetch datasets from Google Drive

ENV FACESCRUB_DRIVE_ID 0Bzt4aP7vYnOTYXpodUFmMlY4WVU
ADD https://googledrive.com/host/$FACESCRUB_DRIVE_ID/facescrub_dataset.tar.gz $WORKDIR/

# This is a debug/dev dataset, it can be removed once the IMG_BASE_DIR is in S3

ADD https://googledrive.com/host/$FACESCRUB_DRIVE_ID/seed_images.tar.gz $IMG_BASE_DIR/

CMD /usr/local/bin/gunicorn web.wsgi:application -w 2 -b :8000 --reload

Cheers!

Marcelo
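
One possible explanation, based on the Dockerfile reference and not confirmed anywhere in this thread: the reference states that if build steps change data inside a volume after it has been declared with VOLUME, those changes are discarded. In the Dockerfile above, VOLUME $WORKDIR comes before the dataset steps, which would fit RUN-extracted files disappearing under $WORKDIR. The fragment below is a sketch of a reordered "App" section under that assumption. It downloads with curl (already used earlier in the Dockerfile) and assumes the Google Drive URLs serve the raw archives. If docker-compose also mounts a host directory over $WORKDIR (the compose file was not posted, so this is a guess), the mount would hide these files regardless.

```dockerfile
# Sketch only: meant to replace the "App" and dataset sections posted above.
ENV WORKDIR /usr/src/app
ENV IMG_BASE_DIR $WORKDIR/static/images
ENV FACESCRUB_DRIVE_ID 0Bzt4aP7vYnOTYXpodUFmMlY4WVU

RUN mkdir -p $WORKDIR $IMG_BASE_DIR
WORKDIR $WORKDIR

COPY requirements.txt $WORKDIR/
RUN pip install --no-cache-dir -r requirements.txt
COPY . $WORKDIR

# Download and unpack in one RUN step so no archive is left behind in a layer.
RUN curl -fSL "https://googledrive.com/host/$FACESCRUB_DRIVE_ID/facescrub_dataset.tar.gz" \
        | tar -xz -C "$WORKDIR" \
    && curl -fSL "https://googledrive.com/host/$FACESCRUB_DRIVE_ID/seed_images.tar.gz" \
        | tar -xz -C "$IMG_BASE_DIR"

# Declared after the steps that write into it, so their output is kept.
VOLUME $WORKDIR

CMD /usr/local/bin/gunicorn web.wsgi:application -w 2 -b :8000 --reload
```

Running docker run --rm <image> ls $WORKDIR against the built image, without docker-compose, is one way to check whether the files made it into the image at all.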