Pip installing the correct Python packages during cross-compiling

janeswh · July 12, 2023, 3:03pm

I’m trying to build a multi-platform image using a multi-stage build with cross-compiling, but I’m a bit lost as to how and where I should run pip3 install to get the correct Python packages for either amd64 or arm. Building with the Dockerfile below produces containers that run correctly on the native build platform but fail with an import error for Python packages on the non-native platform (e.g. if I build on amd64 Linux and pull the image on an M1 Mac, I get an import error for numpy). I’ve tried moving pip3 install to the second (runtime) stage, but that doesn’t fix my problem. Any insight would be appreciated!

Dockerfile:

FROM --platform=$BUILDPLATFORM python:3.11-slim-bookworm AS build
ARG TARGETPLATFORM

ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

COPY requirements.txt ./
RUN pip3 install -r requirements.txt

# ARG TARGETPLATFORM

FROM --platform=$TARGETPLATFORM python:3.11-slim-bookworm AS runtime
# setup user and group ids
ARG USER_ID=1000
ENV USER_ID $USER_ID
ARG GROUP_ID=1000
ENV GROUP_ID $GROUP_ID

# add non-root user and give permissions to workdir
RUN groupadd --gid $GROUP_ID user && \
          adduser user --ingroup user --gecos '' --disabled-password --uid $USER_ID && \
          mkdir -p /usr/src/app_dir && \
          chown -R user:user /usr/src/app_dir

# copy from build image
COPY --chown=user:user --from=build /opt/venv /opt/venv

RUN apt-get update && apt-get install --no-install-recommends -y tk \
    && rm -rf /var/lib/apt/lists/* 

# set working directory
WORKDIR /app_dir

# switch to non-root user
USER user

# Path
ENV PATH="/opt/venv/bin:$PATH"

COPY ./app .

EXPOSE 8501
HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
ENTRYPOINT ["streamlit", "run", "Home.py", "--server.fileWatcherType=none", "--server.port=8501", "--server.address=0.0.0.0", "--server.headless=true"]

Error message:

File "/opt/venv/lib/python3.11/site-packages/streamlit/type_util.py", line 40, in <module>
    import numpy as np
  File "/opt/venv/lib/python3.11/site-packages/numpy/__init__.py", line 139, in <module>
    from . import core
  File "/opt/venv/lib/python3.11/site-packages/numpy/core/__init__.py", line 49, in <module>
    raise ImportError(msg)
ImportError:

rimelek · July 12, 2023, 9:22pm

Python virtual environments can contain binaries as well, so I wouldn’t copy one to a different architecture.
The error message points to numpy/core. If you go to the site-packages directory in the venv, you will find .so files there:

cd venv/lib/python3.11/site-packages/numpy/core
find . -name '*.so'

./_multiarray_tests.cpython-311-aarch64-linux-gnu.so
./_umath_tests.cpython-311-aarch64-linux-gnu.so
./_operand_flag_tests.cpython-311-aarch64-linux-gnu.so
./_simd.cpython-311-aarch64-linux-gnu.so
./_rational_tests.cpython-311-aarch64-linux-gnu.so
./_struct_ufunc_tests.cpython-311-aarch64-linux-gnu.so
./_multiarray_umath.cpython-311-aarch64-linux-gnu.so

Those are binaries, so you will need to use a single stage and docker buildx build to build for multiple architectures.
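A rough way to confirm this from inside a container is to read the architecture out of each extension file name and compare it with the machine the interpreter is running on. The sketch below does that; the helper names extension_arch and mismatched_extensions are made up for illustration, and the parsing assumes Linux CPython file names like the ones listed above. The comparison is naive and only dependable for x86_64 and aarch64, where the file-name tag and platform.machine() use the same spelling.

```python
import platform
import re
import sysconfig
from pathlib import Path

# Matches Linux CPython extension names such as
# "_simd.cpython-311-aarch64-linux-gnu.so" and captures the architecture part.
_SO_TAG = re.compile(r"\.cpython-\d+-(?P<arch>[A-Za-z0-9_]+)-linux-\w+\.so$")


def extension_arch(filename):
    """Return the architecture embedded in a .so file name, or None if absent."""
    match = _SO_TAG.search(filename)
    return match.group("arch") if match else None


def mismatched_extensions(site_packages, machine=None):
    """List .so files under site_packages that are tagged for another machine."""
    machine = machine or platform.machine()
    root = Path(site_packages)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*.so")
        if extension_arch(path.name) not in (None, machine)
    )


if __name__ == "__main__":
    # "platlib" is the directory where platform-specific packages are installed.
    for name in mismatched_extensions(sysconfig.get_paths()["platlib"]):
        print("built for another architecture:", name)
```

Run with the venv's interpreter on the machine that shows the import error, this should list the numpy/core files if the venv was filled on a different architecture.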

janeswh · July 12, 2023, 10:35pm

Gotcha, so basically option #3 in the Docker documentation won’t work, and I need to use either option 1 or 2 (qemu/native nodes) for docker buildx build? I’m assuming that I will need to remove the stages and the $BUILDPLATFORM and $TARGETPLATFORM variables from my Dockerfile?

rimelek · July 13, 2023, 5:18pm

If by “Option #3” you mean “Multi-stage builds” as the third menu item under “Building images”, then yes, a multi-stage build will not help you. It can be useful for the kind of builds that the documentation shows (building a binary for the same architecture), but it has nothing to do with multi-platform builds. The TARGETPLATFORM and BUILDPLATFORM variables are useful only in the following cases:

When you want to detect the architecture in a RUN instruction. You can pass the platform options on the command line, and if BuildKit is enabled, you can use those variables to make sure you download a binary from the internet for the right architecture. Package managers like apt or dnf detect the architecture automatically, but an HTTP request made with curl, for example, does not.
When you have a builder that can itself build binaries for different architectures, which you can then copy to another stage. Go is one example: you can run the build on the architecture of the host machine (BUILDPLATFORM) and generate a binary for another architecture (TARGETPLATFORM). This is very useful when the software you use to build the binary does not run properly on a specific architecture or in an emulated environment.
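To make the first case concrete, here is a minimal sketch of picking a download URL from TARGETPLATFORM, written as a small Python script that a RUN step could call. The value has the form os/arch with an optional variant (for example linux/amd64 or linux/arm/v7). The function names, the example.com URL layout, the tool name and the version are all hypothetical and only there for illustration.

```python
import os


def split_platform(value):
    """Split an 'os/arch[/variant]' string such as 'linux/arm64' or 'linux/arm/v7'."""
    parts = value.split("/")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"unexpected platform string: {value!r}")
    os_name, arch = parts[0], parts[1]
    variant = parts[2] if len(parts) == 3 else ""
    return os_name, arch, variant


def download_url(target_platform, version="1.0.0"):
    """Build a release URL for the target platform (hypothetical URL layout)."""
    os_name, arch, variant = split_platform(target_platform)
    suffix = f"{os_name}-{arch}{variant}"
    return f"https://example.com/releases/v{version}/tool-{suffix}.tar.gz"


if __name__ == "__main__":
    # After "ARG TARGETPLATFORM" is declared in a stage, the value is visible
    # to RUN steps as an environment variable. The fallback is only for
    # running this script outside a build.
    print(download_url(os.environ.get("TARGETPLATFORM", "linux/amd64")))
```

The script only builds the URL; the actual download would still be done with curl or a similar tool in the same RUN step.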

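As for removing the stages and the $BUILDPLATFORM and $TARGETPLATFORM variables from your Dockerfile: yes.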

janeswh · July 14, 2023, 1:17pm

If by “Option #3” you mean “Multi-stage builds” as the third menu item under “Building images”, then yes, multi-stage build will not help you.

I actually meant #3 in the same page that you linked earlier:

You can build multi-platform images using three different strategies that are supported by Buildx and Dockerfiles:

Using the QEMU emulation support in the kernel

Building on multiple native nodes using the same builder instance

Using a stage in Dockerfile to cross-compile to different architectures

I was basically trying to have both a multi-stage build (for smaller image size), then trying to do #3 from the above (using TARGETPLATFORM and BUILDPLATFORM) in order to download the correct binary for the architecture via pip install, i.e. as described in your second bullet point.

But to make sure I’m understanding correctly: pip install isn’t building binaries, just downloading them, so the PLATFORM variables don’t make sense here? So if I run docker buildx build without the variables in the Dockerfile, pip will get the correct binaries for each platform in the image. I was also confused about how building a multi-platform image using the “QEMU emulation support in the kernel” is different from e.g. an M1 laptop running an amd64 image using QEMU emulation (and being very slow).

Thanks so much for your patience; I’m very new to all this!

rimelek · July 14, 2023, 7:48pm

Oh… Sorry I missed that list. When you scroll down to the details of that point you can read this:

Finally, depending on your project, the language that you use may have good support for cross-compilation.

So it is exactly what I explained using “go” as an example.

I see. Unfortunately pip can’t build for another architecture. It was not designed for that.

As far as I know, it could trigger a build, but only for the architecture on which you are running pip. It doesn’t matter though, because downloading a prebuilt binary or building locally will not make any difference unless the tool you use to build or download binaries supports targeting another architecture. pip doesn’t.
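By default pip picks packages for the interpreter it is running under, so a quick way to see what a given stage will get is to ask that interpreter what it reports. The sketch below uses only the standard library; the function name interpreter_platform is made up for illustration. In a stage pinned to $BUILDPLATFORM this should report the build machine, while in a stage built for the target platform (natively or under emulation) it should report the target.

```python
import importlib.machinery
import platform
import sysconfig


def interpreter_platform():
    """Collect what the running interpreter reports about its platform."""
    return {
        # CPU architecture as seen by the interpreter, e.g. "x86_64" or "aarch64".
        "machine": platform.machine(),
        # Platform string used for builds, e.g. "linux-x86_64".
        "platform": sysconfig.get_platform(),
        # File-name suffixes this interpreter accepts for extension modules.
        "extension_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
    }


if __name__ == "__main__":
    for key, value in interpreter_platform().items():
        print(f"{key}: {value}")
```

Comparing the extension suffixes printed here with the .so names in site-packages shows whether the installed packages match the interpreter that is trying to import them.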

docker buildx build can take multiple platforms, build for each of them, and automatically push the result to the registry, so your multi-platform image can be pulled on another machine according to that machine’s platform. Someone also pointed out recently that buildx already supports creating the builds without pushing them, but I didn’t look into it. My point here is that you don’t need to run the build multiple times. If you want to use a different approach and create images like yourname/image:v1-arm64 and yourname/image:v1-amd64, that is also commonly done, although in that case everyone needs to know their architecture to pull the right image.
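As a sketch of what a single multi-platform invocation looks like, the helper below assembles the docker buildx build arguments for several platforms at once. The function name buildx_command and the image name are placeholders; --platform, --tag and --push are the buildx options involved. It only builds the argument list and does not run Docker.

```python
import shlex


def buildx_command(image, platforms, push=True, context="."):
    """Assemble a `docker buildx build` invocation as an argument list."""
    command = [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),  # e.g. linux/amd64,linux/arm64
        "--tag", image,
    ]
    if push:
        # Push the multi-platform result to the registry after building.
        command.append("--push")
    command.append(context)
    return command


if __name__ == "__main__":
    args = buildx_command("yourname/image:v1", ["linux/amd64", "linux/arm64"])
    print(shlex.join(args))
```

The resulting list can be passed to subprocess.run(args, check=True), or the printed line can be pasted into a shell.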

Docker Desktop uses the same QEMU in its virtual machine. The only difference is that if you install QEMU and the necessary libraries on Linux and use Docker CE, you can use all of the resources of that host. Emulation will always be slower, though, and sometimes it doesn’t work at all, or works only after hours or days of research and debugging. Fortunately I didn’t have that problem, but I have seen people who had.
