Docker registry layer / image size limits when pushing to a self-hosted registry

Hi there,
I have a few images that seem to hit some kind of limit when I push them with my setup. From what I have gathered online so far, there are size limits like client_max_body_size for nginx as well as some timeouts. But neither seems to be the cause in my case, because the layer push fails at varying sizes and after varying times.

Setup

I have a self-hosted GitLab instance behind a Traefik reverse proxy, to which I added a Docker registry. It's set up with Docker, specifically with this docker-compose.yml:

services:
  redis:
    restart: always
    image: redis:6.2.6
    container_name: redis
    command:
      - --loglevel warning
    volumes:
      - redis-data:/data

  registry:
    restart: always
    image: registry:latest
    container_name: registry
    volumes:
      - /karla/registry/data/bind:/var/lib/registry
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.registry.entrypoints=web"
      - "traefik.http.routers.registry.rule=Host(`registry.mysite.com`)"
      #- "traefik.http.middlewares.registry-https-redirect.redirectscheme.scheme=https"
      #- "traefik.http.routers.registry.middlewares=registry-https-redirect"
      - "traefik.http.services.registry-secure.loadbalancer.server.port=5000"
      - "traefik.http.routers.registry-secure.entrypoints=websecure"
      - "traefik.http.routers.registry-secure.tls.certresolver=letsencrypt"
      - "traefik.http.routers.registry-secure.tls=true"
      - "traefik.http.routers.registry-secure.rule=Host(`registry.mysite.com`)"
      - traefik.docker.network=web_apps_net

  gitlab:
    restart: always
    image: gitlab/gitlab-ce:latest
    hostname: git.mysite.com
    container_name: git
    depends_on:
    - redis
    labels:
      #- "traefik.http.routers.gitlab.tls=true"
      - "traefik.http.routers.gitlab.tls.certresolver=letsencrypt"
      #- "traefik.http.routers.gitlab.entrypoints=websecure"
      #- "traefik.gitlab.redirect.permanent='true'"
      - traefik.enable=true
      - traefik.http.routers.gitlab_insecure.entrypoints=web
      - traefik.http.routers.gitlab_insecure.rule=Host(`git.mysite.com`)

      - traefik.http.routers.gitlab.entrypoints=websecure
      - traefik.http.routers.gitlab.rule=Host(`git.mysite.com`)
      #- traefik.http.routers.gitlab.tls.certresolver=letsencrypt-staging
      - traefik.http.services.gitlab.loadbalancer.server.port=80
      - traefik.docker.network=web_apps_net

      # Can't filter TCP traffic on SNI, see link below
      # https://community.containo.us/t/routing-ssh-traffic-with-traefik-v2/717/6
      - traefik.tcp.routers.gitlab-ssh.rule=HostSNI(`*`)
      - traefik.tcp.routers.gitlab-ssh.entrypoints=ssh
      - traefik.tcp.routers.gitlab-ssh.service=gitlab-ssh-svc
      - traefik.tcp.services.gitlab-ssh-svc.loadbalancer.server.port=22
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'https://git.mysite.com'
        gitlab_rails['initial_root_password'] = '${GITLAB_ROOT_PASSWORD}'
        nginx['listen_https'] = false
        nginx['listen_port'] = 80
        nginx['client_max_body_size'] = '200G'
    volumes:
    - gitlab-data:/var/opt/gitlab
    - /karla/gitlab/etc:/etc/gitlab
    - /karla/gitlab/logs:/var/log/gitlab/nginx

  gitlab-runner:
    image: gitlab/gitlab-runner:latest
    container_name: runner
    privileged: true
    restart: always
    volumes:
    - /karla/gitlab/runner-data/etc:/etc/gitlab-runner   # Correct mapping for runner config
    - /var/run/docker.sock:/var/run/docker.sock          # Correct mapping for Docker socket
    networks:
      - runner


networks:
  default:
    external: true
    name: web_apps_net
  runner:
    external: true
    name: runner-net

volumes:
  redis-data:
    driver_opts:
        type: "zfs"
        device: "karla/gitlab/redis-data"
  gitlab-data:
    driver_opts:
      type: "zfs"
      device: "karla/gitlab/gitlab-data"
  registry-data:
    driver_opts:
        type: "zfs"
        device: "karla/registry/data"

With this setup the registry works and I can push images to it as long as they are not too big.

The Issue

My problems begin when I try to push large layers (>800 MB) to the registry: they fail mid-push at seemingly random sizes and times. After several retries I always get
unknown: Client Closed Request.
I tried to adjust these nginx settings:

  • proxy_read_timeout 600s;
  • proxy_send_timeout 600s;
  • send_timeout 600s;
  • keepalive_timeout 600s;
  • client_max_body_size 200G;

which does not make any difference. I also tried pushing one of the images in question to the official Docker Hub, and that upload works just fine. Hence I'm pretty sure my setup is borked.
Does anyone have an idea what else I could try?
Thanks in advance

Have you tried setting the max body size in traefik as well?

see: Traefik Buffering Documentation - Traefik
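
For reference, a rough sketch of what that suggestion could look like as labels on the registry service. The middleware name "limit" and the byte value are only illustrative assumptions; the router name registry-secure is taken from the compose file in the question. Note that defining a middleware does nothing until a router references it. As far as I know, Traefik does not cap the request body size at all unless a buffering middleware is attached, so this mostly matters if a limit is already in place somewhere.

```yaml
services:
  registry:
    labels:
      # Define a buffering middleware; the name "limit" is arbitrary (assumption).
      - "traefik.http.middlewares.limit.buffering.maxRequestBodyBytes=200000000000"
      # Attach it to the HTTPS router from the compose file above.
      # Without this line the middleware is defined but never used.
      - "traefik.http.routers.registry-secure.middlewares=limit"
```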

Hi @meyay and thanks for your time!
I have added that and set it to 200 GB:

- "traefik.http.middlewares.limit.buffering.maxRequestBodyBytes=200000000000"

but it makes no difference. I did get a bit more information, though. Sometimes I don't get the unknown error, but this instead:

received unexpected HTTP status: 504 Gateway Timeout

The Traefik log does not show anything related, but the registry's log does (I think):

time="2024-11-25T08:45:30.377962426Z" level=error msg="response completed with error" err.code=unknown err.detail="client disconnected" err.message="unknown error" go.version=go1.20.8 http.request.contenttype="application/octet-stream" http.request.host=registry.mysite.com http.request.id=5eda5b66-[...] http.request.method=PATCH http.request.remoteaddr=[redacted] http.request.uri="/v2/cocotb/blobs/uploads/3421310e-f5a1-4750-87e9-9e901f7e64ff?_state=j7pJlYbz6Zhsb7k3X02o5zSZF-[...]" http.request.useragent="docker/27.3.1 go/go1.22.7 git-commit/41ca978 kernel/6.8.0-48-generic os/linux arch/amd64 UpstreamClient(Docker-Client/27.3.1 \(linux\))" http.response.contenttype="application/json; charset=utf-8" http.response.duration=58.433680062s http.response.status=500 http.response.written=89 vars.name=cocotb vars.uuid=3421310e-f5a1-4750-87e9-9e901f7e64ff 

So I tried to set:

  - "traefik.http.services.registry.loadbalancer.server.readtimeout=120s"
  - "traefik.http.services.registry.loadbalancer.server.writetimeout=120s"
  - "traefik.http.services.registry.loadbalancer.server.idletimeout=120s"

but Traefik does not like that and I get:

2024-11-25T08:13:21Z ERR error="field not found, node: readtimeout"
2024-11-25T08:13:21Z ERR error="field not found, node: idletimeout"
2024-11-25T08:13:21Z ERR error="field not found, node: writetimeout"

Found it :)

      - "--entrypoints.web.transport.respondingTimeouts.readTimeout=180m"
      - "--entrypoints.websecure.transport.respondingTimeouts.readTimeout=180m"

did the trick for me.
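
For anyone landing here later: these two flags are entrypoint options in Traefik's static configuration, so they belong on the Traefik container itself and not in the registry's labels (which is presumably why the per-service timeout labels above were rejected). Below is a minimal sketch, assuming Traefik runs as its own compose service configured via CLI flags. The image tag, provider flag, ports, and socket mount are assumptions for illustration; only the two readTimeout lines and the entrypoint names web and websecure come from this thread. As far as I know, recent Traefik releases (v2.11.2 and later, and v3) default readTimeout to 60s, which would fit the roughly 58s request duration in the registry log above.

```yaml
services:
  traefik:
    image: traefik:v3.1            # assumed tag, use whatever version you run
    restart: always
    command:
      - "--providers.docker=true"                 # assumption: Docker provider
      - "--entrypoints.web.address=:80"           # assumption: standard ports
      - "--entrypoints.websecure.address=:443"
      # The actual fix from this thread: allow long-running uploads
      # by raising the time Traefik waits to read the full request.
      - "--entrypoints.web.transport.respondingTimeouts.readTimeout=180m"
      - "--entrypoints.websecure.transport.respondingTimeouts.readTimeout=180m"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

180m is generous; a smaller value should also work as long as it is longer than your slowest layer upload.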
