Error from server (BadRequest): container "ray-head" in pod "raycluster-kuberay-head-fx4c9" is waiting to start: trying and failing to pull image

Following the quick start instructions for Ray on k8s, I got the errors below while inspecting the head pod:

 >  ✗ kubectl describe pod raycluster-kuberay-head-fx4c9  
     RAY_CLUSTER_NAME:                      (v1:metadata.labels['ray.io/cluster'])
      RAY_CLOUD_INSTANCE_ID:                raycluster-kuberay-head-fx4c9 (v1:metadata.name)
      RAY_NODE_TYPE_NAME:                    (v1:metadata.labels['ray.io/group'])
      KUBERAY_GEN_RAY_START_CMD:            ray start --head  --metrics-export-port=8080  --block  --dashboard-agent-listen-port=52365  --num-cpus=1  --memory=2000000000  --dashboard-host=0.0.0.0 
      RAY_PORT:                             6379
      RAY_ADDRESS:                          127.0.0.1:6379
      RAY_USAGE_STATS_KUBERAY_IN_USE:       1
      RAY_USAGE_STATS_EXTRA_TAGS:           kuberay_version=v1.2.1;kuberay_crd=RayCluster
      REDIS_PASSWORD:                       
      RAY_DASHBOARD_ENABLE_K8S_DISK_USAGE:  1
    Mounts:
      /dev/shm from shared-mem (rw)
      /tmp/ray from log-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8gxsm (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  log-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  shared-mem:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  2G
  kube-api-access-8gxsm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Guaranteed
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  14m                   default-scheduler  Successfully assigned default/raycluster-kuberay-head-fx4c9 to minikube
  Warning  Failed     14m (x2 over 14m)     kubelet            Failed to pull image "rayproject/ray:2.34.0": Error response from daemon: Get "https://registry-1.docker.io/v2/": context deadline exceeded
  Normal   Pulling    12m (x4 over 14m)     kubelet            Pulling image "rayproject/ray:2.34.0"
  Warning  Failed     12m (x4 over 14m)     kubelet            Error: ErrImagePull
  Warning  Failed     12m (x2 over 13m)     kubelet            Failed to pull image "rayproject/ray:2.34.0": Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     12m (x6 over 14m)     kubelet            Error: ImagePullBackOff
  Normal   BackOff    4m52s (x36 over 14m)  kubelet            Back-off pulling image "rayproject/ray:2.34.0"

> ✗ kubectl logs raycluster-kuberay-head-fx4c9        
Error from server (BadRequest): container "ray-head" in pod "raycluster-kuberay-head-fx4c9" is waiting to start: trying and failing to pull image

I also tried a minikube cluster instead of the kind cluster, but got the same output after installing the RayCluster with helm install raycluster kuberay/ray-cluster --version 1.2.1.
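
For reference, these are roughly the quick start steps I followed (operator chart first, then the cluster chart, both at version 1.2.1):

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator --version 1.2.1
helm install raycluster kuberay/ray-cluster --version 1.2.1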

I don't understand the request timeout for https://registry-1.docker.io/v2/ in the describe pod output, since pulling the same image succeeds when I run docker directly on the host:

> ✗ docker pull rayproject/ray:2.34.0                  
2.34.0: Pulling from rayproject/ray
9b857f539cb1: Pull complete 
6385f74c231a: Pull complete 
d93807efc02c: Pull complete 
4f4fb700ef54: Pull complete 
e00095e97bc6: Pull complete 
62bc57cef369: Pull complete 
0c606dbe74e6: Pull complete 
0d71e9581bc0: Pull complete 
83d53ff2b179: Pull complete 
Digest: sha256:d3f0831b510ce4569499540117a6d1b7c36b9f924616097657364d0c8f069f15
Status: Downloaded newer image for rayproject/ray:2.34.0
docker.io/rayproject/ray:2.34.0
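
One thing I have not verified is whether the pull works from inside the minikube node itself. My understanding is that the kubelet uses the container runtime inside the node, not my host Docker daemon, so I assume the more meaningful check would be something like this (assuming the node runs the Docker runtime; with containerd it would be crictl pull inside the node instead):

minikube ssh
docker pull rayproject/ray:2.34.0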

Thanks in advance for any suggestions.

Since you can pull the image with the docker client, it could be a Kubernetes networking issue. I would recommend asking about it on the Kubernetes forum, if you haven't done so already.
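
If the node itself cannot reach Docker Hub (DNS, proxy, or VPN issues inside the minikube/kind node are common causes of that "context deadline exceeded"), one possible workaround is to load the image you already pulled on the host into the cluster node, so the kubelet does not need to reach the registry at all (this sidesteps the network problem rather than fixing it):

# minikube
minikube image load rayproject/ray:2.34.0

# kind
kind load docker-image rayproject/ray:2.34.0

Since the pod uses a specific tag (2.34.0, not latest), the default imagePullPolicy is IfNotPresent, so the pre-loaded image should be picked up without changing the RayCluster spec.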