Complete failure of Docker

Hi,

I’m running the latest Docker on Ubuntu 22.04.

I removed the firewall and have already switched to iptables-legacy.

Yet, on a brand new docker install:

$ sudo apt-get install docker.io
[sudo] password for us: 
Sorry, try again.
[sudo] password for us: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  bridge-utils containerd dnsmasq-base pigz runc ubuntu-fan
Suggested packages:
  ifupdown aufs-tools cgroupfs-mount | cgroup-lite debootstrap docker-doc rinse zfs-fuse | zfsutils
The following NEW packages will be installed:
  bridge-utils containerd dnsmasq-base docker.io pigz runc ubuntu-fan
0 upgraded, 7 newly installed, 0 to remove and 4 not upgraded.
Need to get 0 B/66.8 MB of archives.
After this operation, 287 MB of additional disk space will be used.
Do you want to continue? [Y/n] 
Preconfiguring packages ...
Selecting previously unselected package pigz.
(Reading database ... 126290 files and directories currently installed.)
Preparing to unpack .../0-pigz_2.6-1_amd64.deb ...
Unpacking pigz (2.6-1) ...
Selecting previously unselected package bridge-utils.
Preparing to unpack .../1-bridge-utils_1.7-1ubuntu3_amd64.deb ...
Unpacking bridge-utils (1.7-1ubuntu3) ...
Selecting previously unselected package runc.
Preparing to unpack .../2-runc_1.1.0-0ubuntu1.1_amd64.deb ...
Unpacking runc (1.1.0-0ubuntu1.1) ...
Selecting previously unselected package containerd.
Preparing to unpack .../3-containerd_1.5.9-0ubuntu3.1_amd64.deb ...
Unpacking containerd (1.5.9-0ubuntu3.1) ...
Selecting previously unselected package dnsmasq-base.
Preparing to unpack .../4-dnsmasq-base_2.86-1.1ubuntu0.1_amd64.deb ...
Unpacking dnsmasq-base (2.86-1.1ubuntu0.1) ...
Selecting previously unselected package docker.io.
Preparing to unpack .../5-docker.io_20.10.12-0ubuntu4_amd64.deb ...
Unpacking docker.io (20.10.12-0ubuntu4) ...
Selecting previously unselected package ubuntu-fan.
Preparing to unpack .../6-ubuntu-fan_0.12.16_all.deb ...
Unpacking ubuntu-fan (0.12.16) ...
Setting up dnsmasq-base (2.86-1.1ubuntu0.1) ...
Setting up runc (1.1.0-0ubuntu1.1) ...
Setting up bridge-utils (1.7-1ubuntu3) ...
Setting up pigz (2.6-1) ...
Setting up containerd (1.5.9-0ubuntu3.1) ...
Created symlink /etc/systemd/system/multi-user.target.wants/containerd.service → /lib/systemd/system/containerd.service.
Setting up ubuntu-fan (0.12.16) ...
Created symlink /etc/systemd/system/multi-user.target.wants/ubuntu-fan.service → /lib/systemd/system/ubuntu-fan.service.
Setting up docker.io (20.10.12-0ubuntu4) ...
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /lib/systemd/system/docker.service.
Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /lib/systemd/system/docker.socket.
Processing triggers for dbus (1.12.20-2ubuntu4.1) ...
Processing triggers for man-db (2.10.2-1) ...
[us:seagoat:~]
$ docker run --rm --name nginx -p 8081:80 nginx:alpine
Unable to find image 'nginx:alpine' locally
alpine: Pulling from library/nginx
63b65145d645: Pull complete 
8c7e1fd96380: Pull complete 
86c5246c96db: Pull complete 
b874033c43fb: Pull complete 
dbe1551bd73f: Pull complete 
0d4f6b3f3de6: Pull complete 
2a41f256c40f: Pull complete 
Digest: sha256:6f94b7f4208b5d5391246c83a96246ca204f15eaf7e636cefda4e6348c8f6101
Status: Downloaded newer image for nginx:alpine
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2023/02/11 14:21:42 [notice] 1#1: using the "epoll" event method
2023/02/11 14:21:42 [notice] 1#1: nginx/1.23.3
2023/02/11 14:21:42 [notice] 1#1: built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r4) 
2023/02/11 14:21:42 [notice] 1#1: OS: Linux 5.15.0-60-generic
2023/02/11 14:21:42 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2023/02/11 14:21:42 [notice] 1#1: start worker processes
2023/02/11 14:21:42 [notice] 1#1: start worker process 30
2023/02/11 14:21:42 [notice] 1#1: start worker process 31
2023/02/11 14:21:42 [notice] 1#1: start worker process 32
2023/02/11 14:21:42 [notice] 1#1: start worker process 33

I run curl on the same machine:

$ curl -v http://localhost:8081/
*   Trying 127.0.0.1:8081...
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET / HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.81.0
> Accept: */*
> 
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

I have no clue why. I’ve already spent a day looking through forums and help pages, to no avail. This is a completely fresh install and the most basic thing possible, and it fails.

It seems to boil down to the bridge failing: 172.17.0.1 is pingable (this is the local IP of the docker0 interface), but 172.17.0.2 is not (this is supposedly the IP of the running nginx container).

Any ideas?

The category is “Docker Desktop for Linux”. Are you using Docker Desktop too?

  • The docker.io package is provided by Ubuntu. It is not the recommended way to install Docker CE
  • Docker Desktop runs a virtual machine and it is not the same as Docker CE
  • The recommended Docker CE installation is described here in the documentation: Install Docker Engine on Ubuntu | Docker Documentation

Also make sure you have only one Docker installed on your system.

Thank you for the tip!

I never expected that a package called docker.io would not in fact be affiliated with or recommended by Docker.

I will try the packages you mentioned and report back.

I took that as a “No” to my desktop-related question so I moved the topic under “DockerEngine” :slight_smile:

docker.io should work too, but it is better to install Docker from the repository mentioned in the documentation. I have seen docker.io failing when docker-ce (from Docker’s repository) worked fine.

The result is the same as with the previous version.

$ docker --version
Docker version 23.0.1, build a5ee5b1

I also noticed that something was not right with the interfaces. docker0 was down while the container was running:

$ ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 50:e5:49:ee:de:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.1/24 metric 1024 brd 10.0.0.255 scope global dynamic enp5s0
       valid_lft 86101sec preferred_lft 86101sec
    inet6 2a01:36d:111:438:52e5:49ff:feee:de06/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 594sec preferred_lft 594sec
    inet6 fe80::52e5:49ff:feee:de06/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:42:4f:96:eb brd ff:ff:ff:ff:ff:ff
    inet6 fe80::42:42ff:fe4f:96eb/64 scope link 
       valid_lft forever preferred_lft forever
9: veth548d875@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 36:20:e0:79:e5:14 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::3420:e0ff:fe79:e514/64 scope link 
       valid_lft forever preferred_lft forever
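
In the listing above, docker0 is in state DOWN and has no "inet" line at all, only an inet6 link-local address. A minimal sketch for checking that automatically follows. The function name iface_has_ipv4 is made up for illustration and is not part of Docker or iproute2. It only parses `ip addr` text piped into it:

```shell
# Hypothetical helper: reads `ip addr` output on stdin and succeeds (exit 0)
# only if the interface named in $1 has at least one IPv4 ("inet") address.
iface_has_ipv4() {
    awk -v want="$1" '
        # Interface header lines look like "3: docker0: <...>" or "9: veth...@if8: <...>"
        /^[0-9]+: / {
            name = $2
            sub(/:$/, "", name)   # drop the trailing colon
            sub(/@.*/, "", name)  # drop the "@if8" peer suffix on veth names
            next
        }
        # Address lines are indented; "inet" is IPv4, "inet6" is IPv6
        name == want && $1 == "inet" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Possible usage: `ip addr show | iface_has_ipv4 docker0 || echo "docker0 has no IPv4 address"`. With the default bridge shown in this thread, a healthy docker0 would be expected to carry 172.17.0.1.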

The routing table was also missing the docker0 interface (though this could be normal if the interface is handled in iptables):

$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         ax86s           0.0.0.0         UG    1024   0        0 enp5s0
10.0.0.0        0.0.0.0         255.255.255.0   U     1024   0        0 enp5s0
ax86s           0.0.0.0         255.255.255.255 UH    1024   0        0 enp5s0
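
The same check can be done on `ip route` output, which is a bit easier to parse than the legacy `route` table. This is only a sketch, and route_uses_dev is a hypothetical helper name. On a working default setup one would normally expect a line for 172.17.0.0/16 with "dev docker0" in it:

```shell
# Hypothetical helper: reads `ip route` output on stdin and succeeds (exit 0)
# if at least one route goes through the device named in $1.
route_uses_dev() {
    awk -v dev="$1" '
        {
            # Look for the token pair "dev <name>" anywhere on the line
            for (i = 1; i < NF; i++)
                if ($i == "dev" && $(i + 1) == dev) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Possible usage: `ip route show | route_uses_dev docker0 || echo "no route via docker0"`.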

What now?

You should indeed see the routes in the routing table.

What happens when you list docker networks?

docker network ls

If you don’t have anything that you want to keep, you can try to stop docker, remove /var/lib/docker and start docker again.

sudo systemctl stop docker
sudo rm -rf /var/lib/docker
sudo systemctl start docker

If you have some data that you want to keep, you could try removing only the database file of the networks:

sudo systemctl stop docker
sudo unlink /var/lib/docker/network/files/local-kv.db
sudo systemctl start docker
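
A slightly more careful variant of the second reset, as a sketch only. systemd can re-activate docker.service through docker.socket, so this stops both units first. The function name and the DRY_RUN switch are made up for illustration. The database path is the one given above:

```shell
# Hypothetical wrapper: stop Docker (service and socket), remove the network
# database, and start Docker again.
# Set DRY_RUN=1 to print the commands instead of running them.
reset_docker_network_db() {
    run=""
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo"
    fi
    # Stop the socket too, otherwise it may start the service again on demand
    $run sudo systemctl stop docker.socket docker.service
    # -f: do not fail if the file is already gone
    $run sudo rm -f /var/lib/docker/network/files/local-kv.db
    $run sudo systemctl start docker.service
}
```

Running `DRY_RUN=1 reset_docker_network_db` first shows exactly what would be executed.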

Docker networks:

$ docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
e3f83b36c00f   bridge    bridge    local
5a35c695f5d0   host      host      local
7a54fa0fd0ec   none      null      local

This is a brand new install of Docker, but I tried what you suggested anyway:

$ sudo systemctl stop docker
[sudo] password for us: 
Warning: Stopping docker.service, but it can still be activated by:
  docker.socket
$ systemctl status docker
○ docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Sat 2023-02-11 18:02:30 CET; 11s ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
    Process: 921 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=0/SUCCESS)
   Main PID: 921 (code=exited, status=0/SUCCESS)
        CPU: 1.366s

Feb 11 18:02:28 seagoat systemd[1]: Stopping Docker Application Container Engine...
Feb 11 18:02:30 seagoat dockerd[921]: time="2023-02-11T18:02:30.576005506+01:00" level=info msg="[core] [Channel #1] Channel Connectivity change to S>
Feb 11 18:02:30 seagoat dockerd[921]: time="2023-02-11T18:02:30.576067760+01:00" level=info msg="[core] [Channel #1 SubChannel #2] Subchannel Connect>
Feb 11 18:02:30 seagoat dockerd[921]: time="2023-02-11T18:02:30.576100513+01:00" level=info msg="[core] [Channel #1 SubChannel #2] Subchannel deleted>
Feb 11 18:02:30 seagoat dockerd[921]: time="2023-02-11T18:02:30.576118842+01:00" level=info msg="[core] [Channel #1] Channel deleted" module=grpc
Feb 11 18:02:30 seagoat dockerd[921]: time="2023-02-11T18:02:30.576232316+01:00" level=info msg="stopping event stream following graceful shutdown" e>
Feb 11 18:02:30 seagoat dockerd[921]: time="2023-02-11T18:02:30.576557771+01:00" level=info msg="Daemon shutdown complete"
Feb 11 18:02:30 seagoat systemd[1]: docker.service: Deactivated successfully.
Feb 11 18:02:30 seagoat systemd[1]: Stopped Docker Application Container Engine.
Feb 11 18:02:30 seagoat systemd[1]: docker.service: Consumed 1.366s CPU time.
$ sudo rm -rf /var/lib/docker/
$ sudo systemctl start docker
$ docker run --rm --name nginx -p 8081:80 nginx:alpine
Unable to find image 'nginx:alpine' locally
alpine: Pulling from library/nginx
63b65145d645: Pull complete 
8c7e1fd96380: Pull complete 
86c5246c96db: Pull complete 
b874033c43fb: Pull complete 
dbe1551bd73f: Pull complete 
0d4f6b3f3de6: Pull complete 
2a41f256c40f: Pull complete 
Digest: sha256:6f94b7f4208b5d5391246c83a96246ca204f15eaf7e636cefda4e6348c8f6101
Status: Downloaded newer image for nginx:alpine
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2023/02/11 17:03:41 [notice] 1#1: using the "epoll" event method
2023/02/11 17:03:41 [notice] 1#1: nginx/1.23.3
2023/02/11 17:03:41 [notice] 1#1: built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r4) 
2023/02/11 17:03:41 [notice] 1#1: OS: Linux 5.15.0-60-generic
2023/02/11 17:03:41 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2023/02/11 17:03:41 [notice] 1#1: start worker processes
2023/02/11 17:03:41 [notice] 1#1: start worker process 30
2023/02/11 17:03:41 [notice] 1#1: start worker process 31
2023/02/11 17:03:41 [notice] 1#1: start worker process 32
2023/02/11 17:03:41 [notice] 1#1: start worker process 33

and then:

$ curl -v http://localhost:8081/
*   Trying 127.0.0.1:8081...
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET / HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.81.0
> Accept: */*
> 
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

The container doesn’t respond to pings, and pinging the host from the container fails too:

$ docker exec -it nginx /bin/sh
/ # ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:02  
          inet addr:172.17.0.2  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:14 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2908 (2.8 KiB)  TX bytes:858 (858.0 B)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

/ # ping 172.17.0.1
PING 172.17.0.1 (172.17.0.1): 56 data bytes
^C
--- 172.17.0.1 ping statistics ---
20 packets transmitted, 0 packets received, 100% packet loss
/ # 

Something is really messed up with the bridge, but I can’t figure out what or how. There is also virtually no documentation or help on this topic.

It also works all right if I switch to host networking:

docker run --rm --network host --name nginx nginx:alpine

So it seems like container bridging is b0rked somehow, but I can’t figure out where and why. There are no logs and no indication as to what might be wrong here.

For reference, this is the bridge network info:

$ docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "5489534409f555e6c6f65a0a0776c908766a896832d6cb6fea0f41989c37588c",
        "Created": "2023-02-11T20:14:41.21598395+01:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "79aef83f7d936c80aba33c7a298d558001319d3fa0c1c7f167248bf83157c022": {
                "Name": "nginx",
                "EndpointID": "3682ba2d17058af11ff7d722dca96fd42a38d41412564e8d4f5c69654ccb59e0",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]

After a long round of trial and error, the issue was resolved by adjusting the systemd-networkd config. I use systemd-networkd to configure DHCP for the ethernet adapter. However, I had done it with this config file, /etc/systemd/network/20-local-wired.network:

[Match]
Name=*

[Network]
DHCP=yes

Although this was a heat-of-the-moment quick fix to simplify the netplan/NetworkManager mess, it worked all right until Docker’s fancy per-container interfaces appeared.

I don’t know why, but systemd-networkd somehow silently and weirdly breaks docker with the above config.

I’ve changed the match to en*, like this:

[Match]
Name=en*

[Network]
DHCP=yes

and suddenly docker started working by the book.
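
A rough way to see the difference between the two [Match] sections: systemd.network documents Name= as a list of shell-style globs, so the shell’s own `case` matching is a reasonable approximation. The sketch below uses a made-up helper, matched_by, and the interface names from the `ip addr list` output earlier in the thread. It only illustrates the glob behaviour and does not query systemd-networkd:

```shell
# Hypothetical helper: prints each interface name (arguments 2..n) that the
# shell-style glob in $1 matches.
matched_by() {
    pattern=$1
    shift
    for name in "$@"; do
        # $pattern is deliberately unquoted so `case` treats it as a glob
        case $name in
            $pattern) printf '%s\n' "$name" ;;
        esac
    done
}

# Name=* matches every interface, including docker0 and the veth device
matched_by '*'   lo enp5s0 docker0 veth548d875

# Name=en* matches only the wired adapter
matched_by 'en*' lo enp5s0 docker0 veth548d875
```

The first call prints all four names, the second prints only enp5s0. A likely explanation for the breakage, which nothing in this thread confirms, is that with Name=* networkd also takes over docker0 and the veth interfaces and applies DHCP to them.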

Leaving this here for future google hits. :slight_smile:
Thanks for thinking along and being my rubber duck today!

Thanks for sharing your solution so I could learn a new way to break container networking :slight_smile:
I used a similar approach only once, when I wanted netplan to manage some VLAN interfaces but not all of them. In that case I used the unmanaged-devices option of NetworkManager and /etc/netplan/network.yaml to define which interfaces should use DHCP and which shouldn’t. This is just another idea that could be useful sometime.

This is a level of networking which is not part of the official documentation so we have to deal with it using other sources. I am glad that you could figure it out. I don’t think I would have thought of that configuration.

When updating to 22.04, I noticed that there are many ways to obtain an IP address via DHCP. I also noticed that systemd-networkd can actually do this, but it’s turned off by default. So I thought, let’s not complicate our lives: I removed both netplan.io and NetworkManager and simply enabled systemd-networkd to do its job. And it worked! Far fewer moving parts, just the way I like it.

With the above caveat. :stuck_out_tongue:
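
For anyone with a similar systemd-networkd-only setup, here is a sketch for spotting this situation. It assumes `networkctl list` prints rows in the column order IDX, LINK, TYPE, OPERATIONAL, SETUP, and that links networkd leaves alone show "unmanaged" in the SETUP column. The helper name is made up, and the prefixes docker, br- and veth are an assumption about how Docker names its interfaces:

```shell
# Hypothetical helper: reads `networkctl list` output on stdin and prints
# Docker-looking links that networkd reports as anything but "unmanaged".
docker_links_managed_by_networkd() {
    awk '
        # Data rows start with a numeric index and have at least five columns;
        # this skips the header line and the "N links listed." footer.
        $1 ~ /^[0-9]+$/ && NF >= 5 &&
        $2 ~ /^(docker|br-|veth)/ && $5 != "unmanaged" { print $2 }
    '
}
```

Possible usage: `networkctl list --no-pager | docker_links_managed_by_networkd`. Any name printed here would be a hint that a .network file matches more interfaces than intended.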