Can't run ANYTHING: even hello-world fails with a "Link not found" error

I've been struggling with this problem on and off for six months, and I have completely run out of ideas.

When executing any docker image, even hello-world, I get the following error:

# docker run hello-world
container_linux.go:247: starting container process caused "process_linux.go:334: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: time=\\\"2017-06-18T11:14:27-04:00\\\" level=fatal msg=\\\"failed to add interface vethb386a4d to sandbox: failed to get link by name \\\\\\\"vethb386a4d\\\\\\\": Link not found\\\" \\n\""
docker: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:334: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: time=\\\"2017-06-18T11:14:27-04:00\\\" level=fatal msg=\\\"failed to add interface vethb386a4d to sandbox: failed to get link by name \\\\\\\"vethb386a4d\\\\\\\": Link not found\\\" \\n\"".
ERRO[0000] error getting events from daemon: net/http: request canceled 
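The nested escaping makes that message hard to read. For anyone who wants to look at it more comfortably, here is a minimal sketch of a filter that strips the backslashes in front of the quotes. The function name unescape_error is just something made up for illustration; it only rewrites text on stdin and does not touch docker itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of docker): replace every run of one or
# more backslashes that sits directly before a double quote with a plain
# double quote. Everything else on the line is passed through unchanged.
unescape_error() {
    sed -E 's/\\+"/"/g'
}
```

Assuming the error text is captured from both stdout and stderr, it could be used as: docker run hello-world 2>&1 | unescape_error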

The referenced veth changes each time, but otherwise the error is identical no matter which docker image I try to run. Every single one fails. A new veth also shows up in ifconfig every time I try to run a container; docker is failing to clean up the interfaces it creates, presumably because it dies before it reaches that stage. None of the leftover virtual interfaces have names that match the ones in the error messages, either.
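To make the name mismatch easy to check, here is a rough sketch of the comparison I have been doing by eye. It assumes docker's veth names are always "veth" followed by seven lowercase hex digits, which is only a guess based on the names seen in this post. The helper names veth_names and veth_in_listing, and the files they read, are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each docker-style veth name found on stdin
# (assumed pattern: "veth" + 7 lowercase hex digits), one per line,
# sorted and de-duplicated.
veth_names() {
    grep -oE 'veth[0-9a-f]{7}' | sort -u
}

# Hypothetical helper: succeed only if the first veth name found in the
# saved error text ($1) also appears in the saved interface listing ($2).
veth_in_listing() {
    wanted=$(veth_names < "$1" | head -n 1)
    [ -n "$wanted" ] && veth_names < "$2" | grep -qx "$wanted"
}
```

For example, after saving the error with "docker run hello-world 2> err.txt" and the interfaces with "ip -o link show > links.txt", running "veth_in_listing err.txt links.txt && echo present || echo missing" reports whether the veth from the error is among the interfaces on the host.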

Like, I literally cannot run any docker images. My workaround has been to create an LXC container and run my docker applications inside of that. It works, but feels pretty Inception-y.

Relevant environment data:

# uname -ar
Linux liquidity 4.8.0-54-generic #57~16.04.1-Ubuntu SMP Wed May 24 16:22:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.2 LTS
Release:	16.04
Codename:	xenial
# docker version
Client:
 Version:      17.05.0-ce
 API version:  1.29
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.05.0-ce
 API version:  1.29 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64
 Experimental: false
# docker info
Containers: 8
 Running: 0
 Paused: 0
 Stopped: 8
Images: 2
Server Version: 17.05.0-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 18
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.8.0-54-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.5GiB
Name: liquidity
ID: ERRY:IHG2:J4GB:IGVD:6Y2Q:2J56:ZWIX:KD3K:2TVR:BWL3:V2BG:PMOA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

The most recent troubleshooting step I took was upgrading my kernel from 4.4.0-72 to 4.8.0-54, then purging/reinstalling docker via the get.docker.com script. Absolutely no change.

Google has totally failed me. The error is apparently unusual enough that searching for it doesn’t return anything useful.

Any help at all would be appreciated. I’m happy to dig through log files and provide any info needed, and I could probably set up tcpdump and capture some traffic if asked. I am totally at the end of my rope, and posting this here is my last-resort Hail Mary.

I also attempted to manually create and use a bridge per these instructions, but that doesn’t change the behavior. The same error messages result.

edit - some elaboration:

Running brctl show returns this:

# brctl show
bridge name	bridge id		STP enabled	interfaces
dock0		8000.9eeed78f94b2	no		veth187aa50
lxdbr0		8000.fe21597ce281	no		vethBVI71A

The lxd bridge is for lxd, obviously, and the dock0 bridge is the one I manually created. It’s using veth187aa50, which does indeed exist:

# ifconfig
dock0     Link encap:Ethernet  HWaddr 9e:ee:d7:8f:94:b2  
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::b425:7fff:fe4b:ac08/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:648 (648.0 B)

eth0      Link encap:Ethernet  HWaddr 0c:c4:7a:e1:65:7e  
          inet addr:50.28.11.223  Bcast:50.28.11.255  Mask:255.255.254.0
          inet6 addr: fe80::ec4:7aff:fee1:657e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48797 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21073 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:27553980 (27.5 MB)  TX bytes:10022074 (10.0 MB)
          Memory:df200000-df27ffff 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:14664 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14664 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:11201922 (11.2 MB)  TX bytes:11201922 (11.2 MB)

lxdbr0    Link encap:Ethernet  HWaddr fe:21:59:7c:e2:81  
          inet addr:10.170.6.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::8875:feff:fe5a:792c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2259 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2234 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:6670133 (6.6 MB)  TX bytes:581593 (581.5 KB)

veth187aa50 Link encap:Ethernet  HWaddr 9e:ee:d7:8f:94:b2  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

vethBVI71A Link encap:Ethernet  HWaddr fe:21:59:7c:e2:81  
          inet6 addr: fe80::fc21:59ff:fe7c:e281/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2259 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2239 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:6701759 (6.7 MB)  TX bytes:581983 (581.9 KB)

…but the failure message for the latest run shows yet another veth:

# docker run hello-world
container_linux.go:247: starting container process caused "process_linux.go:334: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: time=\\\"2017-06-18T11:55:58-04:00\\\" level=fatal msg=\\\"failed to add interface veth8ce5e98 to sandbox: failed to get link by name \\\\\\\"veth8ce5e98\\\\\\\": Link not found\\\" \\n\""
docker: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:334: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: time=\\\"2017-06-18T11:55:58-04:00\\\" level=fatal msg=\\\"failed to add interface veth8ce5e98 to sandbox: failed to get link by name \\\\\\\"veth8ce5e98\\\\\\\": Link not found\\\" \\n\"".
ERRO[0000] error getting events from daemon: net/http: request canceled 

Does the veth in the error message refer to the internal veth inside the container?

Regardless, the problem persists with the manually created bridge.

Also, here’s a link to the docker log file output with debug-level logging enabled: https://pastebin.com/4xzu8RqB