All of a sudden I can't start new containers

Howdy all!

(First time poster, I hope I don’t mess up.)

Issue type: Operation failure
OS Version/build: Ubuntu Server 17.04
App version (docker I presume): 17.06.0-ce
Steps to reproduce: Start a new container

While I was messing around in Portainer today, I noticed I can’t run new containers any more. At first I thought it was a disk space issue: I had made a partitioning mistake and had to move /var/lib/docker, which is now symlinked to its new location.
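For context, the move itself looked roughly like the following (the /srv/docker path is just an illustrative example, not necessarily the location I used):

    # stop the daemon before touching its data directory
    sudo systemctl stop docker
    # move the data directory onto the larger partition (example path)
    sudo mv /var/lib/docker /srv/docker
    # leave a symlink behind at the old location
    sudo ln -s /srv/docker /var/lib/docker
    sudo systemctl start docker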

But the situation is the same. I was disappointed not to be able to easily find any dockerd logs on the system; perhaps I’m rusty.
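For reference, if dockerd logs to journald (which I believe is the default on Ubuntu with the stock systemd unit), the daemon logs should be viewable with something like:

    # daemon logs from the current boot, assuming the unit is named docker.service
    journalctl -u docker.service -b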

I did look into syslog and found this:

    Aug  3 20:17:31 darkcove systemd-udevd[3600]: Could not generate persistent MAC address for vethd6694f8: No such file or directory
    Aug  3 20:17:31 darkcove systemd-udevd[3599]: Could not generate persistent MAC address for veth75a44ea: No such file or directory
    Aug  3 20:17:31 darkcove kernel: [ 1274.221014] docker0: port 7(vethd6694f8) entered blocking state
    Aug  3 20:17:31 darkcove kernel: [ 1274.221015] docker0: port 7(vethd6694f8) entered disabled state
    Aug  3 20:17:31 darkcove kernel: [ 1274.221094] device vethd6694f8 entered promiscuous mode
    Aug  3 20:17:31 darkcove kernel: [ 1274.221185] IPv6: ADDRCONF(NETDEV_UP): vethd6694f8: link is not ready
    Aug  3 20:17:31 darkcove kernel: [ 1274.221186] docker0: port 7(vethd6694f8) entered blocking state
    Aug  3 20:17:31 darkcove kernel: [ 1274.221187] docker0: port 7(vethd6694f8) entered forwarding state
    Aug  3 20:17:31 darkcove kernel: [ 1274.221801] docker0: port 7(vethd6694f8) entered disabled state
    Aug  3 20:17:32 darkcove kernel: [ 1274.492317] eth0: renamed from veth75a44ea
    Aug  3 20:17:32 darkcove kernel: [ 1274.508243] IPv6: ADDRCONF(NETDEV_CHANGE): vethd6694f8: link becomes ready
    Aug  3 20:17:32 darkcove kernel: [ 1274.508263] docker0: port 7(vethd6694f8) entered blocking state
    Aug  3 20:17:32 darkcove kernel: [ 1274.508265] docker0: port 7(vethd6694f8) entered forwarding state
    Aug  3 20:17:32 darkcove kernel: [ 1274.693722] docker0: port 7(vethd6694f8) entered disabled state
    Aug  3 20:17:32 darkcove kernel: [ 1274.693761] veth75a44ea: renamed from eth0
    Aug  3 20:17:32 darkcove kernel: [ 1274.751969] docker0: port 7(vethd6694f8) entered disabled state
    Aug  3 20:17:32 darkcove kernel: [ 1274.753209] device vethd6694f8 left promiscuous mode
    Aug  3 20:17:32 darkcove kernel: [ 1274.753211] docker0: port 7(vethd6694f8) entered disabled state

I also found mentions of a bug around this veth issue, but couldn’t see a workaround; it was apparently only fixed in code :weary: Any ideas?


I’m new to Docker myself, but by the looks of it you’re having networking issues. From what I’ve picked up and from your log messages, it seems the docker user/group does not have network access; it may be as simple as adding the docker user/group to the networking group. It’s also possible you don’t have the necessary networking drivers enabled in your kernel. I believe you can build these as modules alongside the kernel without rebuilding the kernel itself, but I don’t know that for sure.

I would post your dmesg logs.
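Something along these lines should do, filtering for the bridge and veth messages (the interface names are just taken from your log above):

    # human-readable timestamps, only bridge/veth related lines
    dmesg -T | grep -Ei 'docker0|veth'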

As I said, I have limited experience, but it feels similar to something I once encountered where I had to rebuild the kernel with the networking modules enabled. Given your title, I might be on the wrong path, though.
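If you want to rule out the missing-driver theory, something like this should show whether the bridge and veth drivers are available (built-in drivers won’t show up in lsmod, which is what the config check is for):

    # drivers loaded as modules
    lsmod | grep -E 'bridge|veth'
    # =y means built in, =m means built as a module
    grep -E 'CONFIG_BRIDGE=|CONFIG_VETH=' /boot/config-$(uname -r)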

Hey carel, thanks for your suggestions, but this only started a few hours ago; until that point I was able to deploy new containers without issue. Nothing changed in terms of permissions or kernel driver modules.

It seems like the system is not able to connect the veth device to the bridge network, but I don’t know what else to do to resolve it. Strange that the existing containers are still working just peachy.
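For what it’s worth, this is roughly how I’ve been checking which veth interfaces actually end up attached to docker0 (plain iproute2 plus Docker’s own view of the bridge):

    # list interfaces enslaved to the docker0 bridge
    ip link show master docker0
    # Docker's view of the default bridge network and its attached containers
    docker network inspect bridge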

This seems to be related to systemd bug https://github.com/systemd/systemd/issues/3374, which is oddly still open given that it’s a pretty bad one. I still haven’t found a workaround, so for now I’m stuck. I’ll try to export all my containers and redeploy Docker; let’s see how that goes.
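The export step I have in mind is roughly the following (container and image names are placeholders; docker export only captures a container’s filesystem, so images would need docker save separately):

    # snapshot a container's filesystem
    docker export my-container > my-container.tar
    # keep the underlying image as well
    docker save my-image:latest > my-image.tar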

If anybody finds a fix, I would highly appreciate it. I can also provide debug logs from dockerd if necessary.

From the above linked issue, a workaround:

    # /etc/systemd/network/99-default.link
    [Link]
    NamePolicy=kernel database onboard slot path
    MACAddressPolicy=none

I’m testing the fix now and it does appear to resolve the persistent MAC address issue.
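For anyone following along, the steps were roughly these (I’m not certain whether restarting systemd-udevd is enough for the .link file to be picked up, so I rebooted afterwards to be safe):

    # create the drop-in directory and write the file from the post above
    sudo mkdir -p /etc/systemd/network
    printf '[Link]\nNamePolicy=kernel database onboard slot path\nMACAddressPolicy=none\n' | sudo tee /etc/systemd/network/99-default.link
    # restarting udev may be enough; a reboot definitely picks it up
    sudo systemctl restart systemd-udevd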