Docker run always fails with a timeout error

Hi… I have installed Docker for Windows on 2 systems: Windows Server 2016 and Windows 10. I have issues on the Windows Server 2016 system.

When set to Linux containers I can pull a Linux image and build and run containers with no issue. But when I switch to Windows containers I have problems. For example:

u:>docker run -it --name t1 microsoft/windowsservercore cmd
docker: Error response from daemon: container 3e22212b1b46c7085648610072625d65786fdaacb986ac3a378bc693553cbd1a encountered an error during Start: failure in a Windows system call: This operation returned because the timeout period expired. (0x5b4).

If I do the same thing on the Windows 10 system then there is no issue. I’m not even sure where to begin. I noticed that when set to Linux containers a “docker network ls” shows different networks (bridge/host/none) to when set to Windows (/nat/none)

The first of the Windows networks is named after a switch set up in Hyper-V. On the 2016 box this is actually a teamed NIC (maybe that’s an issue? But why would things be OK with Linux containers?)

When I compared the config of “nat” on both Windows 2016 and 10 (docker network inspect) I did see a difference…

Windows 2016

        "Options": null,
        "Config": [
            {
                "Subnet": "172.29.192.0/20",
                "Gateway": "172.29.192.1"
            }

Windows 10

        "Options": null,
        "Config": [
            {
                "Subnet": "0.0.0.0/0",
                "Gateway": "0.0.0.0"
            }

But I don’t see where I could change these settings (and I would have thought the 2016 settings more likely to be ok than the Windows 10 settings).
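For comparing the two “nat” configs without eyeballing the JSON, here is a rough Python sketch. It assumes the output of docker network inspect nat from each machine has been saved as text, and that the Subnet/Gateway entries sit under an "IPAM" key as in the excerpts above. The helper name ipam_pairs and the trimmed sample strings are made up for illustration.

```python
import json


def ipam_pairs(inspect_json):
    """Return the set of (subnet, gateway) pairs found in inspect output."""
    networks = json.loads(inspect_json)  # inspect prints a JSON array of networks
    pairs = set()
    for net in networks:
        ipam = net.get("IPAM") or {}
        for entry in ipam.get("Config") or []:
            pairs.add((entry.get("Subnet"), entry.get("Gateway")))
    return pairs


# Sample data trimmed to the fields quoted above; all other fields are omitted.
server_2016 = (
    '[{"Name": "nat", "IPAM": {"Options": null, "Config": '
    '[{"Subnet": "172.29.192.0/20", "Gateway": "172.29.192.1"}]}}]'
)
windows_10 = (
    '[{"Name": "nat", "IPAM": {"Options": null, "Config": '
    '[{"Subnet": "0.0.0.0/0", "Gateway": "0.0.0.0"}]}}]'
)

# Set differences show which subnet/gateway pairs exist on one box only.
only_2016 = ipam_pairs(server_2016) - ipam_pairs(windows_10)
only_win10 = ipam_pairs(windows_10) - ipam_pairs(server_2016)
print("Only on 2016:", sorted(only_2016))
print("Only on Windows 10:", sorted(only_win10))
```

This only reports the difference; it does not change any network settings.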

Any pointers for what’s going wrong here?

I have had this problem before on one particular server running on VMware ESXi, where the server seemed to have periodic I/O slowdown. I haven’t seen the problem since and am now using EC2 instances on AWS. I did hear from a Microsoft engineer on this and here’s what he had to say:

“How much IO bandwidth do you have? Does it seem to hang at the highlighted line about 1 minute then give up? Right now if it can’t bring up a container within 1 minute it fails. The current mitigations are to try again when more IO is available, or use faster storage. Each container start takes a lot of IOPS because the registry needs to be specialized, logs rolled, etc in the new container layer”
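The “try again when more IO is available” mitigation could be scripted roughly like this. It is only a sketch: the function name run_with_retry, the attempt count, and the 60-second pause are my own assumptions, not anything the engineer specified.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, delay_s=60,
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd; if it exits non-zero, wait and try again.

    attempts and delay_s are arbitrary illustrative defaults.
    runner and sleep are injectable so the logic can be tried without Docker.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return result  # container came up, stop retrying
        if attempt < attempts:
            sleep(delay_s)  # give the storage time to settle before retrying
    return result  # last failed result, for the caller to inspect


# Example call (not executed here). --name is left out on purpose, since a
# failed start may leave a container holding that name:
# run_with_retry(["docker", "run", "--rm", "microsoft/windowsservercore",
#                 "cmd", "/c", "ver"])
```

If every attempt fails, the last result is returned so the exit code can still be checked. Retrying does nothing for storage that is slow all the time; in that case faster storage is the other mitigation mentioned.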