Setting tcp_keepalive_time param for containers

Hi there,

We are running into an issue that seems to be a trivial one given the maturity of the Docker framework. I could not find a reliable solution from digging around, so this is my last hope.

We are running our production systems in DCOS, with Docker version 1.12 (commit id: d5236f0).

Goal:
We want to set the value for tcp_keepalive_time param in the container.

Approach 1:

  • Modified the docker-compose.yml.tmpl and set the value using sysctl.
    sysctls:
      - net.ipv4.tcp_keepalive_intvl=45
      - net.ipv4.tcp_keepalive_probes=15
      - net.ipv4.tcp_keepalive_time=295

Result:
Did not work. I can see the value set as an env parameter, but netstat shows that the socket is not picking up that value.

Approach 2:

  • Tried to modify the value via sysctl.
    Result:
    Failed since the procfs is a read-only file system on the container.

Can someone help me set this param effectively on the container?

Thanks.

I am doing this at the moment.
For the first approach, it is working: I checked /proc/sys/net/ipv4/tcp_keepalive_intvl, etc. in the container, and the values are good. In the docker-compose.yml file, you have to be careful not to add any spaces between the key and the value.
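In case it helps, here is a minimal sketch of the check described above, to be run inside the container. The function name show_keepalive is my own, and the optional directory argument is only there so the function can be pointed at another path; by default it reads /proc/sys/net/ipv4.

```shell
# Print the three TCP keepalive sysctls as seen from the current
# network namespace (i.e. from inside the container when run there).
# show_keepalive is a hypothetical helper name, not a Docker command.
show_keepalive() {
  proc_dir="${1:-/proc/sys/net/ipv4}"
  for name in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes; do
    if [ -r "$proc_dir/$name" ]; then
      printf '%s = %s\n' "net.ipv4.$name" "$(cat "$proc_dir/$name")"
    else
      # The file is absent when the kernel does not expose it here.
      printf '%s is not available here\n' "net.ipv4.$name"
    fi
  done
}

show_keepalive
```

If the sysctls entries from the compose file were applied, the printed values should match them.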

For your second approach, you must use the privileged flag in your docker-compose.yaml, as you are modifying the kernel settings.

privileged: true

I was trying to solve a similar problem and came across this page. It is missing some important pieces of information, and thus motivates my response.

Setting the tcp_keepalive parameters within a container requires a kernel level of 4.13 on the base host. If you try this on an earlier kernel level, like the 3.10 kernel of CentOS 7.x, then these parameters will be missing from /proc and the command will fail in either case. In our case, we were running an older kernel and the way to accomplish this is to set the parameter in the base host only. You can do this with sysctl -w command, but that only works until the next reboot. If you hook into /etc/sysctl.conf or /etc/sysctl.d/, then it can be set automatically when the system comes up.

Please note that you’ll need to restart your containers after making this change on the base host.
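A rough sketch of the host-side persistence described above, for anyone who wants to copy it. The helper name write_keepalive_conf and the file name 99-tcp-keepalive.conf are my own assumptions, and the values 295/45/15 are simply the ones from the original question, so adjust them to your setup.

```shell
# Write a sysctl drop-in file with the three keepalive settings.
# $1: target file, $2: keepalive time (s), $3: probe interval (s), $4: probe count
# write_keepalive_conf is a hypothetical helper name.
write_keepalive_conf() {
  cat > "$1" <<EOF
net.ipv4.tcp_keepalive_time = $2
net.ipv4.tcp_keepalive_intvl = $3
net.ipv4.tcp_keepalive_probes = $4
EOF
}

# On the base host, as root (the file name is an assumption):
#   write_keepalive_conf /etc/sysctl.d/99-tcp-keepalive.conf 295 45 15
#   sysctl --system    # reload the sysctl config files without a reboot
```

Files under /etc/sysctl.d/ are read at boot, so the values survive a restart, unlike a one-off sysctl -w.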

I haven’t yet tried Fedora Core 27 or Ubuntu 17.10, both of which have the required kernel needed for this feature, but I suspect from the previous response that you’ll be able to set this on a per container basis with that kernel version.

We came across a very similar issue: a Spring Boot application worked fine when running as a service, but when deployed as a Docker container it became unresponsive after some time, especially when left idle, i.e. when no API calls were made for a while.
What solved it was tuning the TCP keepalive kernel settings on Linux. The internal swarm load balancer purges all idle connections after 900 seconds.
So if you set the keepalive time to something less than 900 seconds, the problem of unresponsiveness will be solved.

I am still having an issue where keep_alive is set to 600 on the host (it was 7200, but I changed it via sysctl.conf), yet any running containers still get a value of 7200. This causes containers running in a swarm to get connection timeouts, for the reasons previously explained here.

Note that I can't use --sysctl because these containers will be running in a swarm and that's currently unsupported: https://github.com/moby/moby/issues/25209

Any idea how I can make the container get 600 too?

$ uname -r
4.15.0-42-generic
$ docker --version
Docker version 18.06.1-ce, build e68fc7a

$ sysctl net.ipv4.tcp_keepalive_time
net.ipv4.tcp_keepalive_time = 600

$ docker run --rm ubuntu:latest sysctl net.ipv4.tcp_keepalive_time
net.ipv4.tcp_keepalive_time = 7200

$ docker run --rm -it ubuntu:latest bash
root@7ac6e30f911a:/# sysctl -w net.ipv4.tcp_keepalive_time=7200
sysctl: setting key "net.ipv4.tcp_keepalive_time": Read-only file system

Have you tried enabling the TCP_Keep_Alive parameters on your host, persisting them and then redeploying your container?

Thanks for your suggestion. I also asked this on Stack Overflow, and this answer explains it very well: https://stackoverflow.com/a/54564456/1758990

Sysctl support for Docker swarm services was added in Docker 19.03.0.
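
For anyone scripting this, a small sketch of how one might check the version before relying on it. The function name version_at_least_19_03 is made up for illustration, it assumes GNU sort (for the -V option), and the service name and image in the commented usage are placeholders.

```shell
# Succeeds (exit status 0) when the version string in $1 is >= 19.03.0.
# Relies on GNU sort's -V (version sort); the function name is hypothetical.
version_at_least_19_03() {
  lowest="$(printf '%s\n' "19.03.0" "$1" | sort -V | head -n1)"
  [ "$lowest" = "19.03.0" ]
}

# Example usage on a swarm manager (service name and image are placeholders):
#   if version_at_least_19_03 "$(docker version --format '{{.Server.Version}}')"; then
#     docker service create --name my-service \
#       --sysctl net.ipv4.tcp_keepalive_time=600 my-image:latest
#   fi
```

On older versions such as the 18.06.1-ce mentioned earlier in the thread, the check fails and setting the value per service this way is not available.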