Containers can't communicate with each other

Hi there.
I believe this topic has been raised more than once, but I couldn't find any working solution. So here it is…
I have deployed Zabbix in containers. I have a single host where 3 containers run: zabbix-server, zabbix-web-nginx-mysql and zabbix-agent. All 3 are defined in a single docker-compose.yaml:

version: '3.5'
networks:
 zbx_net:
  driver: bridge

services:
 zabbix-server:
  image: zabbix/zabbix-server-mysql:alpine-5.2-latest
  ports:
   - "10051:10051"
  volumes:
   - /etc/localtime:/etc/localtime:ro
   - /etc/timezone:/etc/timezone:ro
   - ./zbx_env/usr/lib/zabbix/alertscripts:/usr/lib/zabbix/alertscripts:ro
   - ./zbx_env/usr/lib/zabbix/externalscripts:/usr/lib/zabbix/externalscripts:ro
   - ./zbx_env/var/lib/zabbix/export:/var/lib/zabbix/export:rw
   - ./zbx_env/var/lib/zabbix/modules:/var/lib/zabbix/modules:ro
   - ./zbx_env/var/lib/zabbix/enc:/var/lib/zabbix/enc:ro
   - ./zbx_env/var/lib/zabbix/ssh_keys:/var/lib/zabbix/ssh_keys:ro
   - ./zbx_env/var/lib/zabbix/mibs:/var/lib/zabbix/mibs:ro
  ulimits:
   nproc: 65535
   nofile:
    soft: 20000
    hard: 40000
  env_file:
   - .env_db_mysql
   - .env_srv
  secrets:
   - MYSQL_USER
   - MYSQL_PASSWORD
  networks:
   zbx_net:
    aliases:
     - zabbix-server
     - zabbix-server-mysql
     - zabbix-server-alpine-mysql
     - zabbix-server-mysql-alpine
  stop_grace_period: 30s
  sysctls:
   - net.ipv4.ip_local_port_range=1024 65000
   - net.ipv4.conf.all.accept_redirects=0
   - net.ipv4.conf.all.secure_redirects=0
   - net.ipv4.conf.all.send_redirects=0
  labels:
   com.zabbix.description: "Zabbix server with MySQL database support"
   com.zabbix.company: "Zabbix LLC"
   com.zabbix.component: "zabbix-server"
   com.zabbix.dbtype: "mysql"
   com.zabbix.os: "alpine"

 zabbix-web-nginx-mysql:
  image: zabbix/zabbix-web-nginx-mysql:alpine-5.2-latest
  ports:
   - "8081:8080"
   - "8443:8443"
  volumes:
   - /etc/localtime:/etc/localtime:ro
   - /etc/timezone:/etc/timezone:ro
   - ./zbx_env/etc/ssl/nginx:/etc/ssl/nginx:ro
   - ./zbx_env/usr/share/zabbix/modules/:/usr/share/zabbix/modules/:ro
  env_file:
   - .env_db_mysql
   - .env_web
  secrets:
   - MYSQL_USER
   - MYSQL_PASSWORD
  depends_on:
   - zabbix-server
  healthcheck:
   test: ["CMD", "curl", "-f", "http://localhost:8080/"]
   interval: 10s
   timeout: 5s
   retries: 3
   start_period: 30s
  networks:
   zbx_net:
    aliases:
     - zabbix-web-nginx-mysql
     - zabbix-web-nginx-alpine-mysql
     - zabbix-web-nginx-mysql-alpine
  stop_grace_period: 10s
  sysctls:
   - net.core.somaxconn=65535
  labels:
   com.zabbix.description: "Zabbix frontend on Nginx web-server with MySQL database support"
   com.zabbix.company: "Zabbix LLC"
   com.zabbix.component: "zabbix-frontend"
   com.zabbix.webserver: "nginx"
   com.zabbix.dbtype: "mysql"
   com.zabbix.os: "alpine"

 zabbix-agent:
  image: zabbix/zabbix-agent:alpine-5.2-latest
  ports:
   - "10050:10050"
  volumes:
   - /etc/localtime:/etc/localtime:ro
   - /etc/timezone:/etc/timezone:ro
   - ./zbx_env/etc/zabbix/zabbix_agentd.d:/etc/zabbix/zabbix_agentd.d:ro
   - ./zbx_env/var/lib/zabbix/modules:/var/lib/zabbix/modules:ro
   - ./zbx_env/var/lib/zabbix/enc:/var/lib/zabbix/enc:ro
   - ./zbx_env/var/lib/zabbix/ssh_keys:/var/lib/zabbix/ssh_keys:ro
  env_file:
   - .env_agent
  privileged: true
  pid: "host"
  networks:
   zbx_net:
    aliases:
     - zabbix-agent
     - zabbix-agent-passive
     - zabbix-agent-alpine
  stop_grace_period: 5s
  labels:
   com.zabbix.description: "Zabbix agent"
   com.zabbix.company: "Zabbix LLC"
   com.zabbix.component: "zabbix-agentd"
   com.zabbix.os: "alpine"
secrets:
  MYSQL_USER:
    file: ./.MYSQL_USER
  MYSQL_PASSWORD:
    file: ./.MYSQL_PASSWORD

All 3 containers are in the same user-defined network, so according to the Docker Compose documentation that should be enough for them to communicate. However, if I try nc zabbix-server 10051 from the zabbix-agent container, the connection fails and I don't see any connection attempt at zabbix-server.
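
To make the failure concrete, this is the kind of test I mean (the container name comes from docker ps; whether nc supports -zv depends on the BusyBox build in the image):

```shell
# Hypothetical reproduction, run on the Docker host.
docker exec zabbix_zabbix-agent_1 nc -zv zabbix-server 10051   # by service name
docker exec zabbix_zabbix-agent_1 nc -zv 172.27.0.3 10051      # by container IP
```
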
I checked whether it could be caused by the firewall. I use nftables and created a rule allowing all traffic between 172.0.0.0/8 and 172.0.0.0/8. It helps partially, in that I can now connect to :10051 by IP (like 172.26.0.1:10051). However, the zabbix-agent log still shows a connection refused error, so I think it still doesn't work properly.

Anyway, my primary concern is the containers' ability to communicate with each other directly, as they are supposed to. I also tried to connect from zabbix-agent to zabbix-web-nginx-mysql:8081, and that didn't connect either. So my understanding is that inter-container communication doesn't work at all.

If I look at the zabbix_zbx_net network I can see all 3 containers are there:

[
    {
        "Name": "zabbix_zbx_net",
        "Id": "def0d254c1077d3874c74ebd6f93a9a9895683a2cc97ffe53a0fa2524649f790",
        "Created": "2020-12-22T10:02:59.55942359+01:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.27.0.0/16",
                    "Gateway": "172.27.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "95ed96686a7607b5c8aa22bc86f69916dcc117ae118859a5254d3e001df70de9": {
                "Name": "zabbix_zabbix-server_1",
                "EndpointID": "98e7c97886308716b5bf85bf5c8a4bb9655df9e3d79d34a05c8b9d6bca10ae15",
                "MacAddress": "02:42:ac:1b:00:03",
                "IPv4Address": "172.27.0.3/16",
                "IPv6Address": ""
            },
            "ab9216585795561226e608dc5f8a074de3d551f4e09f4caba48a111ec2d89c2b": {
                "Name": "zabbix_zabbix-web-nginx-mysql_1",
                "EndpointID": "0159042e1b64b7ac7f5ca3d675b7a855fa7d22aa42b4765877f4f09723f73307",
                "MacAddress": "02:42:ac:1b:00:04",
                "IPv4Address": "172.27.0.4/16",
                "IPv6Address": ""
            },
            "d17b6314e3f0ade5af7f7bf770fa72f82677e3d6fe82b62bea3ae05567ceb836": {
                "Name": "zabbix_zabbix-agent_1",
                "EndpointID": "8799fe8af03e945fb020d39e66912a20501c08d7b715b3f4aeed531f57392c65",
                "MacAddress": "02:42:ac:1b:00:02",
                "IPv4Address": "172.27.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "com.docker.compose.network": "zbx_net",
            "com.docker.compose.project": "zabbix",
            "com.docker.compose.version": "1.25.0"
        }
    }
]

So I don’t know what else can be checked.

Within a user-defined network, regardless of whether it is bridge or overlay, service discovery is done via a network-internal DNS service.

A Compose service will be registered with its service name, with the container name (if provided), and with all declared network aliases. Docker itself does not limit communication among services in a custom network.
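
You can verify what the embedded DNS returns from inside any of the containers; the container name below is an assumption based on the Compose output shown above:

```shell
# Hypothetical check of the embedded DNS (127.0.0.11 inside each container).
docker exec zabbix_zabbix-agent_1 nslookup zabbix-server         # service name
docker exec zabbix_zabbix-agent_1 nslookup zabbix-server-mysql   # declared alias
docker exec zabbix_zabbix-agent_1 cat /etc/resolv.conf           # should list 127.0.0.11
```
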

Did you try to disable your firewall completely for the sake of testing?

Also: unless you intend to deploy swarm stacks, there is no advantage in using a v3.x version of the compose file definition, but there are drawbacks, as many low-level settings are missing. Stick with the latest v2.x, which should still be 2.4.

Yes, I tried to flush the nftables and iptables rules completely. It's still the same.

If I change the version to 2.4 I get the following errors:

Unsupported config option for services.zabbix-server: ‘secrets’
Unsupported config option for services.zabbix-web-nginx-mysql: ‘secrets’

I use secrets to store MySQL credentials.

Ah okay, I wasn't aware secrets were not backported to 2.4. Still, many low-level features do not work with the 3.x versions (which are aimed at swarm stack deployments), as they were never implemented for swarm services. The only added security you get from secrets is that the data is stored encrypted in swarm's raft log (which you don't seem to use) AND its content won't be exposed in the container's environment variables. On a single-node deployment you can mimic the behavior by mounting a file into the container as read-only.
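
A rough sketch of what I mean, reusing your secret files (untested; on a non-swarm host the v3 secrets mapping does essentially the same bind mount):

```yaml
version: '2.4'
services:
 zabbix-server:
  image: zabbix/zabbix-server-mysql:alpine-5.2-latest
  volumes:
   # Read-only bind mounts mimicking 'secrets' on a single node:
   # the files appear at /run/secrets/<name> inside the container.
   - ./.MYSQL_USER:/run/secrets/MYSQL_USER:ro
   - ./.MYSQL_PASSWORD:/run/secrets/MYSQL_PASSWORD:ro
```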

Usually there is no need to manually alter any iptables rules.

Though, judging by the problems people have reported in this forum, it seems that Docker on recent versions of Ubuntu does not behave as expected when it comes to published ports or container network communication.

I indeed run Ubuntu 20.04 and Docker version 19.03.8. So should this be reported as a bug?

I don't recall exactly what the support period for releases from the year.month branches is.
Though 19.03.8 must be from sometime around November 2019? I doubt that there is still support for that version (though I might be mistaken). You might want to update your Docker to a more recent, still-supported version first.

Well, apt says:

docker.io is already the newest version (19.03.8-0ubuntu1.20.04.1).

This means you are using a redistributed package from the official Ubuntu repositories, and Ubuntu is responsible for its support. Though, one would think that Ubuntu would fix whatever needs to be fixed in their own version of Docker to make it play nice with their OS…

The support I mentioned earlier is for the Docker CE version from the Docker repositories.
You might want to switch to the official docker-ce packages from the Docker repositories: https://docs.docker.com/engine/install/ubuntu/

I've installed Docker from the official repo. Now I have Docker version 20.10.1, build 831ebea. The problem is still there.

Now that you are using a supported version and it still does not work, it is time to raise an issue on Docker's GitHub project.

I can only tell that it works like a charm on Ubuntu 18.04.5 LTS.

OK, I tried a completely simple setup:

$ docker network create test
$ docker run --network=test --detach nginx
$ docker run --network=test -it alpine
/ # nc 172.18.0.2 80          <----- I checked that this is the nginx container's IP

On my test machine with no firewall set up, it works fine. I checked, though: there are plenty of iptables rules, apparently created by Docker.
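
As a side note, the same reachability test can be done by container name rather than IP, since the embedded DNS registers it (the name "web" is just an example):

```shell
# Name-based variant of the test above, reusing the "test" network.
docker run --network=test --name web --detach nginx
docker run --network=test --rm alpine wget -q -O /dev/null http://web/ && echo reachable
```
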

On the machine with nftables the same setup didn't work. However, when I flushed the nft ruleset but kept iptables, it started working. So the problem is with the firewall after all. I haven't yet figured out, though, exactly which rule should allow inter-container traffic inside one network.
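
My best guess so far is something along these lines, though I haven't verified it (the br-* pattern is an assumption about how Docker names its user-defined bridges):

```
# Untested guess for /etc/nftables.conf; "br-*" assumes Docker's
# bridge naming (br-<network id>), docker0 being the default bridge.
table inet filter {
  chain forward {
    type filter hook forward priority 0; policy drop;
    iifname "docker0" oifname "docker0" accept
    iifname "br-*" oifname "br-*" accept
  }
}
```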

Now we are getting somewhere: I am not sure whether Docker supports nftables at all (see: https://github.com/moby/moby/issues/26824). On the other hand, there are issues complaining about nftables being slower; see: https://github.com/moby/moby/issues/39590

From what I remember, the recommended fix for Debian Buster with nftables was to switch back to iptables.

I wasn’t aware Ubuntu switched to nftables on 20.04 as well.

I have uninstalled nftables. Interestingly, the problem remained until I rebooted the server; then it was gone.
Thank you for the assistance.

I am glad you found a workaround!

That said: the glue called Docker will eventually need to support new components for its gluing duties…