Can't resolve Docker container names via Docker Compose

I’m running Docker Engine on GCE (Ubuntu 22.04.4 LTS), installed as per the docs.

I’m running Prometheus and Grafana as systemd services like this:

[Unit]
Description=Docker Compose Application Service for Grafana
Requires=docker.service
After=docker.service

[Service]
Restart=always
WorkingDirectory=/etc/docker-compose
ExecStart=/usr/bin/docker compose -f /etc/docker-compose/grafana-compose.yaml up
ExecStop=/usr/bin/docker compose -f /etc/docker-compose/grafana-compose.yaml down

[Install]
WantedBy=multi-user.target

The Docker Compose file for each service is fairly straightforward, and the network config is the same in both:

version: '3.8'

services:
  grafana:
    image: grafana/grafana:11.0.0
    container_name: grafana
    restart: unless-stopped
    volumes:
      - grafana_data:/var/lib/grafana
      # - ./grafana/provisioning/:/etc/grafana/provisioning/
    networks:
      - monitoring
    ports:
      - "3000:3000"

networks:
  monitoring:
    driver: bridge

volumes:
  grafana_data:

Running docker network ls shows this (rest omitted):

dc766f838a69   docker-compose_monitoring   bridge    local

docker network inspect docker-compose_monitoring shows both containers in the same network:

"Containers": {
            "057e29a6e532511effbe13b3fa57490d67750357ed970898462b5e52c31a9ed6": {
                "Name": "prometheus",
                "EndpointID": "f0b9faede154c89bb982da849a9af2f240d85c943a2a01bf8f09237d77693494",
                "MacAddress": "02:42:ac:16:00:03",
                "IPv4Address": "172.22.0.3/16",
                "IPv6Address": ""
            },
            "6cc9f1e845337fc38aa0b034887f531523b21a7266c3db6a631587c3c46a21a0": {
                "Name": "grafana",
                "EndpointID": "11fc5cc4bbb57b0d23ebc74b72beb5baf7698d3aaf57d2d406e6515e3f27808c",
                "MacAddress": "02:42:ac:16:00:02",
                "IPv4Address": "172.22.0.2/16",
                "IPv6Address": ""
            }
        },

When I exec into grafana I can ping localhost:9090 and it works. Trying ping prometheus:9090 and 172.22.0.3:9090 doesn’t work.

When I try to add http://prometheus:9090 as a data source in Grafana I get this error back:
Post "http://prometheus:9090/api/v1/query": dial tcp 172.23.0.3:9090: i/o timeout - There was an error returned querying the Prometheus API.

Trying http://localhost:9090 gives me:
Post "http://localhost:9090/api/v1/query": dial tcp [::1]:9090: connect: connection refused - There was an error returned querying the Prometheus API.

What am I doing wrong? I tried this locally on my Mac and it works perfectly. Is there some config unique to Ubuntu that I’m missing?

I also tried creating a custom network and attaching the containers to it, but got the same result.

This is from the status output of my docker service:

     CGroup: /system.slice/docker.service
             ├─1433907 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
             ├─1434210 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 9090 -container-ip 172.21.0.2 -container-port 9090
             ├─1434217 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 9090 -container-ip 172.21.0.2 -container-port 9090
             ├─1434251 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 3000 -container-ip 172.21.0.3 -container-port 3000
             └─1434261 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 3000 -container-ip 172.21.0.3 -container-port 3000

There are a few info warnings that say level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers

My resolv.conf, in case it’s relevant:

nameserver 127.0.0.53
options edns0 trust-ad
search c.xx-nonprod-xxx.internal google.internal

You are using two compose files? Usually you would create one external Docker network and use it in both, or create in one compose (attachable: true) and use in the other (external: true).

Yeah, I’m using two compose files. Sorry, I’m not too sure what you mean by:

Usually you would create one external Docker network and use it in both, or create in one compose (attachable: true ) and use in the other (external: true )
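For anyone else unsure what that suggestion looks like in practice, here is a minimal sketch of the second variant (create the network in one compose file, reference it as external in the other). The file name prometheus-compose.yaml, the untagged prom/prometheus image, and the fixed network name monitoring are assumptions for illustration and do not come from the original posts. Note that in this thread docker network inspect already showed both containers on the same network, so this illustrates the suggestion rather than the eventual fix.

```yaml
# --- prometheus-compose.yaml (hypothetical file name) ---
# This file creates the shared network.
services:
  prometheus:
    image: prom/prometheus      # assumed image; pin a tag in real use
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
    ports:
      - "9090:9090"

networks:
  monitoring:
    name: monitoring            # fixed name, so no project prefix is added
    driver: bridge
    attachable: true

# --- grafana-compose.yaml (only the networks section changes) ---
# This file joins the network created above instead of creating its own.
# networks:
#   monitoring:
#     name: monitoring
#     external: true
```

With this layout the Prometheus stack has to be started first, because Compose refuses to start a project whose external network does not exist yet.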

I found the culprit - it’s related to this and this. I’m not going to try to explain it, as I don’t fully understand iptables and nftables.

We disabled the nftables service completely just to prove it was the culprit, and everything worked. But that leaves no firewall on the box.