I’m relatively new to Docker Swarm. I have a swarm with one manager and two workers. Accessing the site directly through a worker's IP address works fine, but when I go through a load balancer service (provided by my cloud provider), the traffic seems to reach the workers but cannot connect to the port. This is my setup (public IPs have been replaced):
LOAD BALANCER :
Public IP: 130.0.0.130
Private IP: 172.16.1.1
WORKER1 :
Public IP: 130.0.0.140
Private IP (LB): 172.16.1.2
Private IP (Docker): 10.0.0.10
WORKER2 :
Public IP: 130.0.0.150
Private IP (LB): 172.16.1.3
Private IP (Docker): 10.0.0.11
MANAGER:
Public IP: 130.0.0.160
Private IP (Docker): 10.0.0.5
I have 2 stacks running:
1. HAProxy:
version: '3.2'
services:
  ha:
    image: haproxytech/haproxy-debian:2.7
    ports:
      - published: 80
        target: 80
        protocol: tcp
        mode: host
    volumes:
      - type: bind
        source: "/etc/haproxy/"
        target: "/etc/haproxy/"
        read_only: true
    networks:
      - hanet
    dns:
      - 127.0.0.1
    deploy:
      mode: global
      placement:
        constraints: [node.role==worker]
networks:
  hanet:
    driver: overlay
haproxy.cfg:
global
    log fd@2 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    stats socket /var/lib/haproxy/stats expose-fd listeners
    master-worker
resolvers docker
    nameserver dns1 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry 1s
    hold other 10s
    hold refused 10s
    hold nx 10s
    hold timeout 10s
    hold valid 10s
    hold obsolete 10s
defaults
    timeout connect 10s
    timeout client 30s
    timeout server 30s
    log global
    mode http
    option httplog
frontend fe_web
    bind *:80
    default_backend be_service
backend be_service
    balance roundrobin
    server-template wp- 2 site_wordpress:80 check resolvers docker init-addr libc,none
2. Wordpress
version: '3.2'
services:
  wordpress:
    image: wordpress:6.0.0-apache
    volumes:
      - type: bind
        source: "/var/www/"
        target: "/var/www/html/wp-content/"
    networks:
      - hanet
    deploy:
      mode: replicated
      replicas: 1
      endpoint_mode: dnsrr
      placement:
        constraints: [node.role==worker]
networks:
  hanet:
    external:
      name: haproxy_hanet
General checks:
- Nodes are up and running
- Services are working fine
- Even with only 1 instance of WordPress running, the site is reachable through either worker's public IP address (HAProxy is working without issues)
- Firewall in each worker is disabled (for testing)
- Ping works from any worker to the load balancer
- Docker networks:
NETWORK ID     NAME                  DRIVER    SCOPE
8j3slzxfgoqd   agent_agent_network   overlay   swarm
fb3a56f97fd5   bridge                bridge    local
04f5faf70051   docker_gwbridge       bridge    local
weijdpq2bpcj   haproxy_hanet         overlay   swarm
7462a6a83e27   host                  host      local
s6fx2m3hh5ba   ingress               overlay   swarm
7b7ed9c3c89c   none                  null      local
- Interfaces (workers):
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:59:db:13 brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet 130.0.0.140/24 brd 130.0.0.255 scope global ens3
valid_lft forever preferred_lft forever
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:43:c8:ac brd ff:ff:ff:ff:ff:ff
altname enp0s4
inet 10.0.0.10/24 brd 10.0.0.255 scope global ens4
valid_lft forever preferred_lft forever
4: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:3f:40:83 brd ff:ff:ff:ff:ff:ff
altname enp0s5
inet 172.16.1.2/24 brd 172.16.1.255 scope global ens5
valid_lft forever preferred_lft forever
5: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:d1:04:00:7e brd ff:ff:ff:ff:ff:ff
inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge
valid_lft forever preferred_lft forever
6: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:0b:3e:b2:0f brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
# and some other virtual interfaces ...
- Netstat shows the port is bound to all interfaces:
root@worker1: netstat -na | grep "LISTEN "
tcp 0 0 0.0.0.0:9001 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:34175 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN ***
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:24007 0.0.0.0:* LISTEN
tcp6 0 0 :::9001 :::* LISTEN
tcp6 0 0 :::111 :::* LISTEN
tcp6 0 0 :::80 :::* LISTEN
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 :::7946 :::* LISTEN
- Nmap reports `open` from the manager (same for worker2):
root@manager: nmap -p80 worker1
Starting Nmap 7.80 ( https://nmap.org ) at 2022-06-16 02:39 UTC
Nmap scan report for worker1 (127.0.1.1)
Host is up (0.00014s latency).
PORT STATE SERVICE
80/tcp open http
Nmap done: 1 IP address (1 host up) scanned in 0.10 seconds
- Nmap shows `filtered` when run from inside the worker node itself (except for 127.0.0.1):
root@worker1: nmap -p80 130.0.0.140
80/tcp filtered http
root@worker1: nmap -p80 10.0.0.10
80/tcp filtered http
root@worker1: nmap -p80 172.16.1.2
80/tcp filtered http
root@worker1: nmap -p80 172.18.0.1
80/tcp filtered http
root@worker1: nmap -p80 172.17.0.1
80/tcp filtered http
root@worker1: nmap -p80 127.0.0.1
80/tcp open http
- `tcpdump` on the workers shows the connection reaching them when accessing the load balancer public IP address (130.0.0.130):
root@worker1: tcpdump -i ens5 -vvv
02:25:59.405938 IP (tos 0x0, ttl 55, id 18362, offset 0, flags [DF], proto TCP (6), length 60)
static-222-222-111-111.b-fam.host.example.com.49208 > 172.16.1.3.http: Flags [S], cksum 0x7e2d (correct), seq 574746115, win 64240, options [mss 1414,sackOK,TS val 3130339086 ecr 0,nop,wscale 7], length 0
02:25:59.405938 IP (tos 0x0, ttl 55, id 29373, offset 0, flags [DF], proto TCP (6), length 60)
static-222-222-111-111.b-fam.host.example.com.49206 > worker1.http: Flags [S], cksum 0x4a6b (correct), seq 947981193, win 64240, options [mss 1414,sackOK,TS val 3130339086 ecr 0,nop,wscale 7], length 0
02:25:59.409034 IP (tos 0x0, ttl 55, id 18362, offset 0, flags [DF], proto TCP (6), length 60)
02:25:59.539267 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has worker1 tell 172.16.1.1, length 28
02:25:59.539290 ARP, Ethernet (len 6), IPv4 (len 4), Reply worker1 is-at fa:16:3e:3f:40:83 (oui Unknown), length 28
It seems to me that both workers were contacted on port 80 by the load balancer, but nothing followed the initial SYN (only the `S` flag is shown).
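To make the "only the `S` flag" observation easier to check on a longer capture, here is a minimal sketch that tallies the TCP flags in saved `tcpdump` text output. The function names `count_tcp_flags` and `handshake_answered` are made up for this post, and it assumes the usual `Flags [..]` notation, where a SYN-ACK is printed as `S.`.

```python
import re
from collections import Counter

# Matches the flag field tcpdump prints, e.g. "Flags [S]" or "Flags [S.]".
FLAGS_RE = re.compile(r"Flags \[([^\]]+)\]")


def count_tcp_flags(lines):
    """Count each TCP flag combination found in tcpdump text output."""
    counts = Counter()
    for line in lines:
        match = FLAGS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def handshake_answered(lines):
    """Return True if at least one SYN-ACK ("S.") appears in the capture."""
    return count_tcp_flags(lines)["S."] > 0
```

Feeding it the lines of the capture above (for example read from a file written with `tcpdump ... > capture.txt`) should report only `S` entries if the workers never answer the handshake.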
Questions:
- What do I need to do to be able to access the site using the load balancer public address?
- Is it normal that a stack service shows ‘filtered’ on every address except 127.0.0.1?
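In case it helps anyone reproduce the per-address check without nmap, this is a rough sketch of the same test as a plain TCP connect. The helper names `probe` and `probe_all` are hypothetical, and the open/closed/filtered labels only approximate nmap's: here a refused connection counts as closed and a timeout as filtered.

```python
import socket


def probe(host, port=80, timeout=2.0):
    """Classify a TCP port with a plain connect attempt."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The host answered with a reset: nothing is listening there.
        return "closed"
    except (socket.timeout, TimeoutError):
        # No answer at all, which is what nmap usually labels "filtered".
        return "filtered"
    except OSError as exc:
        # Anything else (no route, unreachable network, ...) is reported as-is.
        return "error: {}".format(exc)
    finally:
        sock.close()


def probe_all(addresses, port=80, timeout=2.0):
    """Probe every address and return a mapping of address to state."""
    return {address: probe(address, port, timeout) for address in addresses}
```

On worker1 I would call it as `probe_all(["130.0.0.140", "10.0.0.10", "172.16.1.2", "172.18.0.1", "172.17.0.1", "127.0.0.1"])`, using the addresses from the setup above.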
Thank you