DNS resolution behaves differently inside Kubernetes than on the host system

I have a very weird issue that prevents cert-manager from working inside Kubernetes on docker for desktop.

To give you some context:

I have a basic domain for local development:

@                  A    127.0.0.1
*                  A    127.0.0.1
_acme-challenge   TXT   xxxxxxxxxxxxxxxxxxxxx

When I run this DNS query on Windows, I get this result (the expected one):

dig cname _acme-challenge.domain.dev.

; <<>> DiG 9.18.1-1ubuntu1.2-Ubuntu <<>> cname _acme-challenge.domain.dev.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30141
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;_acme-challenge.domain.dev.     IN      CNAME

;; AUTHORITY SECTION:
domain.dev.              78      IN      SOA     desi.ns.cloudflare.com. dns.cloudflare.com. 2304297281 10000 2400 604800 3600

;; Query time: 0 msec
;; SERVER: 172.28.208.1#53(172.28.208.1) (UDP)
;; WHEN: Tue Mar 14 18:02:22 CET 2023
;; MSG SIZE  rcvd: 116

Now, if I run the same DNS query from a pod inside the k8s cluster, I get this result:

dig cname _acme-challenge.domain.dev.

; <<>> DiG 9.9.5-9+deb8u19-Debian <<>> cname _acme-challenge.domain.dev.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34937
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_acme-challenge.domain.dev.     IN      CNAME

;; ANSWER SECTION:
_acme-challenge.domain.dev. 5    IN      CNAME   _acme-challenge.domain.dev.

;; Query time: 3 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Tue Mar 14 17:01:

This CNAME record does not exist in my zone, and the answer sends cert-manager into an infinite loop.

I’m discussing this with some people at CoreDNS to find the root cause here: DNS response is different when I use my os dns and when I use CoreDNS · coredns/coredns · Discussion #5971 · GitHub

It seems to have something to do with how the cluster works with docker for desktop.

Any idea how to solve this so that cert-manager works properly when using Kubernetes with docker for desktop on Windows 11?

Pointing a domain name to a loopback IP address like 127.0.0.1 will not work, because that address refers to a different machine in each environment (host, virtual machine, container).

Since it seems to be the same misunderstanding, I am sharing a comment I wrote in another topic today.

Click on the title for the picture at the end.

Hello,

Thanks for your response, but in my case the problem is not accessing a service on 127.0.0.1.

It’s a DNS issue. If you look at both DNS answers, there is something wrong with the one returned inside Kubernetes:

;; ANSWER SECTION:
_acme-challenge.domain.dev. 5    IN      CNAME   _acme-challenge.domain.dev.

This CNAME record does not exist, and it points to itself. There should be no answer for this query.
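A self-referencing answer like this one can be spotted mechanically. The following is only a minimal sketch, assuming the `dig` output has been saved as plain text; `find_self_cnames` is a hypothetical helper name and is not part of cert-manager or CoreDNS.

```python
def find_self_cnames(dig_output):
    """Return owner names whose CNAME target is the owner itself."""
    loops = []
    for line in dig_output.splitlines():
        line = line.strip()
        # Skip blank lines and dig comment lines (they start with ';').
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        # Record lines look like: <owner> <ttl> <class> <type> <rdata>
        if len(fields) >= 5 and fields[3].upper() == "CNAME":
            owner, target = fields[0], fields[4]
            # DNS names are case-insensitive; ignore a trailing dot.
            if owner.rstrip(".").lower() == target.rstrip(".").lower():
                loops.append(owner)
    return loops


# Answer section copied from the pod output in this topic.
pod_answer = """
;; ANSWER SECTION:
_acme-challenge.domain.dev. 5    IN      CNAME   _acme-challenge.domain.dev.
"""
print(find_self_cnames(pod_answer))
```

Run against the output from the host, which has no answer section at all, the same function returns an empty list.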

Hello again,

Some more information on this issue: a friend of mine installed docker for desktop for Linux, and this DNS query works when run inside Kubernetes.

d:~# dig _acme-challenge.domain.dev. CNAME

; <<>> DiG 9.18.11 <<>> _acme-challenge.domain.dev. CNAME
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 37162
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 45fa723b42b1f39e (echoed)
;; QUESTION SECTION:
;_acme-challenge.domain.dev.     IN      CNAME

;; Query time: 57 msec
;; SERVER: 10.96.0.10#53(10.96.0.10) (UDP)
;; WHEN: Tue Mar 21 09:45:38 UTC 2023
;; MSG SIZE  rcvd: 66

I tried this on another Windows 11 machine and it failed.

I wiped Windows 11 on the second machine, installed Ubuntu 22.04 and docker for desktop, enabled the Kubernetes cluster, and the DNS query works normally.

The issue clearly seems to be related to Windows.

Sorry I have to reply quickly this time, but I will try to come back later.

Just because you mentioned it multiple times: it is “Docker Desktop” without the “for” :slight_smile: The “for” always goes between “Docker Desktop” and the operating system, like in the name of the topic category.

I didn’t have time to read everything on GitHub so it is possible I misunderstood the problem.
Name resolution works differently on Windows and Linux, which can indeed cause strange behaviors. However, you can set the DNS server in Kubernetes directly to an IP address that is reachable from the Docker Desktop virtual machine.

Since your pods will use CoreDNS to resolve domain names, you can configure CoreDNS to forward DNS queries to another nameserver when CoreDNS doesn’t know about the domain.

If you set a specific IP address as the external nameserver instead of relying on resolv.conf, that should work on any system.
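To illustrate that suggestion, here is a rough sketch that rewrites the `forward` line of a Corefile so it points at a fixed resolver instead of `/etc/resolv.conf`. The sample Corefile text and the `1.1.1.1` upstream are assumptions for illustration only, and `pin_forward_upstream` is a hypothetical helper. The real Corefile is typically stored in the `coredns` ConfigMap in the `kube-system` namespace and may look different on your cluster.

```python
import re


def pin_forward_upstream(corefile, upstream):
    """Replace 'forward . /etc/resolv.conf' with 'forward . <upstream>'."""
    # Match the forward directive for the root zone and keep its indentation.
    pattern = re.compile(
        r"^([ \t]*forward[ \t]+\.[ \t]+)/etc/resolv\.conf\b", re.MULTILINE
    )
    # A function is used as the replacement so 'upstream' is inserted literally.
    return pattern.sub(lambda m: m.group(1) + upstream, corefile)


# Assumed sample, loosely modelled on a default Kubernetes Corefile.
sample_corefile = """.:53 {
    errors
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
}
"""
print(pin_forward_upstream(sample_corefile, "1.1.1.1"))
```

The edited text would then have to be written back to the ConfigMap and the CoreDNS pods restarted for the change to take effect.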

Forgive me if I misunderstood again :slight_smile: I hope I could still give you some ideas until I have more time.

Thanks. After some digging, the issue is most probably coming from vpnkit on Windows.

I’ll have to open an issue there.

OK, here are the latest results:
Linux: it’s working
macOS: it’s working

The issue is only on Windows.
When the DNS query is made, the response captured with Wireshark on Windows is correct, but the response captured with tcpdump inside a pod is different, so it is being modified somewhere between Windows and the pod.

Do you have any idea what I could try next to isolate the issue?
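One way to make this comparison repeatable is to reduce each `dig` output to its status and section counts and then diff the two. This is only a sketch: `summarize` and `diff_summaries` are hypothetical helper names, and the two header snippets are copied from the host and pod outputs posted earlier in this topic.

```python
import re

STATUS_RE = re.compile(r"status:\s*(\w+)")
COUNTS_RE = re.compile(r"ANSWER:\s*(\d+),\s*AUTHORITY:\s*(\d+)")


def summarize(dig_output):
    """Extract the response status and the answer/authority counts."""
    status = STATUS_RE.search(dig_output)
    counts = COUNTS_RE.search(dig_output)
    return {
        "status": status.group(1) if status else None,
        "answer": int(counts.group(1)) if counts else None,
        "authority": int(counts.group(2)) if counts else None,
    }


def diff_summaries(host, pod):
    """Return only the fields that differ, as (host, pod) pairs."""
    return {key: (host[key], pod[key]) for key in host if host[key] != pod[key]}


host_header = """;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30141
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1"""
pod_header = """;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34937
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1"""

print(diff_summaries(summarize(host_header), summarize(pod_header)))
```

Both responses report NOERROR, so the only differences the sketch flags are the answer and authority counts, which matches what the two outputs show.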

You can report the issue on GitHub and share what you found out. I don’t know vpnkit well enough to tell you what it does on Windows and why, or why it is not compatible with Windows. If you don’t want to change the configuration of the CoreDNS pod as I suggested, you could have better luck on GitHub, where developers can see the issue.

You can also link this topic and the coredns issue there.