How to run your own DNS resolver (using DNS-over-HTTPS) in Kubernetes using cloudflared

11.05.2022 | Johannes Kastl in howto

Prerequisites

Obviously, you need a Kubernetes cluster. Whether this is a full-fledged one running at a cloud provider of your choice or a small k3s single-node cluster on your Raspberry Pi does not matter (much), the setup is pretty generic.

You need to have access to your cluster using kubectl and helm. The instructions below assume that your kubeconfig file or environment variable are pointing to the right cluster, so using plain kubectl or helm is enough.

Adding the helm repository for cloudflared

We will be using a community-maintained helm chart by Pascal Iske that is available on ArtifactHub. Feel free to peruse the instructions on its page.

For the impatient reader, here is the shortcut:

helm repo add pascaliske https://charts.pascaliske.dev
helm repo update

Creating the values.yaml for the installation via helm

It is advisable to use a values.yaml file for all installations via helm, rather than the --set foo=bar command-line flags. For one, the file can be kept locally, whereas your shell history is easily lost. Second, you can check the file into your version control system (think git).

To get an idea of which values are supported by the helm chart, use the helm show values pascaliske/cloudflared command, which prints the complete defaults from the helm chart.

Create a new file values.yaml and add the necessary options, maybe something like this:

env:
  - name: TZ
    value: Europe/Berlin
  - name: TUNNEL_DNS_UPSTREAM
    value: https://dns.digitale-gesellschaft.ch/dns-query
service:
  dns:
    enabled: true
    type: LoadBalancer
    port: 53
  metrics:
    enabled: true
    type: ClusterIP

We specified two sections, env and service. Let’s have a look at each of them.

The env section configures the behaviour of the cloudflared application itself inside the pod by specifying environment variables. In our case, we use the Europe/Berlin timezone and specify which upstream server we want to use. For this example I picked the server hosted by Digitale Gesellschaft in Switzerland, but you can also use the Mozilla server or any other DoH server. You can (and should) use more than one server, but for the sake of simplicity let’s start with one.
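When you do want more than one upstream, cloudflared parses TUNNEL_DNS_UPSTREAM as a list, so a comma-separated value should work (an assumption worth verifying against your cloudflared version; the Mozilla endpoint shown is mozilla.cloudflare-dns.com):

```yaml
env:
  - name: TZ
    value: Europe/Berlin
  # comma-separated list of DoH upstreams; cloudflared falls back to the
  # next entry if one fails (check the behaviour of your cloudflared version)
  - name: TUNNEL_DNS_UPSTREAM
    value: https://dns.digitale-gesellschaft.ch/dns-query,https://mozilla.cloudflare-dns.com/dns-query
```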

The next section is service, which configures the Kubernetes Service resources. We enable the dns service and specify that we want a service of type LoadBalancer. This is the easiest to set up, but it might be pricey if you are running in a public cloud, where each and every LoadBalancer costs money. In that case you might want to switch to a service of type ClusterIP; we will have a short look at how to make your resolver available that way below.

The port option for the dns service specifies which port the service listens on. The cloudflared application inside the container always listens on port 5053, but this option makes the service reachable from the outside on port 53 (on the LoadBalancer IP). To make our resolver usable with normal clients, port 53 is the only sensible option: while you can use different ports when testing with e.g. dig or drill, most operating systems do not allow specifying a port for DNS resolution in e.g. /etc/resolv.conf.
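For illustration, the UDP Service the chart renders from these values should look roughly like the following sketch (the selector labels are an assumption based on common chart conventions, check with kubectl get svc cloudflared-dns-udp -o yaml):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cloudflared-dns-udp
  namespace: cloudflared
spec:
  type: LoadBalancer
  ports:
    - name: dns-udp
      port: 53          # external port on the LoadBalancer IP
      targetPort: 5053  # fixed listen port of cloudflared inside the pod
      protocol: UDP
  selector:
    app.kubernetes.io/name: cloudflared   # assumption: check the chart's labels
```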

We’ll skip the metrics part for now and have a closer look at it later on.

Installing cloudflared

With the values.yaml file at hand, we can install cloudflared into our cluster, in its own namespace called cloudflared:

helm install cloudflared pascaliske/cloudflared -n cloudflared --create-namespace -f values.yaml

After that, you should see one pod running and three services being created:

kubectl get pods,svc
NAME                              READY   STATUS    RESTARTS       AGE
pod/cloudflared-97874dd45-vx2w9   1/1     Running   0              10m

NAME                          TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)           AGE
service/cloudflared-dns-udp   LoadBalancer   10.82.242.51   192.168.99.166   53:30842/UDP      10m
service/cloudflared-dns-tcp   LoadBalancer   10.82.150.97   192.168.99.166   53:30979/TCP      10m
service/cloudflared-metrics   ClusterIP      10.82.8.11     <none>           49312/TCP         10m

Wait, three services? We only specified two services above (dns and metrics), so where is the third one coming from? Apparently we have one service each for DNS over UDP and DNS over TCP. Why can’t we combine them into one service?

We need to take a little detour into Kubernetes services of type LoadBalancer here. For historical reasons, a single Service of type LoadBalancer cannot mix TCP and UDP ports in many cloud environments (mixed-protocol LoadBalancer Services only became possible in recent Kubernetes versions, and the cloud provider has to support it). As DNS unfortunately uses both UDP and TCP and switches between the two if needed (responses bigger than 512 bytes get truncated and the client retries over TCP), we need two separate services.

If you are running your Kubernetes cluster on bare metal (a Raspberry Pi counts as bare metal in this case) and you are using MetalLB, you are in luck: you can work around this issue by annotating the services, telling MetalLB that they are allowed to share the same LoadBalancer IP (192.168.99.166 in the example). Modify the values.yaml like this:

[...]
service:
  dns:
    enabled: true
    type: LoadBalancer
    port: 53
    annotations:
      metallb.universe.tf/allow-shared-ip: "allow cloudflared services on the same IP"
[...]

The value for the annotation does not matter, as long as it is identical on the services that should be allowed to share the IP (and only those services).

You need to check your cloud provider’s documentation to find out whether sharing the LoadBalancer’s IP is possible. If not, using services of type ClusterIP and exposing them via e.g. Traefik might be the better approach, see below.

An alternative would be to force your clients to use TCP only, and expose only the TCP service. Please note that this is not yet possible with the current version of the chart.

Checking the resolver

Now that we have a running resolver, let’s check if it is working. You can use all of the usual tools like dig, drill or nslookup. For dig it would look like this:

$ dig +short +tcp @192.168.99.166 b1-systems.de
95.216.238.34
$ dig +short +notcp @192.168.99.166 b1-systems.de
95.216.238.34

As you can see, we get a valid reply when trying to resolve b1-systems.de using UDP or TCP.
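To actually use the resolver from a client, point it at the LoadBalancer IP as its nameserver, for example in /etc/resolv.conf. Note that no port can be specified here, which is exactly why the service has to listen on port 53:

```
# /etc/resolv.conf on a client machine
# (IP from the example above; use your LoadBalancer IP)
nameserver 192.168.99.166
```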

What’s up with the metrics?

In the output showing our Kubernetes services we noticed that there is a third service called cloudflared-metrics. We enabled this in our values.yaml and set the type to ClusterIP. But what does it do?

The cloudflared application can output metrics that can be scraped by e.g. Prometheus. Depending on the setup of your monitoring, you might need to adapt the settings we made in the values.yaml.

Enabling the service in the values.yaml already gives you a Kubernetes service resource. By using a type of ClusterIP we do not waste another LoadBalancer IP for this service. But this means that by default it is not reachable from outside the cluster.

As the metrics are provided over a simple HTTP endpoint you can put an Ingress in front of it if you want to scrape it from outside the cluster. In case you scrape from inside the cluster, e.g. using the kube-prometheus stack (or something like the cluster monitoring if you manage your clusters with Rancher), you do not need to expose it to the outside, so the settings we used are fine.
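If you scrape from inside the cluster with the Prometheus Operator (part of the kube-prometheus stack), a ServiceMonitor selecting the metrics service could look like this sketch. The label selector and port name are assumptions, check them against kubectl get svc cloudflared-metrics --show-labels and the service definition:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cloudflared
  namespace: cloudflared
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: cloudflared   # assumption: check the service's labels
  endpoints:
    - port: metrics                         # assumption: name of the metrics port
      interval: 30s
```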

How to expose your service without a LoadBalancer

It might generally be preferable not to use a LoadBalancer for each and every Kubernetes Service. In this case you can use Traefik as an Ingress Controller to make your resolver available. Please note that this approach has the same limitations regarding UDP and TCP, although here it is the Traefik services for TCP and UDP that run as Kubernetes services of type LoadBalancer.

In a previous article on running Blocky I described how to configure Traefik in detail, here is the short version:

You need to add the entrypoints for DNS to your Traefik installation, using high ports (so the Traefik pod can still run as non-root). Let’s call them dns-udp and dns-tcp.
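With the official Traefik helm chart, these entrypoints can be added via the ports section of Traefik’s own values.yaml, roughly like this (the port numbers are just an example, any free high port works; depending on your chart version, expose may be a plain boolean instead of a map):

```yaml
ports:
  dns-udp:
    port: 8053        # high port inside the pod, so Traefik can run as non-root
    protocol: UDP
    expose:
      default: true
    exposedPort: 53   # port on the LoadBalancer service
  dns-tcp:
    port: 9053
    protocol: TCP
    expose:
      default: true
    exposedPort: 53
```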

If you are using MetalLB, you can annotate your Traefik services to allow them to run on the same Loadbalancer IP, just like we did above with the cloudflared services.

You can then create resources of type IngressRouteTCP and IngressRouteUDP that forward all traffic they receive on port 53 to the cloudflared service on port 53 (which forwards it to targetPort 5053, where the cloudflared application inside the pod is listening). As an example, an IngressRouteUDP would look like this:

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRouteUDP
metadata:
  name: cloudflared-udp
  namespace: cloudflared
spec:
  entryPoints:
    - dns-udp
  routes:
  - services:
    - name: cloudflared-dns-udp
      port: 53
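The TCP counterpart looks almost identical; the main difference is that an IngressRouteTCP requires a match rule, and HostSNI(`*`) matches all (non-TLS) traffic:

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRouteTCP
metadata:
  name: cloudflared-tcp
  namespace: cloudflared
spec:
  entryPoints:
    - dns-tcp
  routes:
    - match: HostSNI(`*`)   # required for IngressRouteTCP; matches everything
      services:
        - name: cloudflared-dns-tcp
          port: 53
```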

Conclusion

As usual with Kubernetes, the actual application is easy to run if you have a well-built and well-maintained image. It is the surrounding bits and pieces that make it a little more challenging. But I hope I was able to show you how to start and set up your own cloudflared resolver in Kubernetes. Have fun!

Johannes Kastl
Johannes is a Linux trainer and consultant and has been with B1 Systems since 2017. His topics include configuration management (Ansible, Salt, Chef, Puppet), version control (git), Infrastructure as Code (Terraform) and automation (Jenkins) as well as testing (Inspec, anyone?). During the day he works as a sysadmin and fixes problems; at night he tries out new technologies like Kubernetes (openSUSE Kubic!), podman or transactional-updates.



Do you have comments or questions? Get in touch at blog%b1-systems.de