HAProxy certificate reloading

This post continues the discussion of dynamic SSL certificate reloading when a certificate is renewed. I already discussed Keycloak; this post is about HAProxy.

I use HAProxy as an ingress controller in my Docker Swarm, much as Traefik is used as an ingress controller for Kubernetes. I have both a Docker Swarm and a Kubernetes cluster, and although Kubernetes has some nice features, I prefer to do development in Swarm. My k8s environment is a cluster of k3OS VM instances running Rancher and Longhorn. I also have cert-manager configured, which seemed like a simple, no-effort solution for managing SSL certificates.

Cert-manager offers a few things that conventional solutions don’t: it can obtain new certificates automatically as well as renew existing ones. On Swarm I use Certbot for renewal, but I still have to request new certificates manually. Cert-manager works with the ingress controller to route ACME verification requests to itself automatically, so the process is totally transparent. You can do the same thing with HAProxy and Certbot. I use RFC 2136 dynamic updates against a DNS server to handle my ACME challenges instead of an HTTP-based challenge.
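For reference, a manual request through Certbot's RFC 2136 DNS plugin might look roughly like the sketch below. It assumes the certbot-dns-rfc2136 plugin is installed; the credentials path and the domain are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical placeholder: an INI file holding the TSIG key settings
# (dns_rfc2136_server, dns_rfc2136_name, dns_rfc2136_secret, ...).
CREDENTIALS=${CREDENTIALS:-/etc/letsencrypt/rfc2136.ini}

# Request a certificate for one domain using a DNS-01 challenge
# answered through RFC 2136 dynamic updates.
request_cert() {
    certbot certonly \
        --dns-rfc2136 \
        --dns-rfc2136-credentials "$CREDENTIALS" \
        -d "$1"
}

# Usage (example.com is a placeholder):
#   request_cert example.com
```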

While cert-manager is a nicely integrated solution, it falls down in one area of interest to me: dynamic reloading of certificates. Cert-manager can map certificates into your pods (containers) as secrets, but there is no mechanism to dynamically reload a certificate once it changes. The intent is that you restart the pods, but that brings us back to the issues I raised on container lifetimes.

HAProxy has a simple solution for managing certificate instances with zero downtime: it stores the SSL certificates in an in-memory transactional data store. When you want to replace an old certificate with a new certificate you simply need to create a transaction, load the assets, then commit the transaction to overwrite the old certificate.

There are a couple of choices that affect how you store certificates: you can use a combined certificate file that contains the full chain and private key, or you can keep the certificate assets in separate files, as certificate managers such as Certbot do. I initially stored certificates in separate certificate and key files, but later I combined them into a single file to make the automation simpler.
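As a rough sketch of the combining step, assuming Certbot-style fullchain.pem and privkey.pem inputs (the function name and paths are made up for illustration):

```shell
#!/bin/sh
# Concatenate a full chain and a private key into one PEM file.
# Usage: combine_pem FULLCHAIN PRIVKEY OUTPUT
combine_pem() (
    umask 077                       # the output contains the private key
    [ -r "$1" ] && [ -r "$2" ] || exit 1
    tmp="$3.tmp"
    # Drop empty lines: when the file is later sent to HAProxy as a
    # payload, an empty line would mark the end of that payload.
    cat "$1" "$2" | sed '/^[[:space:]]*$/d' > "$tmp" && mv "$tmp" "$3"
)

# Usage (hypothetical paths):
#   combine_pem live/example.com/fullchain.pem \
#               live/example.com/privkey.pem certs/example.com.pem
```

Writing to a temporary file and renaming it means anything watching the directory never sees a half-written certificate.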

The exact process for replacing an old certificate with a new one is rather simple:

echo -e "set ssl cert /path/to/cert.pem <<\n$(cat cert.pem)\n" | socat tcp-connect: -
echo "commit ssl cert /path/to/cert.pem" | socat tcp-connect: -
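The same two steps can be wrapped in a small helper. This is only a sketch: it assumes the HAProxy runtime API is reachable over TCP with admin level, and the host, port and function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical defaults: adjust to wherever the runtime API listens.
HAPROXY_HOST=${HAPROXY_HOST:-127.0.0.1}
HAPROXY_PORT=${HAPROXY_PORT:-9999}

# Print the "set ssl cert" command with the PEM file as its payload.
# $1 = certificate path as HAProxy knows it, $2 = local PEM file.
# The trailing empty line terminates the payload.
set_cert_payload() {
    printf 'set ssl cert %s <<\n%s\n\n' "$1" "$(cat "$2")"
}

# Open the transaction, then commit it, printing HAProxy's replies.
replace_cert() {
    set_cert_payload "$1" "$2" | socat - "TCP:$HAPROXY_HOST:$HAPROXY_PORT" &&
        printf 'commit ssl cert %s\n' "$1" | socat - "TCP:$HAPROXY_HOST:$HAPROXY_PORT"
}

# Usage (hypothetical paths):
#   replace_cert /etc/haproxy/certs/site.pem ./site.pem
```

Using printf instead of echo -e avoids differences between shells in how escape sequences are handled.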

The set and commit commands are taken from an example and are pretty easy to implement; however, in my environment there are some additional considerations. I have multiple servers that run HAProxy as an ingress controller; these servers listen on incoming IP addresses and act as proxies to containerized services running on the Swarm.

I run HAProxy as a Docker stack (service) and use the global deployment mode with node constraints. This means I have multiple instances of the container named haproxy, which results in a round-robin load balancing effect when you connect to the host named haproxy. Docker doesn’t do this round-robin via DNS; it does it at the IP address level. The service DNS name always returns the same virtual IP address, but successive connections alternate between the container instances.

The way to enumerate all running instances of a service is to query the DNS name tasks.<service>, which returns a list of hostnames and IP addresses that can be used to reach specific instances of the service (here, HAProxy).
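A minimal sketch of that lookup, assuming dig is available in the container and that the container is attached to the same overlay network as the service (the function name is hypothetical):

```shell
#!/bin/sh
# Print the IPv4 address of every running task of a Swarm service,
# one per line, sorted and without duplicates.
list_service_tasks() {
    dig +short "tasks.$1" |
        grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' |
        sort -u
}

# Usage:
#   list_service_tasks haproxy
```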

I have a single instance of notify_watcher that watches the SSL certificate directory on a local Docker volume; when the certificates change, it runs a script that connects to each HAProxy instance in the stack and installs the new certificates. I consider this a clean approach because there is a single monitoring program and it can manage one or more HAProxy instances without any fuss. There are no race conditions in the design because the SSL certificates are read from the node that the notify_watcher instance is running on; it doesn’t matter whether the certificates have propagated to every node’s local copy of the git repository holding the certificates.
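The script that the watcher runs could look something like the sketch below. It is not my actual script: the service name, port and directories are hypothetical placeholders, and it assumes combined PEM files plus a runtime API exposed over TCP on every instance.

```shell
#!/bin/sh
# All of these values are hypothetical placeholders.
SERVICE=${SERVICE:-haproxy}                   # Swarm service name
API_PORT=${API_PORT:-9999}                    # runtime API TCP port
LOCAL_DIR=${LOCAL_DIR:-/certs}                # directory the watcher monitors
REMOTE_DIR=${REMOTE_DIR:-/etc/haproxy/certs}  # same files as HAProxy sees them

# Install every combined PEM from LOCAL_DIR on every instance of SERVICE.
push_certs() {
    failed=0
    for ip in $(dig +short "tasks.$SERVICE"); do
        for pem in "$LOCAL_DIR"/*.pem; do
            [ -f "$pem" ] || continue
            name="$REMOTE_DIR/$(basename "$pem")"
            # socat's exit status only covers the connection; HAProxy
            # reports command errors in the reply text it prints.
            if printf 'set ssl cert %s <<\n%s\n\n' "$name" "$(cat "$pem")" |
                   socat - "TCP:$ip:$API_PORT" &&
               printf 'commit ssl cert %s\n' "$name" |
                   socat - "TCP:$ip:$API_PORT"
            then
                echo "updated $name on $ip"
            else
                echo "could not reach $ip for $name" >&2
                failed=1
            fi
        done
    done
    return "$failed"
}

# Usage: call push_certs from the watcher whenever the directory changes.
```

Because every instance is fed from the same local directory, all of them end up with the same certificate regardless of how far the repository has propagated.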

That brings me to the next part I’ll discuss: renewing and managing SSL certificates across a Docker Swarm using local git volume stores on each node.

By Phantom

Coder, sysadmin, maker, human
