
Introduction

Some weeks ago, while I was upgrading my server operating system, I had to reboot the machine to load the new kernel. Unfortunately, the machine actually wasn’t feeling well and decided not to reboot. After contacting the data center support, it appeared the chassis had a critical hardware failure which prevented it from booting again.

This event taught me several things :

  • a reboot is not an innocuous operation (in my case : IPMI failed very quickly after power-on, and the serial console showed nothing useful) ;

  • hardware issues may have nothing to do with the current runtime (no kernel warning had popped up recently) ;

  • backups are essential (you can’t imagine how much the backup I had run a few days before the upgrade reassured me during this period) ;

  • an SLA means something, including for IaaS (in my case : probably because the machine was a blade in a chassis shared by multiple servers, data center operators couldn’t open the chassis to pull the disk out without impacting other clients. So they did not, in accordance with the SLA).

On our “new” machine, I had to set up the Web server(s) again and, among other things, the TLS certificates.

Do I really need to introduce you to EFF’s Certbot, which you are very likely already using to obtain HTTPS certificates from Let’s Encrypt ? I guess not, ‘cause you wouldn’t be reading this blog post if you knew nothing about it.

So what is the link between your server crash and Certbot ?

Well, this time I decided not to grant administrator privileges to this piece of software, and we’ll see how we can achieve that.

Installation

The official setup procedure recommends deploying Certbot through Canonical’s snapd, but I tend to reject such approaches, especially when it’s about running interpreted code (as opposed to compiled C/C++ programs, which may require several libraries loaded at runtime, and that can “justify” [please note the quotes] shipping tons of BLOBs to ease deployment across heterogeneous systems).

As Certbot is still distributed through PyPI, we’ll go this way.

apt install -y python3-venv
adduser --system certbot --home /opt/certbot
su - certbot -s /bin/bash

# As certbot user :
python3 -m venv venv && source venv/bin/activate
pip3 install -U pip wheel
pip3 install certbot

Asking for a certificate

So there are two ways to ask for an HTTPS certificate : either Certbot spawns an HTTP server and directly responds to the CA’s http-01 challenge, or it writes to an already “served” HTTP Web root.

The standalone http-01 server way

The idea here is to make Certbot bind an unprivileged local port (1024 or above), redirect new HTTP traffic to this port and let it directly respond to the http-01 challenge (as if it were the actual Web server behind your domain name/IP address).
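As a side note on why a port such as 8080 suits an unprivileged user : on Linux, binding a port below the value of the net.ipv4.ip_unprivileged_port_start sysctl (1024 by default) requires extra privileges. Below is a minimal sketch of that check ; the function name is_unprivileged_port is hypothetical and only serves the illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if port $1 can be bound without extra privileges.
# $2 (optional) overrides the threshold; by default it is read from the Linux
# sysctl net.ipv4.ip_unprivileged_port_start, falling back to 1024.
is_unprivileged_port() {
    local port="$1"
    local start="${2:-}"
    local sysctl_file="/proc/sys/net/ipv4/ip_unprivileged_port_start"

    if [ -z "$start" ]; then
        if [ -r "$sysctl_file" ]; then
            start="$(cat "$sysctl_file")"
        else
            start=1024
        fi
    fi

    [ "$port" -ge "$start" ] && [ "$port" -le 65535 ]
}

if is_unprivileged_port 8080; then
    echo "8080 can be bound by the certbot user"
fi
```

If your system lowers or raises that sysctl, the same function tells you whether the port you picked still works without root.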

For a certificate that was first requested this way (as certbot user) :

venv/bin/certbot \
    --work-dir=/opt/certbot \
    --logs-dir=/opt/certbot/logs \
    --config-dir=/opt/certbot/config \
    certonly \
        --standalone \
        --http-01-address 127.0.0.1 \
        --http-01-port 8080 \
        -d "your.domain.name"

… I propose the Bash renewal script below (mainly the detailed steps, to adapt to your setup) :

#!/usr/bin/env bash

set -euo pipefail

DOMAIN="your.domain.name"
WAN_IP_ADDRESS="your.wan.ip.address"
WAN_NETWORK_INTERFACE="eth0"
HTTP_01_LOCAL_ADDR="127.0.0.1"
HTTP_01_LOCAL_PORT=8080

# 1. Some firewall rules to DNAT and ACCEPT (new) HTTP traffic to local http-01 port
dnat_rule_handle="$(nft -ej insert rule ip nat prerouting ip daddr "$WAN_IP_ADDRESS" tcp dport http ct state new dnat to "${HTTP_01_LOCAL_ADDR}:${HTTP_01_LOCAL_PORT}" | grep -vE '^#' | jq -r '.nftables[0].insert.rule.handle')"
filter_rule_handle="$(nft -ej insert rule inet filter input ip daddr "$HTTP_01_LOCAL_ADDR" tcp dport "$HTTP_01_LOCAL_PORT" ct state new accept | grep -vE '^#' | jq -r '.nftables[0].insert.rule.handle')"

# 2. Allow DNAT to loopback
sysctl -q -w "net.ipv4.conf.${WAN_NETWORK_INTERFACE}.route_localnet=1"

# Renew the certificate using Certbot (`|| true` lets the script reach the cleanup steps below even if the renewal fails under `set -e`)
su - certbot -s /bin/bash -c \
    "/opt/certbot/venv/bin/certbot --work-dir=/opt/certbot --logs-dir=/opt/certbot/logs --config-dir=/opt/certbot/config renew -q" \
    || true

# 3. Disallow DNAT to loopback
sysctl -q -w "net.ipv4.conf.${WAN_NETWORK_INTERFACE}.route_localnet=0"

# 4. (situational) Install cryptographic materials where they need to be
cp "/opt/certbot/config/live/${DOMAIN}/fullchain.pem" /path/to/fullchain.pem
cp "/opt/certbot/config/live/${DOMAIN}/privkey.pem" /path/to/privkey.pem

# 5. (situational) Restart the service(s) to load the new certificate(s)
systemctl restart apache2.service

# 6. Delete our temporary firewall rules
nft delete rule inet filter input handle "$filter_rule_handle" || true
nft delete rule ip nat prerouting handle "$dnat_rule_handle" || true

For this script to work, I assume :

  • jq is available on the system (used to parse nft JSON output) ;

  • The firewall is managed through nftables ;

  • (nftables) tables ip nat and inet filter exist ;

  • (nftables) chains prerouting (ip nat) and input (inet filter) exist.
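These assumptions can be checked up front, before the script touches the firewall. Below is a minimal sketch ; the helper names require_cmds and require_nft_chain are hypothetical, and the nftables check needs to run as root.

```shell
#!/usr/bin/env bash

# Hypothetical helper: fails (and names the culprit) if any given command is missing.
require_cmds() {
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            return 1
        fi
    done
}

# Hypothetical helper: fails if the given nftables chain does not exist.
# Usage: require_nft_chain <family> <table> <chain>
require_nft_chain() {
    nft list chain "$1" "$2" "$3" >/dev/null 2>&1
}

# Intended use at the top of the renewal script (as root):
#   require_cmds nft jq sysctl
#   require_nft_chain ip nat prerouting
#   require_nft_chain inet filter input
```

Since the renewal script runs under set -e, a failing check stops it before any rule gets inserted.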

Note : http-01 server configuration is stored by Certbot so we don’t have to specify --http-01-* arguments during renewal.
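One weakness of the script above : because of set -e, a failure in steps 4 or 5 (cp, systemctl) would abort it before steps 3 and 6, leaving the temporary firewall rules in place. A possible hardening, sketched below under my own assumptions, is to register the cleanup with trap … EXIT. The helper name run_guarded is hypothetical, and a temporary file stands in for the firewall state so the example stays harmless.

```shell
#!/usr/bin/env bash

# Hypothetical helper: runs a command in a subshell and always executes the
# cleanup snippet when that subshell exits, whether the command succeeded or not.
# $1: cleanup snippet (a string of shell code); remaining args: command to run.
run_guarded() {
    local cleanup="$1"
    shift
    (
        trap "$cleanup" EXIT
        "$@"
    )
}

# Demonstration with a harmless stand-in for the firewall rules: a temp file
# plays the role of the temporary state that must not outlive the script.
state_file="$(mktemp)"
run_guarded "rm -f '$state_file'" false || echo "step failed, cleanup still ran"
```

In the renewal script, the cleanup snippet would hold the sysctl reset and the two nft delete rule commands, and the guarded command would be a function wrapping the renewal, copy and restart steps.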

The already “served” HTTP Web root

This is the method I prefer, as we don’t have to play with the firewall.

First, you will have to tweak your Web server configuration (i.e. the default VHOST) to :

  1. Disable HTTPS redirection for .well-known URIs (if any) ;

  2. Allow access to the .well-known/acme-challenge Web root (if restricted).

Below is, for instance, an Apache httpd configuration :

<VirtualHost *:80>
	ServerName your.domain.name

	# Certbot
	DocumentRoot /var/www
	<Directory /var/www/.well-known/acme-challenge>
		Require all granted
	</Directory>

	RewriteEngine on
	RewriteCond %{REQUEST_URI} !^/\.well-known
	RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

Now, you can run the following commands :

# Prepare the Web root
mkdir -p /var/www/.well-known/acme-challenge
chown www-data:www-data /var/www/.well-known
chown certbot:www-data /var/www/.well-known/acme-challenge

# Reload Apache httpd configuration
a2enmod rewrite
systemctl restart apache2.service

# Ask for a certificate
su - certbot -s /bin/bash -c \
    "/opt/certbot/venv/bin/certbot --work-dir=/opt/certbot --logs-dir=/opt/certbot/logs --config-dir=/opt/certbot/config certonly --webroot --webroot-path=/var/www -d 'your.domain.name'"

A typical renewal procedure would then be :

su - certbot -s /bin/bash -c \
    "/opt/certbot/venv/bin/certbot --work-dir=/opt/certbot --logs-dir=/opt/certbot/logs --config-dir=/opt/certbot/config renew --deploy-hook 'touch /opt/certbot/certs_renewed' -q"

if [ -f /opt/certbot/certs_renewed ]; then
	systemctl restart apache2.service
	rm -f /opt/certbot/certs_renewed
fi

Note : the Web root path configuration is stored by Certbot so we don’t have to specify the --webroot-path argument during renewal.

The trick with --deploy-hook is required as Certbot exits with status code 0 on “success” (i.e. whether zero, one or multiple certificates got renewed). @iquito’s workaround is thus needed here if we want to avoid restarting the Web server unconditionally.
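A deploy hook can do more than touch a flag file : Certbot exports RENEWED_LINEAGE (the live directory of the certificate that just got renewed) and RENEWED_DOMAINS to it. Below is a minimal sketch of a hook body that copies the renewed materials elsewhere ; the function name install_renewed_cert and the destination directory are hypothetical, and I assume the destination is writable by the unprivileged certbot user.

```shell
#!/usr/bin/env bash

# Hypothetical deploy-hook body. Certbot sets RENEWED_LINEAGE to the "live"
# directory of the certificate that was just renewed.
# $1: destination directory (an assumption: it must be writable by the
#     unprivileged certbot user).
install_renewed_cert() {
    local dest="$1"
    local lineage="${RENEWED_LINEAGE:?RENEWED_LINEAGE is not set (not run as a deploy hook?)}"

    # -L dereferences symlinks: files under live/ are symlinks into archive/.
    cp -L "${lineage}/fullchain.pem" "${dest}/fullchain.pem"
    # A newly created key copy gets owner-only permissions thanks to the umask.
    (umask 077 && cp -L "${lineage}/privkey.pem" "${dest}/privkey.pem")
}
```

The Web server restart still has to happen as root, so the flag file remains useful for that part.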

Conclusion

Do backup. Try your restoration procedure. Encrypt the world. Get rid of unnecessary privileges. KISS.