Build your own CDN - Part 1: Sync TLS certificates across all your PoPs with Caddy

As part of building a Content Delivery Network (CDN) for Gitea Pages, I’m documenting my process to share my experience with others; this is the first post in that series. The reason I’m building one from scratch, rather than using a pre-built solution like Amazon CloudFront, is the potential need for many unique TLS certificates: from past experience, I’ve learned that there are limits to how many certificates you can attach to a single CloudFront distribution.

The initial task I’ll tackle is synchronizing TLS certificates across all the Points of Presence (PoPs) in the CDN. If you’re unfamiliar with PoPs, they are geographically distributed servers located close to end users to improve content delivery performance. I’ve chosen Caddy for this task: it is a web server with built-in support for Let’s Encrypt that can automatically obtain and renew TLS certificates.

While there are alternative approaches using other web servers like nginx, where a central server obtains the TLS certificates and then distributes them to the other servers, Caddy offers a more decentralized approach. I can use a plugin I wrote called certmagic-s3, which lets each instance of Caddy share TLS certificates via an S3 bucket. This approach has the advantage that any of the PoPs can obtain and renew the TLS certificates, and the other PoPs will automatically receive the updated certificates.

The most challenging part of this approach is making sure the plugin is actually included in your Caddy build. You can compile it yourself with Caddy’s xcaddy build tool, or use Caddy’s download service, which serves pre-compiled binaries: select certmagic-s3 as a plugin to include, download the binary for your platform of choice, and you’ll have Caddy with the plugin already installed.
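
If you prefer to build it yourself instead, a build along these lines should work (a rough sketch, assuming Go and xcaddy are installed; the module path below is a placeholder for the plugin’s actual import path):

# build a Caddy binary with the certmagic-s3 plugin baked in
# (placeholder module path -- substitute the plugin's real import path)
xcaddy build --with github.com/techknowlogick/certmagic-s3

# confirm the plugin made it into the resulting binary
./caddy list-modules | grep s3

Either way, you end up with a Caddy binary that includes the plugin.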

Now, using that binary, you can create a Caddyfile to configure Caddy to use the plugin and obtain TLS certificates from Let’s Encrypt. Here’s an example Caddyfile:

{
    email webmaster@example.com      # The email associated with your Let's Encrypt account
    storage s3 {                     # Configure S3 as the certificate storage backend
        host minio.example.com       # Your S3-compatible storage host
        bucket certmagic-s3          # Bucket where certificates will be stored
        access_key ABC123            # Your S3 access key
        secret_key XYZ789            # Your S3 secret key
        prefix "byoc"                # Optional path prefix within the bucket
    }
}

site.example.com {                   # Domain to serve
    tls {
        on_demand                    # Obtain TLS certificates on the first request instead of at startup
    }
    respond "hello world"            # Sample response
}

With the placeholders in the configuration above filled in, you can start Caddy; it will obtain the TLS certificates from Let’s Encrypt and store them in the S3 bucket. If you then start another instance of Caddy with the same configuration, it will automatically pick up the certificates from the bucket and serve the site.
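
As a rough sketch of what that looks like in practice (the Caddyfile path and the MinIO client alias are assumptions for illustration):

# start Caddy on the first PoP with the Caddyfile above
caddy run --config /etc/caddy/Caddyfile

# once a certificate has been issued, it should appear in the bucket under the configured prefix,
# e.g. with the MinIO client and an alias named "minio" already set up:
mc ls minio/certmagic-s3/byoc/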

In upcoming posts I will describe how I set up Nomad to distribute the Caddy config to all the PoPs, and how I am using Caddy to serve custom dynamic domains for Gitea Pages. Stay tuned for more details on building out the CDN.

Using Nix with Gitea Actions

Carl Sagan once said, “If you wish to make an apple pie from scratch, you must first invent the universe.” In the world of software, creating a reproducible build environment is the universe you need to invent. This post will walk you through using Nix in tandem with Gitea Actions to make that universe a reality for your projects.

I am an enthusiastic user of Nix and am a maintainer of several packages. I appreciate the reproducibility of the binaries it offers across different systems and its rapid update cycle.

Gitea Actions is a CI/CD solution that can run your build and deployment tasks. Using Nix within Gitea Actions is as straightforward as adding a few lines to your workflow file. Here’s how:

name: nix

on:
  push:

jobs:
  lint-backend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install dependencies for Nix setup action
        run: |
          apt update -y
          apt install sudo -y                    
      - uses: cachix/install-nix-action@v22
        with:
          nix_path: nixpkgs=channel:nixos-unstable
      - name: Test running command with Nix
        run: nix-shell -p hello --run "hello"

This workflow will install Nix and then execute the hello command. Note that we need to install sudo, as it is a prerequisite for the cachix/install-nix-action and is not present in the default Gitea Actions runner image. If you’re using a custom runner that already has sudo installed, feel free to skip that step.

Regarding the Nix package channel, I prefer to live on the bleeding edge with nixpkgs=channel:nixos-unstable. However, you’re free to pin to a more stable channel if you wish. The cachix/install-nix-action Action does not have a channel configured by default, so you must specify one.
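
If you ever want to double-check which nixpkgs revision the channel resolved to inside a job, a quick sanity check like this can be added as an extra run step (a sketch; it relies on the NIX_PATH set up by the install action):

# print the version of the nixpkgs that <nixpkgs> currently points at
nix-instantiate --eval -E '(import <nixpkgs> {}).lib.version'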

If you haven’t explored Nix yet, I highly recommend you do so. It’s a powerful tool for creating consistent and reproducible build environments.

Watchtower... but for Kubernetes!?!

Watchtower is an excellent tool for keeping your containers up to date. It’s a process that runs on a schedule and checks for new versions of your containers, and if it finds one, it pulls the new image and recreates the container with the latest image. It’s built for Docker, and it works great for Docker. But what about Kubernetes?

Keel, a Kubernetes operator, does for Kubernetes what Watchtower does for Docker: it can automate Helm, DaemonSet, StatefulSet, and Deployment updates. It also has a friendly UI for seeing the status of the updates it is managing.

Installing Keel

The first step to utilizing Keel is installing it in your Kubernetes cluster. You can do this with Helm, the Kubernetes package manager, or by applying pre-templated manifests with kubectl, as shown here:

export KEEL_NAMESPACE=keel
export KEEL_ADMIN_USER=keel
export KEEL_ADMIN_PASS=keel
kubectl apply -f https://sunstone.dev/keel?namespace=$KEEL_NAMESPACE\&username=$KEEL_ADMIN_USER\&password=$KEEL_ADMIN_PASS\&tag=latest
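
Once the manifests are applied, it’s worth confirming Keel came up before moving on (a sketch; the deployment name keel is an assumption based on the default manifests):

# wait for the Keel deployment to become ready, then list its pods
kubectl -n $KEEL_NAMESPACE rollout status deployment/keel
kubectl -n $KEEL_NAMESPACE get pods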

Configuring your Deployments for Keel

Once you have installed Keel, you’ll need to configure your deployments to use it. This is as simple as adding a label (or annotation) to your Kubernetes deployment specification. Keel uses SemVer (Semantic Versioning), and its policies can be all, major, minor, or patch. For example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    keel.sh/policy: major
...

The above configuration means Keel will update the deployment for any new SemVer release of the container image, up to and including major version bumps.
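
If the deployment already exists in the cluster, you can also attach or change the policy without editing the manifest; for example (a sketch, assuming a deployment named my-app in the default namespace):

# add (or update) the Keel policy label on an existing deployment
kubectl label deployment my-app keel.sh/policy=major --overwrite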

Handling private images and rate limits

If you’re using private images, or Docker Hub with its strict rate limits, Keel needs to be able to authenticate with your registry. Keel reuses the same image pull secrets that Kubernetes already uses to pull your images, so in most cases no additional configuration is required.
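
In other words, as long as your workloads already reference a pull secret, Keel can reuse it. As a reminder of what that looks like (a sketch with placeholder values):

# create a registry pull secret in the workload's namespace
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=<user> \
  --docker-password=<password>

# then reference it from the workload, e.g.
# spec.template.spec.imagePullSecrets[0].name = regcred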

Keel UI

One of the unique features of Keel is its UI, which lets you see at a glance the status of your deployments and any updates it’s managing. You can access it via a Kubernetes Ingress or with kubectl port-forward:

kubectl -n keel port-forward service/keel 9300 

Wrapping Up

Keel is a powerful tool that brings the simplicity and automation of Watchtower to the Kubernetes ecosystem. Whether you have simple Deployments, use Helm, or have more complex DaemonSets or StatefulSets, Keel has you covered.

Remember, automating your image updates saves you time and ensures that you’re running the latest and potentially more secure version of your containers. As always, it’s essential to have robust rollback strategies and test pipelines in place, especially when using automatic updates.

Using Bunny.net to host static sites

Bunny.net (formerly BunnyCDN) is a low-cost, high-performance CDN provider that can be used to host static sites. This post walks through the steps to host a static site on it.

Creating a Storage Zone

The first step is to create a storage zone, which is where the static site’s files will be stored. To do this, log into the Bunny.net dashboard, click the Storage Zones tab, and follow the steps to create a new storage zone. The storage zone can be named anything, but it’s best to name it something that lets you identify it later and associate it with your site quickly. You can select the regions you would like your content replicated to; the more regions you have, the faster your site will be in those regions. Be careful, though: the more regions you select, the more it will cost you.

Now that you have a storage zone, navigate to its FTP credentials page and keep the credentials ready for later.

Creating a Pull Zone

The next step is to create a pull zone. The pull zone is what will be used to serve the static site. To do this, log into the Bunny.net dashboard, click the Pull Zones tab, and follow the steps to create a new pull zone. As with the storage zone, you should name it something memorable. You can also select the regions from which the CDN serves your data. The pull zone must be configured to use the storage zone you created earlier as the “origin”. You can also enable a custom domain for your site. If you do, you will need to add a CNAME record to your DNS provider that points to the Bunny.net pull zone and add it to the pull zone in the Bunny.net dashboard.
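
After adding the CNAME, you can verify it resolves to your pull zone before pointing anything important at it (a sketch; www.example.com and myzone.b-cdn.net are placeholders):

# the answer should be your pull zone's *.b-cdn.net hostname
dig +short CNAME www.example.com
# expected output (placeholder): myzone.b-cdn.net.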

Uploading the Site

Now that you have a storage zone and a pull zone, you can upload your site to the storage zone using the FTP credentials from earlier. Here is an example of how to do this using Gitea Actions, but you can use any CI system you like.

# .gitea/workflows/hugo-build.yml
name: Build and Deploy to BunnyCDN

on:
  push:
    branches:
      - main

jobs:
  bunnycdn:
    name: bunnycdn-publish
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          submodules: true  # Fetch Hugo themes (true OR recursive)
          fetch-depth: 0    # Fetch all history for .GitInfo and .Lastmod

      - name: Setup Hugo
        uses: https://github.com/peaceiris/actions-hugo@v2
        with:
          hugo-version: '0.111.3'
          extended: true

      - name: Build
        run: hugo --minify

      - name: Deploy to BunnyCDN
        run: |
          apt-get update
          apt-get install -y lftp
          lftp -e "
            set ftp:ssl-allow true;
            set ftp:ssl-protect-data true;
            set ssl:verify-certificate no;
            open ${{ secrets.BUNNYCDN_FTP_HOST }};
            user ${{ secrets.BUNNYCDN_FTP_USER }} ${{ secrets.BUNNYCDN_FTP_PASSWORD }};
            mirror -R -v public/ .;
            bye;
          "          

You’ll need to set the BUNNYCDN_FTP_HOST and BUNNYCDN_FTP_USER secrets; they are the FTP host and username from the storage zone’s FTP credentials page. BUNNYCDN_FTP_PASSWORD is the password from the same page. The public/ directory is the directory Hugo builds the site into, and the . target in the mirror command is the root of the storage zone. The -R flag tells lftp to mirror in reverse (upload the local directory to the remote), and the -v flag makes it verbose.
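
If you want to test the upload locally before wiring it into CI, the same mirror command works from a shell, and lftp’s --dry-run flag shows what would be transferred without uploading anything (a sketch with placeholder credentials):

# preview the upload without actually transferring any files
lftp -u "<storage_zone_user>,<storage_zone_password>" \
  -e "set ftp:ssl-allow true; mirror -R --dry-run -v public/ .; bye" \
  <storage_zone_host>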

Conclusion

That’s it! You should now have a static site hosted on Bunny.net. You can now use the pull zone’s URL to access your site. If you enabled a custom domain, you can use that instead. You can also further configure the pull zone to enable caching, compression, and other features such as auto-https.

Note: Bunny.net is rebuilding its dashboard, so the steps in this post may not match the current dashboard exactly, although the process should remain largely the same.

Secure SSH Access with SSH Certificates Managed by HashiCorp's Vault

Warning: This post describes a non-production setup of Vault. It is not hardened with appropriate security measures, so do not use this setup in production; treat it as a learning exercise for SSH CAs and Vault.

SSH certificates are an effective way to secure SSH server access. They can restrict users and the commands they can run, making them especially valuable for managing access to multiple servers. By using SSH certificates, server fingerprint validation becomes unnecessary since the certificates are signed by a Certificate Authority (CA) with the CA’s public key installed on the server. Vault is an excellent tool for managing SSH certificates, offering functionalities like issuing and revoking certificates, managing SSH keys, and providing audit logs.

Install Vault

To quickly set up a development Vault server, use the official Docker image with the following command:

docker run --cap-add=IPC_LOCK -e 'VAULT_LOCAL_CONFIG={"storage": {"file": {"path": "/vault/file"}}, "listener": [{"tcp": { "address": "0.0.0.0:8200", "tls_disable": true}}], "default_lease_ttl": "168h", "max_lease_ttl": "720h", "ui": true}' -p 8200:8200 hashicorp/vault server
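
Because this runs a regular (non-dev) server backed by file storage, you still need to initialize and unseal Vault before you can log in. A minimal sketch, assuming Vault is reachable on localhost:

export VAULT_ADDR=http://127.0.0.1:8200

# initialize Vault; note the unseal key and initial root token it prints
vault operator init -key-shares=1 -key-threshold=1

# unseal with the key from the previous step; the root token is used to log in below
vault operator unseal <unseal_key>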

Configure SSH Certificate Authority

With Vault installed and running, configure it to issue SSH certificates using the SSH secrets engine.

First, generate an SSH key pair (private and public keys) to act as the SSH Certificate Authority (CA) for Vault using the ssh-keygen command:

ssh-keygen -t rsa -b 4096 -f ssh_ca_key -C "Vault SSH CA"

You will now have two files: the private key ssh_ca_key and the public key ssh_ca_key.pub. Vault will use the private key to sign the SSH certificates, while clients will use the public key to verify the SSH certificates.

Enable and configure the SSH secrets engine to use the generated public key as the CA:

  1. Log in to Vault with vault login <initial_root_token>.
  2. Enable the SSH secrets engine with vault secrets enable ssh.
  3. Configure the SSH secrets engine to use the generated keys:
vault write ssh/config/ca \
    private_key=@ssh_ca_key \
    public_key=@ssh_ca_key.pub

Create a role called ops-team to issue SSH certificates. This role allows any user with access to request an SSH certificate. The example below grants broad permissions, including any option for allowed_users and port forwarding. Be sure to restrict these permissions based on your use case.

vault write ssh/roles/ops-team \
    key_type=ca \
    ttl=2h \
    max_ttl=24h \
    allow_user_certificates=true \
    allowed_users="*" \
    default_extensions='permit-pty,permit-port-forwarding,permit-agent-forwarding'

Configure the remote server to accept the SSH certificates issued by Vault:

  1. Copy the CA public key to the remote server:
scp ssh_ca_key.pub <username>@<target_server_ip>:/tmp/ssh_ca_key.pub
  2. Add the public key to the OpenSSH configuration and restart the OpenSSH daemon:
echo "TrustedUserCAKeys /etc/ssh/user_ca.pub" | sudo tee -a /etc/ssh/sshd_config
sudo cp /tmp/ssh_ca_key.pub /etc/ssh/user_ca.pub
sudo systemctl restart sshd

Requesting SSH Certificates

To request an SSH certificate from Vault and use it to SSH into the remote server, follow these steps:

  1. Use the ops-team role to request the certificate and pass your local SSH key id_rsa.pub. Also, specify the username to use when connecting to the remote server:
vault write -field=signed_key ssh/sign/ops-team \
    public_key=@$HOME/.ssh/id_rsa.pub \
    valid_principals="<username>" > signed_id_rsa-cert.pub
  2. Use the signed_id_rsa-cert.pub file to SSH into the remote server (you can inspect the certificate first, as shown below):
ssh -i signed_id_rsa-cert.pub -i $HOME/.ssh/id_rsa <username>@<target_server_ip>
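
Inspecting the signed certificate is a quick way to confirm its principals, extensions, and validity window:

# show the certificate's key ID, principals, validity period, and extensions
ssh-keygen -Lf signed_id_rsa-cert.pub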

Requesting a signed certificate manually each time can be tedious. To simplify this process, create a script called vault-ssh.sh and make it executable with chmod +x vault-ssh.sh.

#!/bin/bash

# Configuration
export VAULT_ADDR="http://<vault_server_ip>:8200"  # exported so the vault CLI can reach the server
VAULT_ROLE="ops-team"
USERNAME="<username>"
PUBLIC_KEY_PATH="$HOME/.ssh/id_rsa.pub"
CERT_PATH="$HOME/.ssh/id_rsa-cert.pub"
CONFIG_FILE="path/to/vault-creds.conf"

# Read the Vault token from the configuration file
if [ -f "$CONFIG_FILE" ]; then
  source "$CONFIG_FILE"
else
  echo "Error: Vault configuration file not found"
  exit 1
fi

# Check if the VAULT_TOKEN variable is set
if [ -z "$VAULT_TOKEN" ]; then
  echo "Error: VAULT_TOKEN is not set in the configuration file"
  exit 1
fi
export VAULT_TOKEN  # export the token so the vault CLI can authenticate

# Generate a new SSH certificate
vault write -field=signed_key ssh/sign/$VAULT_ROLE \
  public_key=@$PUBLIC_KEY_PATH \
  valid_principals="$USERNAME" > $CERT_PATH

This script requires a vault-creds.conf file containing the Vault token:

VAULT_TOKEN=<vault_token>

To integrate the certificate generation process with your SSH config, use the ProxyCommand configuration option, which allows you to run a custom command (like the script) as a “proxy” for the actual SSH connection.

Add the following to your SSH config:

Host *
  IdentityFile ~/.ssh/id_rsa
  CertificateFile ~/.ssh/id_rsa-cert.pub
  ProxyCommand bash -c 'path/to/vault-ssh.sh && nc %h %p'

Keep in mind that this approach generates a new SSH certificate for every connection. Depending on the frequency of your connections and the TTL of your certificates, you might want to modify the vault-ssh.sh script to check the current certificate’s validity and generate a new one only if necessary.
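
As a rough sketch of that optimization (assuming GNU date, as found on most Linux distributions), something like this could go at the top of vault-ssh.sh, before the vault write call:

# skip requesting a new certificate if the current one is still valid for a while
if [ -f "$CERT_PATH" ]; then
  # the last field of ssh-keygen's "Valid:" line is the expiry timestamp, e.g. 2024-01-02T15:04:05
  expiry=$(ssh-keygen -Lf "$CERT_PATH" | awk '/Valid:/ {print $NF}')
  if [ -n "$expiry" ] && [ "$(date -d "$expiry" +%s)" -gt "$(date -d '+5 minutes' +%s)" ]; then
    exit 0  # certificate is still good; let ssh continue with the existing one
  fi
fi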

Conclusion

This post covered configuring Vault to issue SSH certificates, setting up a remote server to accept these certificates, and streamlining the process of requesting SSH certificates. Use the knowledge from this post to enhance your environment’s security. Remember that this post describes a non-production setup of Vault and should be used for learning purposes only.

Credits: The above post was written from my own knowledge and experience of using Vault. The instructions for the Docker configuration of Vault come from the official Vault Docker documentation.

Playing around with Gitea Actions on Fly.io

Fly.io is a “serverless” hosting platform usually used to host web services, but it can be used for more than that; it can also run long-running tasks. I wanted to try a new way to run the Gitea Actions runner, and Fly.io seemed like an interesting place to try it out.

Treat this as a proof of concept: I’m not sure it’s a good idea, but it was fun to try out.

To simplify things, I will run the runner in “host” mode, meaning that each job won’t be containerized but will run directly on the host. I made this decision up front to limit the time spent debugging Docker-in-Docker.

To get started, I created a new Fly.io app with the following configuration:

# fly.toml
app = "actions-on-fly"
primary_region = "ams"

[[mounts]]
  destination = "/data"
  source = "data"

I mounted a persistent volume at /data so that the runner’s registration (the .runner file) persists across restarts.
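
The volume itself has to exist before the first deploy. With the fly CLI that is a single command (a sketch; the app name and region match the fly.toml above, the size is a placeholder):

# create a 1 GB volume named "data" in the same region as the app
fly volumes create data --app actions-on-fly --region ams --size 1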

Since there are no prebuilt Docker images for the runner (as of the time of publishing), I created one and installed the runner in it. The Dockerfile is as follows:

# Dockerfile
FROM ghcr.io/catthehacker/ubuntu:act-latest
# the FROM image is based on ubuntu and has appropriate tools installed to run Gitea Actions

# install act_runner
RUN curl https://dl.gitea.com/act_runner/nightly/act_runner-nightly-linux-amd64 > /usr/local/bin/act_runner && \
    chmod +x /usr/local/bin/act_runner

# add start script
ADD start.sh /start.sh
RUN chmod +x /start.sh && mkdir -p /data
ENTRYPOINT ["/start.sh"]

When running the container, the startup logic will check if the runner is already registered, and if not it will register it. The registration token is passed in as an environment variable. The runner will then be started.

#!/bin/bash
# start.sh

# $ACTIONS_REGISTER_TOKEN is the registration token for the runner that is given by the Gitea runner settings page.

# set /data as the working dir
cd /data

# check if runner is already registered, and if not register it
if [ ! -f .runner ]; then
  # register the runner on gitea.com with the label fly-runner:host so jobs run directly on the host ("host" mode)
  act_runner register --no-interactive --instance "https://gitea.com" --labels "fly-runner:host" --token $ACTIONS_REGISTER_TOKEN
fi

# start runner
act_runner daemon

It really took only a handful of lines to get a runner up and running. The runner is now running on Fly.io and can be used to run Gitea Actions. The only issue I ran into while setting this up was that Fly.io will terminate apps that run out of memory, which was a problem because what I was testing used a lot of memory; I ended up increasing the memory limit, as shown below. Maybe a different hosting provider would handle OOMs differently, but I was pretty satisfied with the result. The blog you are reading right now is built using this runner.
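
For reference, the remaining pieces were one-liners with the fly CLI as well (a sketch; the token and memory size are placeholders):

# provide the registration token to the container as a secret environment variable
fly secrets set ACTIONS_REGISTER_TOKEN=<token> --app actions-on-fly

# build the Dockerfile and deploy the runner
fly deploy --app actions-on-fly

# bump the VM memory if jobs keep getting OOM-killed
fly scale memory 2048 --app actions-on-fly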