CHROMABITS

Eduardo Trujillo

For a few years now, I’ve been running a small Kubernetes cluster to host a few services, including this blog.

My setup has always had a strong focus on cost savings and efficient resource usage, so I generally avoided anything that didn’t offer spending limits or predictable costs. It is just a personal hobby project, after all.

After a few iterations of this setup on DigitalOcean, and after acquiring some dedicated hardware, I eventually moved to hosting the cluster at home.

Setting up and maintaining a Kubernetes cluster is not a trivial task and requires some level of planning. However, once everything is up and running, it’s generally a very pleasant setup to work with.

One part of the puzzle is storage. For clusters hosted on a large provider like AWS or Google Cloud, you generally have a few storage options available like EBS and S3. With my homelab setup, however, I had to look for options that I could run locally.

Rook is a project that I’ve been following and using for a while to accomplish this. It allows you to run a storage layer on your cluster and provides multiple interfaces: Block Storage, Object Storage (i.e. S3-like), and Distributed Filesystems. It does all this by deploying and managing a Ceph cluster for you.

I’ve primarily used it for Block Storage support, but more recently I’ve begun to use its Object Storage gateway (RGW), which allows you to consume storage using an S3-compatible API.

When deployed, the default setup provides you with buckets accessible over a path-based API (e.g. {service_endpoint}/{bucket_name}). However, if you have used an object storage service from a cloud provider, you’ve likely noticed that they generally provide DNS-based bucket access (e.g. {bucket_name}.{service_endpoint}). Similarly, a lot of S3 clients seem to expect this DNS-based API.
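
To make the difference between the two addressing styles concrete, here is a minimal shell sketch that builds both URL forms for the same object. The function names and the index.html key are hypothetical and only serve as illustration; the https scheme is an assumption as well.

```shell
# Path-style: the bucket is the first segment of the URL path.
# Arguments: endpoint, bucket, object key.
path_style_url() {
  printf 'https://%s/%s/%s\n' "$1" "$2" "$3"
}

# DNS-style (virtual-hosted): the bucket is a subdomain of the endpoint.
# Arguments: endpoint, bucket, object key.
dns_style_url() {
  printf 'https://%s.%s/%s\n' "$2" "$1" "$3"
}

path_style_url buckets.chromabits.com example1 index.html
dns_style_url buckets.chromabits.com example1 index.html
```

The only thing that changes is where the bucket name lives, which is why the rest of this setup is mostly about DNS, TLS, and host-based routing.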

From reading the documentation of Ceph’s RGW, it seemed that there is some level of support for this, including serving static websites. So I set out to explore what it would take to get this working with Rook.

I eventually got this working using Rook 1.2 and Ceph Nautilus. Below are my notes on the steps I took.

DNS

For the DNS records themselves, I started by picking out a name for my storage service, and created two DNS entries:

  • buckets.chromabits.com: Just points to one of my ingress nodes.
  • *.buckets.chromabits.com: A wildcard that also points to my ingress nodes.

Once set up, I verified that random subdomains resolved correctly to my cluster’s ingress endpoints.
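
A check like this can be scripted. The sketch below generates a random, almost certainly unused subdomain and tries to resolve it. The helper names are hypothetical, and the use of getent is an assumption about the machine you run it on (dig or nslookup would work just as well).

```shell
# Print a random subdomain of the given base domain, e.g.
# probe-1a2b3c4d.buckets.chromabits.com. The "probe-" prefix is arbitrary.
random_subdomain() {
  label="probe-$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s.%s\n' "$label" "$1"
}

# Succeeds if the host name resolves. Assumes getent is available (glibc
# systems); swap in dig or nslookup otherwise.
resolves() {
  getent hosts "$1" >/dev/null
}

# Usage (needs access to the DNS zone you are testing):
#   host="$(random_subdomain buckets.chromabits.com)"
#   if resolves "$host"; then echo "$host resolves"; else echo "$host does not resolve"; fi
```

If the wildcard record is in place, any generated name should resolve to the same addresses as the base record.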

Certificates

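In order to serve buckets using HTTPS, I needed a wildcard certificate. Fortunately, this is straightforward to set up using cert-manager and Let’s Encrypt. Note that Let’s Encrypt only issues wildcard certificates through the DNS-01 challenge, so the issuer needs a DNS solver configured.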

When creating the Certificate resource, I included the wildcard host in the list of dnsNames:

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: buckets-chromabits-com
  namespace: rook-ceph
spec:
  secretName: buckets-chromabits-com-tls
  issuerRef:
    name: letsencrypt-prod
  dnsNames:
    - buckets.chromabits.com
    - '*.buckets.chromabits.com'

Ingress

Next is routing requests to the RGW service. I had a preexisting ingress setup, so the main challenge was figuring out how to handle requests for a wildcard domain.

From some initial reading, it seemed that Kubernetes ingresses didn’t have a way to handle wildcard domains. However, after skimming through some issues on GitHub, I learned that it is possible to configure nginx-ingress to handle this case.

This is done through the nginx.ingress.kubernetes.io/server-alias annotation on the Ingress resource. I set it to '*.buckets.chromabits.com' and the ingress began handling requests for the subdomains.
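
For reference, here is a rough sketch of what such an Ingress could look like, wrapped in a small shell function so it can be piped to kubectl. Several details are assumptions rather than my exact configuration: the networking.k8s.io/v1beta1 API version (adjust it for your cluster), the resource name, port 80, and the rook-ceph-rgw-my-store service name, which follows Rook’s rook-ceph-rgw-{store name} pattern with a hypothetical object store called my-store.

```shell
# Print an Ingress manifest that routes the base host and, through the
# server-alias annotation, every subdomain to the RGW service.
# Argument: RGW service name (hypothetical default below).
render_ingress() {
  service="${1:-rook-ceph-rgw-my-store}"
  cat <<EOF
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: buckets
  namespace: rook-ceph
  annotations:
    nginx.ingress.kubernetes.io/server-alias: '*.buckets.chromabits.com'
spec:
  tls:
    - hosts:
        - buckets.chromabits.com
        - '*.buckets.chromabits.com'
      secretName: buckets-chromabits-com-tls
  rules:
    - host: buckets.chromabits.com
      http:
        paths:
          - path: /
            backend:
              serviceName: ${service}
              servicePort: 80
EOF
}

# Usage (assumes kubectl is configured for the cluster):
#   render_ingress rook-ceph-rgw-my-store | kubectl apply -f -
```

The tls section reuses the secret created by the Certificate above, so both the base host and the wildcard are served with the same certificate.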

Another option here would be to manually modify the ingress every time a new bucket is created, but that doesn’t scale well and is only feasible if you plan to have a small number of buckets.

Rook Configuration

The last step is to configure the RGW to handle requests from these domains.

The documentation mentions that a domain can be set in rgw dns name in the daemon’s configuration. However, this didn’t seem like a simple change to make through Rook, so I looked for alternatives.

I eventually learned that it is possible to specify one or more hostnames per zonegroup on the RGW, without having to mess with global settings. So, adding the hostnames I needed was just a matter of modifying the default zonegroup.

I deployed the Rook Toolbox container and used radosgw-admin zonegroup get default to get the configuration of the default zonegroup.

I stored the output in a JSON file and then edited the JSON object to add the new hostname to the hostnames key:

radosgw-admin zonegroup get default > default.json
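
The edit itself can be done by hand in any editor. If you prefer to script it, the following is a minimal sketch that assumes jq is available wherever you edit the file (it may not be present in the toolbox container). The function name is hypothetical.

```shell
# Print the zonegroup JSON with the given hostname added to "hostnames".
# Arguments: path to the zonegroup JSON file, hostname to add.
# "unique" keeps the operation idempotent (it also sorts the list).
add_zonegroup_hostname() {
  jq --arg h "$2" '.hostnames = (.hostnames + [$h] | unique)' "$1"
}

# Usage:
#   add_zonegroup_hostname default.json buckets.chromabits.com > default.new.json
#   mv default.new.json default.json
```

Only the base name is added here; the RGW treats the label in front of a configured hostname as the bucket name, which is what makes {bucket_name}.buckets.chromabits.com work.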

Once satisfied with the changes, I applied them and restarted the RGW:

radosgw-admin zonegroup set --infile default.json
radosgw-admin period update --commit

After the RGW came back up, buckets became reachable through their DNS-style names (e.g. example1.buckets.chromabits.com)!

Copyright © 2015-2021 - Eduardo Trujillo
Except where otherwise noted, content on this site is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.