For a few years now, I’ve been running a small Kubernetes cluster for managing a few services, including this blog.
My setup has generally had a strong focus on cost savings and efficient resource usage, so I generally avoided anything that didn’t have a way to set spending limits or a predictable cost. It is just a hobby project, after all.
After a few iterations of this setup on DigitalOcean and acquiring some dedicated hardware, I eventually moved to hosting the cluster at home.
Setting up and maintaining a Kubernetes cluster is not a trivial task and requires some level of planning. However, once everything is up and running, it’s generally a very pleasant setup to work with.
One part of the puzzle is storage. For clusters hosted on a large provider like AWS or Google Cloud, you generally have a few storage options available like EBS and S3. With my homelab setup, however, I had to look for options that I could run locally.
Rook is a project that I’ve been following and using for a while to accomplish this. It allows you to run a storage layer on your cluster and provides multiple interfaces: Block Storage, Object Storage (i.e. S3-like), and Distributed Filesystems. It does all this by deploying and managing a Ceph cluster for you.
I’ve primarily used it for Block Storage support, but more recently I’ve begun to use its Object Storage gateway (RGW), which allows you to consume storage using an S3-compatible API.
When deployed, the default setup provides you with buckets accessible over a path-based API. However, if you have used an object storage solution from a cloud provider, you’ve likely noticed that they generally provide DNS-based bucket access (e.g. {bucket_name}.{service_endpoint}). Similarly, a lot of S3 clients seem to expect this DNS-based API.
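To make the difference concrete, here is a small sketch using a hypothetical bucket and endpoint name:

```shell
bucket="example1"
endpoint="buckets.chromabits.com"

# Path-based access: the bucket name is part of the URL path.
echo "https://${endpoint}/${bucket}"

# DNS-based access: the bucket name is a subdomain of the endpoint.
echo "https://${bucket}.${endpoint}"
```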
From reading the documentation for Ceph’s RGW, it seemed that there is some level of support for this, including serving static websites. So I set out to explore what it would take to get this working with Rook.
I eventually got this working using Rook 1.2 and Ceph Nautilus. Below are my notes on some of the steps I took.
DNS
For the DNS records themselves, I started by picking out a name for my storage service and created two DNS entries:

- buckets.chromabits.com: Just points to one of my ingress nodes.
- *.buckets.chromabits.com: A wildcard that also points to my ingress nodes.
Once set up, I verified that visiting random subdomains resolved correctly to my cluster’s ingress endpoints.
Certificates
In order to serve buckets using HTTPS, I needed a wildcard certificate. Fortunately, this is trivial to set up using cert-manager and LetsEncrypt.
When creating the certificate CRD, I included the wildcard host in the list of dnsNames:
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: buckets-chromabits-com
  namespace: rook-ceph
spec:
  secretName: buckets-chromabits-com-tls
  issuerRef:
    name: letsencrypt-prod
  dnsNames:
    - buckets.chromabits.com
    - '*.buckets.chromabits.com'
Ingress
Next is routing requests to the RGW service. I had a preexisting ingress set up, so the main challenge was figuring out how to handle requests for a wildcard domain.
Upon some initial reading, it seemed that Kubernetes ingresses don’t have a way to handle wildcard domains. However, after skimming through some issues on GitHub, I learned that it is possible to configure nginx-ingress to handle this case.
This is done through the nginx.ingress.kubernetes.io/server-alias annotation on the Ingress resource. I set it to '*.buckets.chromabits.com' in the annotations, and the ingress began handling requests for the subdomains.
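For reference, a trimmed-down sketch of what such an Ingress could look like. The service name and port here are assumptions based on a typical Rook object store deployment (Rook names the RGW service rook-ceph-rgw-{store-name}), so adjust them to match your own setup:

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: rgw-buckets
  namespace: rook-ceph
  annotations:
    # Tell nginx-ingress to also answer for the wildcard subdomains.
    nginx.ingress.kubernetes.io/server-alias: '*.buckets.chromabits.com'
spec:
  tls:
    - hosts:
        - buckets.chromabits.com
      secretName: buckets-chromabits-com-tls
  rules:
    - host: buckets.chromabits.com
      http:
        paths:
          - backend:
              serviceName: rook-ceph-rgw-my-store
              servicePort: 80
```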
Another option here would be to manually modify the ingress every time a new bucket is created, but that doesn’t scale well and only seems feasible if you plan to have a small number of buckets.
Rook Configuration
The last step is to configure the RGW to handle requests from these domains. The documentation mentions that a domain can be set via rgw dns name in the daemon’s configuration. However, this didn’t seem like a simple change to implement using Rook, so I looked for alternatives.
I eventually learned that it is possible to specify one or more hostnames per zonegroup on the RGW, without having to mess with global settings. So, adding the hostnames I needed was just a matter of modifying the default zonegroup.

I deployed the Rook Toolbox container and used radosgw-admin zonegroup get default to get the configuration of the default zonegroup. I stored the output in a JSON file and modified the JSON object to include a new hostname under the hostnames key:
radosgw-admin zonegroup get default > default.json
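After editing, the relevant portion of default.json looked roughly like this (trimmed to the relevant keys; every other field in the object should be left exactly as the command produced it):

```json
{
  "name": "default",
  "hostnames": [
    "buckets.chromabits.com"
  ]
}
```

With the base domain listed here, the RGW treats the subdomain portion of a matching Host header (e.g. example1.buckets.chromabits.com) as the bucket name.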
Once satisfied with the changes, I applied them and restarted the RGW:
radosgw-admin zonegroup set --infile default.json
radosgw-admin period update --commit
After the RGW came back up, buckets began to resolve correctly via DNS (e.g. example1.buckets.chromabits.com)!
Phabulous is a server written in Go capable of receiving event notifications from Phabricator (a suite of developer tools) and forwarding them to a Slack community, while also providing additional functionality through a bot interface.
The project started while I was working at Seller Labs, where Phabricator was the repository hosting tool. We mainly wanted to have better integration with Slack, just like GitHub and Bitbucket had.
Over time, Seller Labs migrated to GitHub and other tools, so development on Phabulous slowed down a bit since I wasn’t using it on a daily basis anymore.
However, this does not mean the project is dead: I’ve quietly been finding some spare time to work on improving Phabulous, and it has received a few contributions through pull requests.
I recently landed a large refactor of the project which should make future contribution and extensions easier. I’ve reorganized how the code is structured to make better use of Go interfaces.
In a perfect world, I would have enough time to write an extensive test suite for the project, but given my limited time, I’ve only been able to cover certain simple parts of the project. The transition to interfaces has allowed me to improve the coverage of the project since dependencies can now be easily mocked.
Another side effect that came naturally from this transition was the increased modularity of the code. Want to implement a connector for a different chat protocol? Or do you want to add a new command? Just implement the interfaces.
While still technically in beta, I’m happy to say that Phabulous has reached v3.0.0. With this new release, you can expect the following new features:
- Experimental support for IRC: The bot is now able to connect and work over IRC networks. Functionality is almost on par with what is available on Slack.
- Modules: Commands and functionality are now split into modules. You can enable/disable them in the configuration file, as well as implement your own modules when forking the project.
- Improved integration between Slack and Phabricator: Phabricator added a new authentication provider that allows you to sign in with your Slack account. Phabulous makes use of this new integration with a new extension. This extension allows the bot to look up Slack account IDs over the Conduit API, which means the bot can properly mention users in the chat by using their Slack username rather than their Phabricator username.
- Summon improvements: The summon command can now expand project members if a project is assigned as a reviewer of a revision. Additionally, the lookup algorithm has been optimized to perform fewer requests against the Conduit API.
- Many other small fixes and improvements.
You can get the latest version of the bot by using Docker or by downloading the latest release on GitHub.
If you have a Linux laptop or desktop with a solid-state drive, and happen to have disk encryption enabled through a LUKS/LVM combo, getting TRIM support enabled isn’t a very straightforward process: it has to be enabled on every IO layer.
In their blog post, How to properly activate TRIM for your SSD on Linux: fstrim, lvm and dm-crypt, Carlos Lopez gives a brief introduction on what TRIM is, and explains why it is beneficial to enable it. The article also describes the steps needed to enable this functionality on each IO layer (dm-crypt, LVM, and the filesystem).
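To summarize, these are roughly the knobs involved at the two lower layers. The device names are placeholders, and option names may vary between distributions, so double-check against your own setup and the guide above:

```
# /etc/crypttab: append the discard option so dm-crypt passes TRIM through
cryptroot  UUID=PLACEHOLDER-UUID  none  luks,discard

# /etc/lvm/lvm.conf: let LVM issue discards when logical volumes are
# removed or shrunk
devices {
    issue_discards = 1
}
```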
I followed most of this guide for one of my own systems, and while I took their advice and avoided enabling the discard flag on the filesystem, I never set up a cron job to run the trim operation periodically. So I found myself manually executing fstrim every now and then.
This quickly became repetitive, so I began looking into automating it. The guide above has an example setup using cron; however, I never set up a cron daemon on my system, so I wondered if it was possible to achieve the same result using systemd.
After reading some documentation on systemd unit files, I learned that it is possible to set up timers for your service units, which effectively achieves the same result as a cron daemon.
Below I’m including an fstrim service and timer. The service mainly specifies which command to run, and the timer defines how often it should be executed.

Note that the service unit does not have a WantedBy option and its Type is oneshot. This means it won’t be automatically executed, and that it is intended to be a one-off command, not a daemon. The timer does have a WantedBy option, which will result in it being started at boot.

I can check the status of the timer by using systemctl list-timers, and also run the operation on demand by starting the service unit: systemctl start fstrim. The logs are stored in the journal, which can be queried with journalctl -u fstrim.
/etc/systemd/system/fstrim.service
This is the service file. Here you can customize how fstrim is invoked. I use the -a and -v options, which tell fstrim to run on all mounted filesystems that support discards and to print verbose output. Additionally, this assumes fstrim is installed at /sbin/fstrim.
[Unit]
Description=Run fstrim on all drives
[Service]
Type=oneshot
ExecStart=/sbin/fstrim -av
User=root
/etc/systemd/system/fstrim.timer
In this configuration, the fstrim command is executed by root 15 minutes after booting the machine, and weekly afterwards.
[Unit]
Description=Run fstrim on boot and weekly afterwards
[Timer]
OnBootSec=15min
OnUnitActiveSec=1w
[Install]
WantedBy=timers.target
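After creating both unit files, the timer still needs to be enabled once so that systemd actually starts it at boot. Something like the following should work (a sketch; requires root):

```shell
# Pick up the newly created unit files
systemctl daemon-reload

# Enable the timer at boot and start it immediately
systemctl enable --now fstrim.timer

# Verify that the timer is scheduled
systemctl list-timers fstrim.timer
```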