For a few years now, I’ve been running a small Kubernetes cluster for managing a few services, including this blog.
My setup has generally had a strong focus on cost savings and efficient resource usage, so I generally avoided anything that didn’t have a way to set spending limits or a predictable cost. It is just a hobby project, after all.
After a few iterations of this setup on DigitalOcean and acquiring some dedicated hardware, I eventually moved to hosting the cluster at home.
Setting up and maintaining a Kubernetes cluster is not a trivial task and requires some level of planning. However, once everything is up and running, it’s generally a very pleasant setup to work with.
One part of the puzzle is storage. For clusters hosted on a large provider like AWS or Google Cloud, you generally have a few storage options available like EBS and S3. With my homelab setup, however, I had to look for options that I could run locally.
Rook is a project that I’ve been following and using for a while to accomplish this. It allows you to run a storage layer on your cluster and provides multiple interfaces: Block Storage, Object Storage (i.e. S3-like), and Distributed Filesystems. It does all this by deploying and managing a Ceph cluster for you.
I’ve primarily used it for Block Storage support, but more recently I’ve begun to use its Object Storage gateway (RGW), which allows you to consume storage using an S3-compatible API.
When deployed, the default setup provides you with buckets accessible over a path-based API. However, if you have used an object storage solution from a cloud provider, you’ve likely noticed that they generally provide DNS-based bucket access (e.g. {bucket_name}.{service_endpoint}). Similarly, a lot of S3 clients seem to expect this DNS-based API.
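To make the difference concrete, here is a small sketch using a hypothetical bucket and endpoint name:

```shell
bucket="example1"
endpoint="buckets.chromabits.com"

# Path-based access: the bucket name is part of the URL path.
echo "https://${endpoint}/${bucket}"

# DNS-based access: the bucket name is a subdomain of the endpoint.
echo "https://${bucket}.${endpoint}"
```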
From reading the documentation for Ceph’s RGW, it seemed that there is some level of support for this, including serving static websites. So I set out to explore what it would take to get this working with Rook.
I eventually got this working using Rook 1.2 and Ceph Nautilus. Below are my notes on some of the steps I took.
DNS
For the DNS records themselves, I started by picking out a name for my storage service and created two DNS entries:

- buckets.chromabits.com: Just points to one of my ingress nodes.
- *.buckets.chromabits.com: A wildcard that also points to my ingress nodes.
Once set up, I verified that visiting random subdomains resolved correctly to my cluster’s ingress endpoints.
Certificates
In order to serve buckets using HTTPS, I needed a wildcard certificate. Fortunately, this is trivial to set up using cert-manager and LetsEncrypt.
When creating the certificate CRD, I included the wildcard host in the list of dnsNames:
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: buckets-chromabits-com
  namespace: rook-ceph
spec:
  secretName: buckets-chromabits-com-tls
  issuerRef:
    name: letsencrypt-prod
  dnsNames:
    - buckets.chromabits.com
    - '*.buckets.chromabits.com'
Ingress
Next is routing requests to the RGW service. I had a preexisting ingress set up, so the main challenge was figuring out how to handle requests for a wildcard domain.
Upon some initial reading, it seemed that Kubernetes ingresses don’t have a way to handle wildcard domains. However, after skimming through some issues on GitHub, I learned that it is possible to configure nginx-ingress to handle this case.
This is done through the nginx.ingress.kubernetes.io/server-alias annotation on the Ingress resource. I set it to '*.buckets.chromabits.com' in the annotations, and the ingress began handling requests for the subdomains.
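For reference, a trimmed-down sketch of what such an Ingress could look like. The service name and port here are assumptions based on a typical Rook object store deployment (Rook names the RGW service rook-ceph-rgw-{store-name}), so adjust them to match your own setup:

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: rgw-buckets
  namespace: rook-ceph
  annotations:
    # Tell nginx-ingress to also answer for the wildcard subdomains.
    nginx.ingress.kubernetes.io/server-alias: '*.buckets.chromabits.com'
spec:
  tls:
    - hosts:
        - buckets.chromabits.com
      secretName: buckets-chromabits-com-tls
  rules:
    - host: buckets.chromabits.com
      http:
        paths:
          - backend:
              serviceName: rook-ceph-rgw-my-store
              servicePort: 80
```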
Another option here would be to manually modify the ingress every time a new bucket is created, but that doesn’t scale well and only seems feasible if you plan to have a small number of buckets.
Rook Configuration
The last step is to configure the RGW to handle requests from these domains. The documentation mentions that a domain can be set via rgw dns name in the daemon’s configuration. However, this didn’t seem like a simple change to implement using Rook, so I looked for alternatives.
I eventually learned that it is possible to specify one or more hostnames per zonegroup on the RGW, without having to mess with global settings. So, adding the hostnames I needed was just a matter of modifying the default zonegroup.

I deployed the Rook Toolbox container and used radosgw-admin zonegroup get default to get the configuration of the default zonegroup. I stored the output in a JSON file and modified the JSON object to include a new hostname under the hostnames key:
radosgw-admin zonegroup get default > default.json
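After editing, the relevant portion of default.json looked roughly like this (trimmed to the relevant keys; every other field in the object should be left exactly as the command produced it):

```json
{
  "name": "default",
  "hostnames": [
    "buckets.chromabits.com"
  ]
}
```

With the base domain listed here, the RGW treats the subdomain portion of a matching Host header (e.g. example1.buckets.chromabits.com) as the bucket name.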
Once satisfied with the changes, I applied them and restarted the RGW:
radosgw-admin zonegroup set --infile default.json
radosgw-admin period update --commit
After the RGW came back up, buckets began to resolve correctly via DNS (e.g. example1.buckets.chromabits.com)!
Phabulous is a server written in Go capable of receiving event notifications from Phabricator (a suite of developer tools) and forwarding them to a Slack community, while also providing additional functionality through a bot interface.
The project started while I was working at Seller Labs, where Phabricator was the repository hosting tool. We mainly wanted to have better integration with Slack, just like GitHub and Bitbucket had.
Over time, Seller Labs migrated to GitHub and other tools, so development on Phabulous slowed down a bit since I wasn’t using it on a daily basis anymore.
However, this does not mean the project is dead: I’ve quietly been finding some spare time to work on improving Phabulous, and it has received a few contributions through pull requests.
I recently landed a large refactor of the project which should make future contribution and extensions easier. I’ve reorganized how the code is structured to make better use of Go interfaces.
In a perfect world, I would have enough time to write an extensive test suite for the project, but given my limited time, I’ve only been able to cover certain simple parts of the project. The transition to interfaces has allowed me to improve the coverage of the project since dependencies can now be easily mocked.
Another side effect that came naturally from this transition was the increased modularity of the code. Want to implement a connector for a different chat protocol? Or do you want to add a new command? Just implement the interfaces.
While still technically in beta, I’m happy to say that Phabulous has reached v3.0.0. With this new release, you can expect the following new features:
- Experimental support for IRC: The bot is now able to connect and work over IRC networks. Functionality is almost on par with what is available on Slack.
- Modules: Commands and functionality are now split into modules. You can enable/disable them in the configuration file, as well as implement your own modules when forking the project.
- Improved integration between Slack and Phabricator: Phabricator added a new authentication provider that allows you to sign in with your Slack account. Phabulous makes use of this new integration with a new extension. This extension allows the bot to look up Slack account IDs over the Conduit API, which means the bot can properly mention users in the chat by using their Slack username rather than their Phabricator username.
- Summon improvements: The summon command can now expand project members if a project is assigned as a reviewer of a revision. Additionally, the lookup algorithm has been optimized to perform fewer requests against the Conduit API.
- Many other small fixes and improvements.
You can get the latest version of the bot by using Docker or by downloading the latest release on GitHub.
If you have a Linux laptop or desktop with a solid-state drive, and happen to have disk encryption enabled through a LUKS/LVM combo, getting TRIM support enabled isn’t a very straightforward process: it has to be enabled on every IO layer.
In their blog post, How to properly activate TRIM for your SSD on Linux: fstrim, lvm and dm-crypt, Carlos Lopez gives a brief introduction on what TRIM is, and explains why it is beneficial to enable it. The article also describes the steps needed to enable this functionality on each IO layer (dm-crypt, LVM, and the filesystem).
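To summarize, these are roughly the knobs involved at the two lower layers. The device names are placeholders, and option names may vary between distributions, so double-check against your own setup and the guide above:

```
# /etc/crypttab: append the discard option so dm-crypt passes TRIM through
cryptroot  UUID=PLACEHOLDER-UUID  none  luks,discard

# /etc/lvm/lvm.conf: let LVM issue discards when logical volumes are
# removed or shrunk
devices {
    issue_discards = 1
}
```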
I followed most of this guide for one of my own systems, and while I took their advice and avoided enabling the discard flag on the filesystem, I never set up a cron job to run the trim operation periodically. So I found myself manually executing fstrim every now and then.
This quickly became repetitive, so I began looking into automating it. The guide above has an example setup using cron; however, I never set up a cron daemon on my system, so I wondered if it was possible to achieve the same result using systemd.
After reading some documentation on systemd unit files, I learned that it is possible to set up timers for your service units, which effectively achieves the same result as a cron daemon.
Below I’m including an fstrim service and timer. The service mainly specifies which command to run, and the timer defines how often it should be executed.

Note that the service unit does not have a WantedBy option and its Type is oneshot. This means it won’t be automatically executed, and that it is intended to be a one-off command, not a daemon. The timer does have a WantedBy option, which will result in it being started at boot.

I can check the status of the timer by using systemctl list-timers, and also run the operation on demand by starting the service unit: systemctl start fstrim. The logs are stored in the journal, which can be queried with journalctl -u fstrim.
/etc/systemd/system/fstrim.service
This is the service file. Here you can customize how fstrim is invoked. I use the -a and -v options, which tell fstrim to run on all mounted filesystems that support discards and to print verbose output. Additionally, this assumes fstrim is installed at /sbin/fstrim.
[Unit]
Description=Run fstrim on all drives
[Service]
Type=oneshot
ExecStart=/sbin/fstrim -av
User=root
/etc/systemd/system/fstrim.timer
In this configuration, the fstrim command is executed by root 15 minutes after booting the machine, and weekly afterwards.
[Unit]
Description=Run fstrim on boot and weekly afterwards
[Timer]
OnBootSec=15min
OnUnitActiveSec=1w
[Install]
WantedBy=timers.target
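After creating both unit files, the timer still needs to be enabled once so that systemd actually starts it at boot. Something like the following should work (a sketch; requires root):

```shell
# Pick up the newly created unit files
systemctl daemon-reload

# Enable the timer at boot and start it immediately
systemctl enable --now fstrim.timer

# Verify that the timer is scheduled
systemctl list-timers fstrim.timer
```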