Eduardo Trujillo

At my current job, we are starting to adopt Haskell to write some of our backend APIs and, like other projects at the company, we are using Docker for deploying them to staging and production environments.

Working with Haskell required some rethinking of how we do certain parts of our workflow, but it also led to much-improved image sizes, with some extra room for improvement.

Unlike PHP or JavaScript, Haskell is a compiled language, meaning that some of the previous approaches we were using for building Docker images did not port over that well.

For dynamic languages, we followed roughly these steps to build our images:

  • Use a base image with packages for the software we need (Debian, Alpine).
  • Install server software, such as Nginx, Node.js, and PHP.
  • Install language tools and package managers.
  • Copy application sources.
  • Download dependencies.
  • Final optimizations and cleanup.
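
As a rough sketch (the base image, packages, and commands here are illustrative, not our exact setup), a hybrid Dockerfile for one of these dynamic-language projects might look like this:

FROM debian:jessie

# Server software, language tooling, and package managers.
RUN apt-get update \
  && apt-get install -y nginx php5-fpm php5-cli curl \
  && curl -sS https://getcomposer.org/installer \
     | php -- --install-dir=/usr/local/bin --filename=composer

# Application sources and dependencies.
COPY . /app
WORKDIR /app
RUN composer install --no-dev

# Final cleanup to keep the image lean.
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

CMD ["php5-fpm", "-F"]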

This is a hybrid approach, where both build and runtime dependencies live on the same Docker image. It is useful because image builders such as Quay.io can automatically build the Docker image for you on every commit, without the need for an additional step in the CI pipeline. It also has the slight advantage of leaving enough tooling inside the container for light debugging.

As a result, the image is not the smallest it could be, since it carries a lot of things that are not commonly used at runtime. However, the added convenience sort of outweighs the cost of a larger image in these cases.

For Haskell projects, though, things are a bit different, and not in a bad way:

The community has made an excellent build tool called Stack. Stack takes care of mostly everything related to setting up your project: installing GHC, pulling dependencies, building, testing, coverage reports, documentation. When paired with Nix, it can even pull non-Haskell dependencies for reproducible builds.

If we try to do a hybrid image approach like the one above using Stack, the Dockerfile mainly has to do the following:

  • Download and install Stack.
  • Copy application sources.
  • Install GHC (stack setup).
  • Compile project (stack build).
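
Translated into a hedged, illustrative Dockerfile (paths and the binary name are made up):

FROM debian:jessie

# Download and install Stack.
RUN apt-get update && apt-get install -y curl \
  && curl -sSL https://get.haskellstack.org/ | sh

# Copy application sources.
COPY . /app
WORKDIR /app

# Install GHC, then compile the project and all of its dependencies.
RUN stack setup
RUN stack build --copy-bins

CMD ["/root/.local/bin/myapp"]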

This works, but it is extremely slow, and the resulting images are huge (800+ MB!).

On every Docker build, Stack would have to download and install GHC, and then proceed to download and compile every dependency of the project, which tended to take a good 30 minutes on a Quay.io worker node.

When developing locally, you only have to go through this process every now and then because most of it is cached in directories such as ~/.stack and .stack-work.

Looking for faster build times and smaller images, I decided to experiment with splitting the build and runtime aspects of the project.

The build part was already set up, since we were using Travis CI for running unit tests and some integration tests. Compared to basic Docker image builders, Travis CI has the clear benefit of a cache that can be reused across builds without too much work. This cache allowed us to keep our built dependencies across builds, which reduced the build time to under 5 minutes.

To enable caching of Stack builds, you just need to add the work directories to the cache section of .travis.yml:

cache:
  directories:
    - "$HOME/.stack"
    - .stack-work

So, getting the runtime half working meant taking the resulting build from the previous steps and building a Docker container with just enough dependencies and data files to run the application.

FPComplete has a great article out there on how to create minimal Docker images for Haskell projects.

The main difficulty with this process is that Haskell programs are not built statically by default, meaning that you have to identify all the libraries the binary is linked against and include them in the final runtime image.
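
ldd makes it easy to see what a given binary needs. The path and the library list below are illustrative of a typical GHC-built executable, not our exact output:

$ ldd "$(stack path --local-install-root)/bin/myapp"
    libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
    ...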

In order to keep things simple, I decided to stick to using a base image, which we could use to pull in any dependencies we don’t have, like libcurl-dev.

I initially tried to use Alpine, since it's known for being one of the smallest images out there. However, getting a Haskell program running on it was not trivial, since it requires cross-compiling GHC.

So I settled on Debian, which is a larger image, but has almost everything we need out of the box.
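
The runtime Dockerfile then becomes very short. A sketch, assuming the binary was copied into the build context and that libgmp and libcurl are the only extra libraries needed:

FROM debian:jessie

# Shared libraries the binary links against, plus CA certificates
# for making HTTPS requests.
RUN apt-get update \
  && apt-get install -y libgmp10 libcurl3 ca-certificates \
  && apt-get clean && rm -rf /var/lib/apt/lists/*

# The binary built by Travis CI.
COPY myapp /usr/local/bin/myapp

CMD ["/usr/local/bin/myapp"]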

Building a Docker image on Travis CI is a fairly simple process. Pushing it to a registry and correctly tagging it was the annoying part. After a couple of hours of trial and error, I made a small shell script for authenticating with the registry and pushing a tagged image matching the current git tag and branch.

This script is called in the after_success step of the Travis CI build:

#!/bin/bash
set -euo pipefail
IFS=$'\n\t'

docker build -t myapp .

# If this is not a pull request, update the branch's docker tag.
if [ "$TRAVIS_PULL_REQUEST" = 'false' ]; then
  docker tag myapp quay.io/myorg/myapp:${TRAVIS_BRANCH/\//-} \
    && docker push quay.io/myorg/myapp:${TRAVIS_BRANCH/\//-};

  # If this commit has a tag, use it on the registry too.
  if [ -n "$TRAVIS_TAG" ]; then
    docker tag myapp quay.io/myorg/myapp:${TRAVIS_TAG} \
      && docker push quay.io/myorg/myapp:${TRAVIS_TAG};
  fi
fi

As a result, we now have Docker images for our Haskell projects that are about 80 MB, which is not terrible, but can definitely be improved on.

The next steps for me are investigating how to make our images even smaller by using a smaller base image, and automating the deployment of development and staging environments by having Travis CI notify a scheduler that a new image has been built.

I’m including some of my scripts and an example Dockerfile on a GitHub Gist for reference. You will most likely have to modify them to meet your needs.


Eduardo Trujillo

Coming from OS X, I've grown accustomed to Tunnelblick, which is one of the best OpenVPN clients for the platform. It is not perfect, and there are many commercial offerings out there with much nicer user interfaces, but Tunnelblick gets the job done, and it's open source.

On Linux, the story is a bit different. Most distributions come with NetworkManager, which is, as the name implies, a daemon for managing network connections. For most systems, it is the component that takes care of connecting to Wi-Fi networks, setting up an Ethernet connection when you plug in the cable, and even 3G/4G modems.

NetworkManager has support for plugins, which has led it to support many VPN protocols, including OpenVPN!

When trying to figure out how to set up an OpenVPN client on Linux, it was pleasant to find that OpenVPN is not only integrated with the main networking daemon, but also supported on the UI side, where most settings can be tweaked.

However, Tunnelblick still had something I couldn't find out how to do using NetworkManager alone: connecting to the VPN automatically, and reconnecting in cases where the connection is dropped.

For me, this is a must-have feature for VPN clients, given that I tend to roam a lot with my laptop and can't remember to reconnect every time it joins a new network.

Some initial digging led me to an Arch Linux wiki page describing how to write a script that sort of achieves what I'm looking for. However, the approach seemed brittle and insecure, given that you have to make the connection available to other users on the system and, in some cases, write connection secrets in plaintext.

After a while, I started writing a small daemon that would monitor D-Bus and respond to NetworkManager events by determining whether a VPN connection should be started or stopped. An initial version was capable of determining whether the VPN connection was active. However, due to a lack of free time to work on it and the complexity of keeping track of the state of the machine, I decided to put it on hold.
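
For reference, the kind of events such a daemon would consume can be eavesdropped on from a terminal (the match rule here is a sketch):

# Watch NetworkManager signals on the system bus.
dbus-monitor --system \
  "type='signal',sender='org.freedesktop.NetworkManager'"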

While working on this project, I did discover that NetworkManager does have some of this functionality built-in. It turns out you can specify a VPN to connect to as a requirement for some connections to succeed:

Automatic VPN connection settings

On Gentoo, this configuration can be accessed using nm-connection-editor, which can be installed using the gnome-extra/nm-applet package.
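
The same setting should also be reachable from the command line through nmcli and the connection.secondaries property, which takes the UUID of the VPN connection (the connection name below is made up):

# Find the VPN connection's UUID.
nmcli connection show

# Activate the VPN whenever "Home Wi-Fi" comes up.
nmcli connection modify "Home Wi-Fi" \
  connection.secondaries <uuid-of-vpn-connection>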

This is working great so far, but it does require some manual configuration for every new connection you set up, which can be annoying if you roam through many Wi-Fi networks.

In the future, I might resume work on the D-Bus approach to automate this a bit more. I would love it if my laptop simply did not trust any network and automatically attempted to connect to a VPN. It would also be nice if this were only attempted after a hotspot login is shown. For now, however, this should be enough.


Eduardo Trujillo
screenfetch output inside GNOME + Wayland

Not too long ago I left OS X and installed Fedora on my laptop. Now, over the past few days, I’ve been working on bootstrapping a Gentoo install from within my Fedora setup.

Unlike other Linux distributions, Gentoo does not have an installer. It is expected that you set up and install the system yourself. Yes, it is tedious, and it definitely takes longer than following a setup wizard, but on the other hand you gain some knowledge about how Linux works and end up with a custom system.

The learning factor was what convinced me to give Gentoo a try.

Given the particular route I decided to take, the Gentoo Handbook only helped with certain parts of the process. In a normal setup, you start with a live image of Gentoo containing just enough tools to install the system. It is also assumed that you will start with an empty disk, or one that you don't mind erasing.

In my case, I have an SSD with Fedora and OS X partitions, both encrypted using each operating system's built-in encryption methods. The Linux side consisted of a LUKS partition containing an LVM setup.

Thus, the first step was to figure out how to make space for Gentoo while avoiding breaking the existing systems. Having LVM set up certainly helped here. I shrunk my Fedora root (/) while keeping my home partition (/home) intact.

Diagram of my partitioning setup
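
The resizing itself boiled down to a couple of LVM commands, roughly like the following (the volume group, names, and sizes are illustrative, and the root filesystem was shrunk from outside the running system):

# Shrink the Fedora root volume along with its filesystem.
lvreduce --resizefs --size -20G /dev/fedora/root

# Create a new logical volume for the Gentoo root.
lvcreate --size 20G --name gentoo fedora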

After that was done, I created a new partition for the Gentoo root filesystem and followed the handbook sections on downloading the Stage3 archive and chrooting into the environment.

Once chroot-ed, you can sort of begin using Gentoo and installing packages with Portage. You can even run X programs, if you have an X server running on the host. However, the goal here was to also boot into Gentoo directly, and a chroot-ed environment meant the system was still running the Fedora kernel. A big part of installing Gentoo is building your own kernel and then rebooting into it.
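
The chroot dance itself follows the handbook closely. A condensed sketch, with device and file names assumed:

# From the running Fedora system:
mount /dev/fedora/gentoo /mnt/gentoo
tar xpf stage3-*.tar.bz2 -C /mnt/gentoo

# Bind the pseudo-filesystems and enter the new root.
mount -t proc proc /mnt/gentoo/proc
mount --rbind /dev /mnt/gentoo/dev
mount --rbind /sys /mnt/gentoo/sys
cp /etc/resolv.conf /mnt/gentoo/etc/
chroot /mnt/gentoo /bin/bash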

Building a bootable kernel wasn't too hard. I installed the kernel sources using Portage, configured the kernel using make menuconfig, and compiled it. Once it was finished, I copied it over to my /boot partition so that GRUB could find it.
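
Condensed, the kernel steps looked something like this (the target filename and job count are arbitrary):

emerge sys-kernel/gentoo-sources
cd /usr/src/linux
make menuconfig                 # pick drivers and options
make -j4 && make modules_install
cp arch/x86_64/boot/bzImage /boot/vmlinuz-gentoo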

A few test boots showed that the system was booting correctly (and very fast, too). Unfortunately, once booted, nothing happened, because the disk partition is encrypted using LUKS.

Getting LUKS and LVM to work was a bit painful, due to how long it took me to find a working solution.

My initial attempt had me trying to set up an initramfs using Fedora's dracut tool, which was available in the Gentoo repository and mentioned in many guides on the distribution's wiki. It seemed logical, given that it was the same tool used by the Fedora install, so getting it to work with Gentoo should have consisted of little more than building a similar image pointing at the Gentoo partition.

That did not work out well. I was simply not able to get past the disk unlocking and mounting process. I suspect the problem was trying to make too many things work together: EFI, systemd, Plymouth, cryptsetup, and LVM.

So, I continued with the other option presented by the Gentoo wiki: genkernel. This is a tool used to automate the process of building a kernel on Gentoo, but it also supports automatically building an initramfs. I had initially avoided it, since I wanted to build the kernel myself as a learning exercise. However, after installing it, I was pleased to find that building an initramfs did not involve building the kernel with genkernel as well.

A few key things to enable for my setup were the LUKS and LVM options in /etc/genkernel.conf, plus adding dolvm to my kernel command line in GRUB.
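
Concretely, that amounted to something like this (the UUID and volume names are placeholders):

# /etc/genkernel.conf
LVM="yes"
LUKS="yes"

# Build the initramfs with LUKS and LVM support.
genkernel --luks --lvm initramfs

# Kernel command line, in /etc/default/grub
GRUB_CMDLINE_LINUX="crypt_root=UUID=<luks-uuid> dolvm root=/dev/mapper/vg-root"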

BAM! It booted, asked for my password, and dropped me into a login shell.

I still had to go back to Fedora and chroot a couple more times until I got the right network drivers compiled. It was nice to find out that plugging in a Thunderbolt Ethernet adapter worked, and did not crash the kernel like it does on Fedora.

Another thing I didn’t realize immediately was that the genkernel initramfs does not use systemd by default, so I had to add that to the kernel command line too.

Then I continued to install software and drivers, but this time from within Gentoo. GNOME took a good 4-6 hours to compile, mainly due to WebKit taking so long.

Once I started GDM, NetworkManager, and the bluetooth service using systemctl, I finally felt like I had a fully working system on my hands.

Everything worked, except one major peripheral: the trackpad. Apparently, the usbhid driver was claiming the device before the right driver could, so the keyboard worked, but the trackpad was dead.

After a few hours of debugging, I gave up and decided to try compiling the latest kernel available (4.4.4 instead of 4.1.15). After rebooting, the trackpad began working, multitouch support included!

Lastly, I switched from X to Wayland, given that X had an odd issue where some letters would not render after the laptop came back from sleep. Adding Wayland support simply consisted of adding wayland wayland-compositor xwayland egl to my USE flags and then recompiling the affected packages. After another reboot, “GNOME on Wayland” appeared as an option on the GDM login screen.
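
In make.conf terms, the change was roughly this (the rebuild command is the generic one; the affected package set will vary):

# /etc/portage/make.conf
USE="... wayland wayland-compositor xwayland egl"

# Rebuild packages affected by the new USE flags.
emerge --ask --changed-use --deep @world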

Summary

In general, I think I'm happy with my current build and will attempt to use it as my main driver for the next few weeks. If it turns out not to have any major issues, I'll remove my Fedora partition to clear up some space.

The good

  • Worked, almost, out of the box: I was surprised by the amount of hardware that simply worked after enabling the right kernel modules: GPU, keyboard, sound, lid, keyboard backlight, mic, media keys, USB, Thunderbolt.
  • Needed some extra work: Wi-Fi, Trackpad.
  • Closing the laptop actually puts it to sleep. No workarounds needed like in Fedora. Wayland is slightly better at this since it doesn’t turn on the screen for a few seconds while closed like Xorg does.
  • Power consumption seems to be slightly better than on Fedora, especially after enabling all the toggles in powertop.
  • The system boots really fast. Maybe around 20-40 seconds from GRUB to GDM?
  • The Gentoo repositories are generally closer to upstream than other distributions, and I’m also growing to like Portage’s USE keyword feature.
  • Learned a lot about how modern Linux systems are setup.

The bad

  • It’s not for everyone. Many users probably just want an OS that works out of the box.
  • The compile time of some packages can be very long (I'm looking at you, WebKit).
  • Many commercial software packages are only distributed as debs or rpms.
