+++
title = "Installing NixOS with ZFS mirrored boot"
date = "2023-01-31"

[taxonomies]
categories = ["system"]
tags = ["linux", "nixos"]
+++

## Overview

In this post, we're going to set up a ZFS mirrored boot system with full-disk encryption that can be unlocked remotely.

## Preparing the installation medium

This step may vary depending on the system you're installing NixOS on. This post assumes that you're installing onto a normal server, using a minimal NixOS image. The community-maintained [NixOS wiki][nixos-wiki] contains guides for installing NixOS on devices in other situations, such as a server with only remote access.

You will need a USB stick before proceeding to the next step.

First, download the latest NixOS image, and flash it:

```sh
$ curl -L https://channels.nixos.org/nixos-unstable/latest-nixos-minimal-x86_64-linux.iso -o nixos.iso
$ dd if=./nixos.iso of=/dev/sdX bs=1M status=progress
```

If your target machine architecture is not `x86_64`, replace it with your desired architecture (e.g. `i686`, `aarch64`).

After the image has been successfully flashed onto your installation medium, unplug it and boot the target machine from it.

## Preparing Disks

We'll start by defining variables pointing to each disk by ID. According to the [Archlinux.org Wiki][arch-wiki], if you create zpools using device names (e.g. `/dev/sda`), ZFS might intermittently fail to detect the zpools on boot.

You can grab the ID via `ls -lh /dev/disk/by-id/`.

```sh
DISK1=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-FIRST-DRIVE
DISK2=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-SECOND-DRIVE
```

### Partitioning

Then we'll partition our disks. Since this is a mirrored setup, we'll have to do the exact same operation twice. Fortunately, a bash function comes to the rescue.

The partition structure is the following:

```
1GiB Boot | ~Remaining ZFS
```

```sh
partition() {
  # Wipe any existing partition tables.
  sgdisk --zap-all "$1"
  # Partition 1: 1GiB EFI system partition for boot.
  sgdisk -n 1:0:+1GiB -t 1:EF00 -c 1:boot "$1"
  # Swap is omitted.
  # Partition 2: the remaining space, for ZFS.
  sgdisk -n 2:0:0 -t 2:BF01 -c 2:zfs "$1"
  sgdisk --print "$1"
}

partition $DISK1
partition $DISK2
```

### Creating vfat filesystem for boot

Boot partitions must be formatted as vfat so that they mount and function without issues.

```sh
mkfs.vfat $DISK1-part1
mkfs.vfat $DISK2-part1
```

### Configuring ZFS pool

This dataset structure is based on [Erase your darlings][erase-your-darlings].

Now that we're done partitioning our disks, we'll create a mirrored ZFS pool named 'rpool'. This will prompt you to enter a passphrase for your new ZFS pool.

```sh
zpool create \
  -o ashift=12 \
  -O mountpoint=none -O atime=off -O acltype=posixacl -O xattr=sa \
  -O compression=lz4 -O encryption=aes-256-gcm -O keyformat=passphrase \
  rpool mirror \
  $DISK1-part2 $DISK2-part2
```

Then, we create a 'root dataset', which is `/ (root)` for the target machine, and snapshot its empty state as 'blank'.

```sh
zfs create -p -o mountpoint=legacy rpool/local/root
zfs snapshot rpool/local/root@blank
```

Note the 'local' after rpool. In this setup, 'local' holds unimportant data (packages, root, etc.), whereas 'safe' holds important data that needs to be backed up.

Now mount the root dataset:

```sh
mount -t zfs rpool/local/root /mnt
```

Then we mount the boot partitions we created:

```sh
mkdir /mnt/boot
mkdir /mnt/boot-fallback

mount $DISK1-part1 /mnt/boot
mount $DISK2-part1 /mnt/boot-fallback
```

Create and mount a dataset for `/nix`:

```sh
zfs create -p -o mountpoint=legacy rpool/local/nix
mkdir /mnt/nix
mount -t zfs rpool/local/nix /mnt/nix
```

And a dataset for `/home`:

```sh
zfs create -p -o mountpoint=legacy rpool/safe/home
mkdir /mnt/home
mount -t zfs rpool/safe/home /mnt/home
```

And a dataset for state that needs to persist between boots:

```sh
zfs create -p -o mountpoint=legacy rpool/safe/persist
mkdir /mnt/persist
mount -t zfs rpool/safe/persist /mnt/persist
```

Note: Everything on the root dataset will be wiped on each boot once [the rollback](#erasing-your-darlings) is set up. Make sure to put state that needs to persist in `/persist`.

## Configuring NixOS

Now that we're done with partitions and ZFS, it's time to declaratively configure the machine. This step may vary depending on your machine; please consult the docs when in doubt.

### Getting the base configuration

In this post, we're going to use plain `nixos-generate-config` to get the base configuration files for the machine.

```sh
nixos-generate-config --root /mnt
```

### Erasing your darlings

In the [previous step](#configuring-zfs-pool), we made a snapshot of the blank root so that we can roll back to it on each boot and keep the system stateless.

Add this to `configuration.nix` to wipe the root dataset on each boot by rolling back to the blank snapshot after the devices are made available:

```nix
{
  boot.initrd.postDeviceCommands = lib.mkAfter ''
    zfs rollback -r rpool/local/root@blank
  '';
}
```

### Configuring Bootloader

In order to get ZFS to work, we need the following options to be set:

```nix
{
  boot.supportedFilesystems = [ "zfs" ];
  networking.hostId = "<8 random chars>";
}
```

You can derive the `hostId` from your machine ID in `/etc/machine-id`.

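
`networking.hostId` expects eight hexadecimal characters. As a minimal sketch, assuming the machine ID file holds a hex string as it does on systemd-based systems, the hypothetical helper `host_id_from` below prints the first eight characters of that file:

```sh
# host_id_from is a hypothetical helper, not part of NixOS.
# It prints the first 8 characters of a machine-id file
# (default: /etc/machine-id), for use as networking.hostId.
host_id_from() {
  head -c 8 "${1:-/etc/machine-id}"
}
```

Calling `host_id_from` without arguments reads `/etc/machine-id`; the printed value can then be pasted into `networking.hostId`.
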
Then we'll configure grub:

```nix
{
  # Whether installer can modify the EFI variables.
  # If you encounter errors, set this to `false`.
  boot.loader.efi.canTouchEfiVariables = true;

  boot.loader.grub.enable = true;
  boot.loader.grub.efiSupport = true;
  boot.loader.grub.device = "nodev";

  # This should be done automatically, but explicitly declare it just in case.
  boot.loader.grub.copyKernels = true;

  # Make sure that you've listed all of the boot partitions here.
  boot.loader.grub.mirroredBoots = [
    { path = "/boot"; devices = ["/dev/disk/by-uuid/"]; }
    { path = "/boot-fallback"; devices = ["/dev/disk/by-uuid/"]; }
    # ...
  ];
}
```

### Handling boot partitions gracefully

By default, NixOS will fail with an error when a partition or disk is missing. Since we want the server to boot smoothly even if a boot partition is missing, we need to set the 'nofail' option on those partitions:

```nix
{
  fileSystems."/boot".options = [ "nofail" ];
  fileSystems."/boot-fallback".options = [ "nofail" ];
}
```

### Enabling Remote ZFS Unlock

On each boot, ZFS will ask for a passphrase to unlock the pool. To be able to enter it remotely, we can start an SSH server in `initrd` that stays up until the pool is unlocked.

Note: If you rename the keys afterwards, you may have some trouble rolling back to previous generations. See [here][caveat-remote-unlock] for details.

To achieve that, we'll first have to generate an SSH host key for the initrd:

```sh
ssh-keygen -t ed25519 -N "" -f /mnt/boot/initrd-ssh-key

# Each boot partition should have the same key
cp /mnt/boot/initrd-ssh-key /mnt/boot-fallback/initrd-ssh-key
```

Then configure `initrd`:

```nix
{
  boot.kernelModules = [ "" ];
  boot.initrd.kernelModules = [ "" ];

  # DHCP configuration; comment this out when using a static IP.
  networking.networkmanager.enable = false;
  networking.useDHCP = true;

  # Uncomment when using a static IP.
  # boot.kernelParams = [
  #   # See for documentation.
  #   # ip=:::::::::
  #   # The server ip refers to the NFS server -- not needed in this case.
  #   "ip=::::-initrd::off:"
  # ];

  boot.initrd.network.enable = true;
  boot.initrd.network.ssh = {
    enable = true;

    # Using the same port as the actual SSH will cause clients to throw errors
    # related to host key mismatch.
    port = 2222;

    # This takes 'path's, not 'string's.
    hostKeys = [
      /boot/initrd-ssh-key
      /boot-fallback/initrd-ssh-key
      # ...
    ];

    # Public ssh key to log into the initrd ssh
    authorizedKeys = [ "" ];
  };

  # Write a login profile that prompts for the ZFS passphrase over SSH.
  boot.initrd.network.postCommands = ''
    cat <<EOF > /root/.profile
    if pgrep -x "zfs" > /dev/null
    then
      zfs load-key -a
      killall zfs
    else
      echo "ZFS is not running -- this could be a sign of failure."
    fi
    EOF
  '';
}
```

## Installing NixOS

Run `nixos-install`, then reboot your machine.

Note: Make sure that you've configured SSH and networking for your machine; failure to do so may result in an inaccessible system.

That's it! Enjoy your fresh NixOS machine!

## Troubleshooting

### Failed to import pool - more than one matching pool

This error might occur when

- one of your disks was previously used in another ZFS pool, and its metadata wasn't properly removed
- you made a mistake during the install and repartitioned the disk without removing its ZFS metadata.

This happens because ZFS metadata is not stored in the partition table, so repartitioning a disk does not remove it.

Note: the following operations will irrevocably delete ANY data on your disk!

To remove the leftover metadata:

```sh
sgdisk --zap-all $DISK

# Overwrite first 256M of the disk, removing metadata
# In some cases just `wipefs -a` works, but I found this to be the most
# reliable way to wipe them no matter what operations were performed on the disk
# before.
dd if=/dev/urandom bs=1M count=256 of=$DISK
```

And then you can try the installation again.

## Conclusion

## Acknowledgements

I wrote this article because I've noticed that I always forget some steps when installing NixOS on a newly acquired server.
I've compiled the resources listed below into a step-by-step guide for a setup I find 'optimal'. Please do check out those resources!

- [NixOS Discourse Thread][discourse-thread]
- [Erase your darlings][erase-your-darlings]
- [Remote, encrypted ZFS storage server with NixOS][hetzner-zfs]
- [Encrypted ZFS mirror with mirrored boot on NixOS][nixos-zfs-mirrored-boot]

[erase-your-darlings]: https://grahamc.com/blog/erase-your-darlings
[nixos-wiki]: https://nixos.wiki
[arch-wiki]: https://wiki.archlinux.org
[caveat-remote-unlock]: https://github.com/NixOS/nixpkgs/issues/101462#issuecomment-1172926129
[discourse-thread]: https://discourse.nixos.org/t/nixos-on-mirrored-ssd-boot-swap-native-encrypted-zfs/9215
[hetzner-zfs]: https://mazzo.li/posts/hetzner-zfs.html
[nixos-zfs-mirrored-boot]: https://elis.nu/blog/2019/08/encrypted-zfs-mirror-with-mirrored-boot-on-nixos