+++
title = "Installing NixOS with ZFS mirrored boot"
date = "2023-01-31"

[taxonomies]
categories = ["system"]
tags = ["linux", "nixos"]
+++

// TODO: add PlantUML diagrams

## Overview

In this post, we're going to set up a ZFS mirrored boot system with full-disk encryption that can be unlocked remotely.

## Preparing the installation medium

This step may vary depending on the system you're installing NixOS on.

This post assumes that you're installing on a regular server, using a
minimal NixOS image.

The community-maintained [NixOS wiki][nixos-wiki] contains guides to install
NixOS to devices in other conditions, such as a server with only remote access.

You will need a USB stick before proceeding to the next step.

First, download the latest NixOS image, and flash it:

```sh
$ curl -L https://channels.nixos.org/nixos-unstable/latest-nixos-minimal-x86_64-linux.iso -o nixos.iso
$ dd if=./nixos.iso of=/dev/sdX bs=1M status=progress
```

If your target machine architecture is not `x86_64`, replace it with your
desired architecture (e.g. `i686`, `aarch64`).
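
Before flashing, it may be worth checking that the download is intact. The
sketch below assumes you have obtained the expected SHA-256 hash from a source
you trust; `verify_sha256` is a hypothetical helper name, not part of any
NixOS tooling.

```sh
# Hypothetical helper: succeeds only if FILE hashes to EXPECTED (SHA-256, hex).
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<hash>  <file>"; keep only the hash.
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}
```

For example, `verify_sha256 nixos.iso "<EXPECTED-SHA256>" && echo "checksum OK"`.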

After the image has been flashed onto your installation medium, unplug it and
boot the target machine from it.

## Preparing Disks

We'll start by defining variables pointing to each disk by ID.

According to the [Arch Linux wiki][arch-wiki], if you create zpools using device
names (e.g. `/dev/sda`), ZFS may intermittently fail to detect them on
boot.

You can grab the ID via `ls -lh /dev/disk/by-id/`.

```sh
DISK1=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-FIRST-DRIVE
DISK2=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-SECOND-DRIVE
```
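
The by-id directory also contains one entry per partition, which can make the
listing noisy. As a rough sketch, the hypothetical helper below prints only the
entries that look like whole disks, assuming partition entries end in `-partN`
as they do in the paths used later in this post.

```sh
# Hypothetical helper: list by-id entries that refer to whole disks,
# skipping the "-partN" entries that point at partitions.
list_whole_disks() {
    dir="${1:-/dev/disk/by-id}"
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue          # empty directory or dangling link
        name="${entry##*/}"                  # strip the directory prefix
        case "$name" in
            *-part[0-9]*) ;;                 # partition entry, skip
            *) printf '%s\n' "$entry" ;;     # whole-disk entry
        esac
    done
}
```

Running `list_whole_disks` without arguments reads `/dev/disk/by-id`; pick the
two entries that belong to your drives for `DISK1` and `DISK2`.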

### Partitioning

Then we'll partition our disks. Since this is a mirrored setup, we have to do
the exact same operation twice. Fortunately, a bash function comes to the rescue.

The partition structure is the following:
```
1GiB Boot | ~Remaining ZFS
```


```sh
partition() {
    sgdisk --zap-all "$1"
    sgdisk -n 1:0:+1GiB -t 1:EF00 -c 1:boot "$1"
    # Swap is omitted.
    sgdisk -n 2:0:0 -t 2:BF01 -c 2:zfs "$1"
    sgdisk --print "$1"
}

partition $DISK1
partition $DISK2
```

### Creating vfat filesystem for boot

Boot partitions should be formatted as 'vfat' so that they mount and
function without issues.

```sh
mkfs.vfat $DISK1-part1
mkfs.vfat $DISK2-part1
```

### Configuring ZFS pool

This dataset structure is based on [Erase your darlings][erase-your-darlings].

Now that we're done partitioning our disks, we'll create a ZFS pool named
'rpool', which is mirrored. This will prompt you to enter a passphrase for your
new ZFS pool.
```sh
zpool create \
    -o ashift=12 \
    -O mountpoint=none -O atime=off -O acltype=posixacl -O xattr=sa \
    -O compression=lz4 -O encryption=aes-256-gcm -O keyformat=passphrase \
    rpool mirror \
    $DISK1-part2 $DISK2-part2
```

Then we create a 'root dataset', which will be `/` (root) on the target machine,
and snapshot its empty state as 'blank'.
```sh
zfs create -p -o mountpoint=legacy rpool/local/root
zfs snapshot rpool/local/root@blank
```

Note the 'local' after rpool. In this setup, 'local' holds unimportant
data (packages, root, etc.), whereas 'safe' holds important data
that needs to be backed up.

And mount it:
```sh
mount -t zfs rpool/local/root /mnt
```

Then we mount the boot partitions we created:
```sh
mkdir /mnt/boot
mkdir /mnt/boot-fallback

mount $DISK1-part1 /mnt/boot
mount $DISK2-part1 /mnt/boot-fallback
```

Create and mount a dataset for `/nix`:
```sh
zfs create -p -o mountpoint=legacy rpool/local/nix
mkdir /mnt/nix
mount -t zfs rpool/local/nix /mnt/nix
```

And a dataset for `/home`:
```sh
zfs create -p -o mountpoint=legacy rpool/safe/home
mkdir /mnt/home
mount -t zfs rpool/safe/home /mnt/home
```

And a dataset for state that needs to persist between boots:
```sh
zfs create -p -o mountpoint=legacy rpool/safe/persist
mkdir /mnt/persist
mount -t zfs rpool/safe/persist /mnt/persist
```
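
The three dataset steps above follow the same create, mkdir, mount pattern. As
a sketch only, the hypothetical `dataset_commands` helper below prints the
commands for one dataset instead of running them, so they can be reviewed
first. It uses `mkdir -p` rather than plain `mkdir`, which is my own choice,
and it assumes the root dataset is already mounted at `/mnt`.

```sh
# Hypothetical helper: print the create/mkdir/mount commands for one dataset.
# Nothing is executed here; the output is only text.
dataset_commands() {
    dataset="$1"      # e.g. rpool/safe/home
    mountpoint="$2"   # e.g. /mnt/home
    printf 'zfs create -p -o mountpoint=legacy %s\n' "$dataset"
    printf 'mkdir -p %s\n' "$mountpoint"
    printf 'mount -t zfs %s %s\n' "$dataset" "$mountpoint"
}

dataset_commands rpool/local/nix    /mnt/nix
dataset_commands rpool/safe/home    /mnt/home
dataset_commands rpool/safe/persist /mnt/persist
```

If the printed commands look right, they can be piped to `sh` to execute them.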

Note: Everything on the root dataset will be wiped on each boot once the
rollback [described below](#erasing-your-darlings) is set up.
Make sure to put any state that needs to persist on `/persist`.


## Configuring NixOS

Now that we're done with partitions and ZFS, it's time to declaratively
configure the machine. This step may vary depending on your machine,
please consult the docs when in doubt.

### Getting the base configuration

In this post, we're going to use plain `nixos-generate-config` to get the base
configuration files for the machine.

```sh
nixos-generate-config --root /mnt
```

### Erasing your darlings

In the [previous step](#configuring-zfs-pool), we took a snapshot of the blank
root so that we can roll back to it on each boot and keep the system stateless.

Add this to `configuration.nix` to wipe the root dataset on each boot by
rolling back to the blank snapshot after the devices are made available:
```nix
{
  boot.initrd.postDeviceCommands = lib.mkAfter ''
    zfs rollback -r rpool/local/root@blank
  '';
}
```

### Configuring Bootloader

In order to get ZFS to work, we need the following options to be set:
```nix
{
  boot.supportedFilesystems = [ "zfs" ];
  networking.hostId = "<8 hex chars>";
}
```

The `hostId` must be 8 hexadecimal characters; the first 8 characters of
`/etc/machine-id` are a convenient choice.
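
One way to derive the value, as a small sketch: the hypothetical `host_id_from`
helper below prints the first 8 characters of a machine-id file, defaulting to
`/etc/machine-id`.

```sh
# Hypothetical helper: print the first 8 characters of a machine-id file.
host_id_from() {
    head -c 8 "${1:-/etc/machine-id}"
    echo   # head -c does not emit a trailing newline
}
```

Run `host_id_from` on the target machine and paste the result into
`networking.hostId`.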

Then we'll configure grub:
```nix
{
  # Whether installer can modify the EFI variables.
  # If you encounter errors, set this to `false`.
  boot.loader.efi.canTouchEfiVariables = true;

  boot.loader.grub.enable = true;
  boot.loader.grub.efiSupport = true;
  boot.loader.grub.device = "nodev";

  # This should be done automatically, but explicitly declare it just in case.
  boot.loader.grub.copyKernels = true;
  # Make sure that you've listed all of the boot partitions here.
  boot.loader.grub.mirroredBoots = [
    { path = "/boot"; devices = ["/dev/disk/by-uuid/<ID-HERE>"]; }
    { path = "/boot-fallback"; devices = ["/dev/disk/by-uuid/<ID-HERE>"]; }
    # ...
  ];
}
```
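
To fill in the `<ID-HERE>` placeholders, you need the filesystem UUID of each
boot partition. The sketch below finds it by checking which symlink under
`/dev/disk/by-uuid` resolves to the same device as a given partition path;
`uuid_of` is a hypothetical helper name.

```sh
# Hypothetical helper: print the by-uuid name that points at the same
# device as PARTITION (e.g. "$DISK1-part1"). Returns 1 if none matches.
uuid_of() {
    target="$(readlink -f "$1")"
    dir="${2:-/dev/disk/by-uuid}"
    for link in "$dir"/*; do
        if [ "$(readlink -f "$link")" = "$target" ]; then
            basename "$link"
            return 0
        fi
    done
    return 1
}
```

For example, `uuid_of "$DISK1-part1"` and `uuid_of "$DISK2-part1"` should print
the values for `/boot` and `/boot-fallback` respectively.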

### Handling boot partitions gracefully

By default, NixOS will throw an error when a partition or disk is missing.
Since we want the server to boot smoothly even if a boot partition is
missing, we need to set the 'nofail' option on those
partitions:

```nix
{
  fileSystems."/boot".options = [ "nofail" ];
  fileSystems."/boot-fallback".options = [ "nofail" ];
}
```


### Enabling Remote ZFS Unlock

On each boot, ZFS will ask for a passphrase to unlock the pool.
To be able to enter it without physical access, we can start an SSH server in
`initrd` that lives until the pool is unlocked.

Note: If you rename the keys afterwards, you may have trouble rolling back to
previous generations. See [here][caveat-remote-unlock] for details.

To achieve that, we'll first have to generate an SSH host key for the initrd:
```sh
ssh-keygen -t ed25519 -N "" -f /mnt/boot/initrd-ssh-key

# Each boot partition should have the same key
cp /mnt/boot/initrd-ssh-key /mnt/boot-fallback/initrd-ssh-key
```

Then configure `initrd`:
```nix
{
  boot.kernelModules = [ "<YOUR-NETWORK-CARD>" ];
  boot.initrd.kernelModules = [ "<YOUR-NETWORK-CARD>" ];

  # DHCP configuration; comment this out when using a static IP.
  networking.networkmanager.enable = false;
  networking.useDHCP = true;

  # Uncomment when using a static IP.
  # boot.kernelParams = [
  #   # See <https://www.kernel.org/doc/Documentation/filesystems/nfs/nfsroot.txt> for documentation.
  #   # ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>
  #   # The server ip refers to the NFS server -- not needed in this case.
  #   "ip=<YOUR-IPV4-ADDR>::<YOUR-IPV4-GATEWAY>:<YOUR-IPV4-NETMASK>:<YOUR-HOSTNAME>-initrd:<YOUR-NETWORK-INTERFACE>:off:<DNS-IP>"
  # ];

  boot.initrd.network.enable = true;
  boot.initrd.network.ssh = {
    enable = true;

    # Using the same port as the actual SSH will cause clients to throw errors
    # related to host key mismatch.
    port = 2222;

    # This takes 'path's, not 'string's.
    hostKeys = [
      /boot/initrd-ssh-key
      /boot-fallback/initrd-ssh-key
      # ...
    ];

    # Public ssh key to log into the initrd ssh
    authorizedKeys = [ "<YOUR-SSH-PUBKEY>" ];
  };
  boot.initrd.network.postCommands = ''
    cat <<EOF > /root/.profile
    if pgrep -x "zfs" > /dev/null
    then
      zfs load-key -a
      killall zfs
    else
      echo "ZFS is not running -- this could be a sign of failure."
    fi
    EOF
  '';
}
```
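
The `<YOUR-NETWORK-CARD>` placeholder is the kernel module for your network
card. As a sketch, the hypothetical `nic_driver` helper below reads the driver
name bound to an interface from sysfs. The driver name usually matches the
module name, but that is an assumption worth double-checking on your hardware.

```sh
# Hypothetical helper: print the name of the driver bound to a network
# interface, by resolving its sysfs "driver" symlink.
nic_driver() {
    iface="$1"            # e.g. eth0 or enp3s0
    sysfs="${2:-/sys}"    # overridable, mainly for testing
    basename "$(readlink -f "$sysfs/class/net/$iface/device/driver")"
}
```

For example, `nic_driver eth0` might print something like `e1000e`, which
would then go into both `kernelModules` lists.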

## Installing NixOS

Run `nixos-install`, then reboot your machine.

Note: Make sure that you've configured SSH and networking for your machine.
Failure to do so may result in an inaccessible system.

That's it! Enjoy your fresh NixOS machine!

## Troubleshooting

### Failed to import pool - more than one matching pool

This error might occur when

- one of your disks was previously used in another ZFS pool, and its metadata
  wasn't properly removed
- you messed up during the install and repartitioned the disk without removing
  its ZFS metadata.

This is because the ZFS metadata doesn't live on a partition, but on a disk.

Note: the following operations will irrevocably delete ANY data on your disk!

To remove the leftover metadata:

```sh
sgdisk --zap-all $DISK
# Overwrite first 256M of the disk, removing metadata
# In some cases just `wipefs -a` works, but I found this to be the most
# reliable way to wipe them no matter what operations were performed on the disk
# before.
dd if=/dev/urandom bs=1M count=256 of=$DISK
```

And then you can try the installation again.

## Conclusion

## Acknowledgements

I wrote this article because I've noticed that I always forget some steps
when installing NixOS on a newly acquired server.

I've compiled resources listed below to make a step-by-step guide for a setup I
find 'optimal'. Please do check out those resources!

- [NixOS Discourse Thread][discourse-thread]
- [Erase your darlings][erase-your-darlings]
- [Remote, encrypted ZFS storage server with NixOS][hetzner-zfs]
- [Encrypted ZFS mirror with mirrored boot on NixOS][nixos-zfs-mirrored-boot]

[erase-your-darlings]: https://grahamc.com/blog/erase-your-darlings
[nixos-wiki]: https://nixos.wiki
[arch-wiki]: https://wiki.archlinux.org
[caveat-remote-unlock]: https://github.com/NixOS/nixpkgs/issues/101462#issuecomment-1172926129
[discourse-thread]: https://discourse.nixos.org/t/nixos-on-mirrored-ssd-boot-swap-native-encrypted-zfs/9215
[hetzner-zfs]: https://mazzo.li/posts/hetzner-zfs.html
[nixos-zfs-mirrored-boot]: https://elis.nu/blog/2019/08/encrypted-zfs-mirror-with-mirrored-boot-on-nixos