Ubuntu 24.04 ZBM ZFS: Installation Guide

Tags: ubuntu, zfs, zbm

Complete installation guide for Ubuntu 24.04 on encrypted ZFS with redundant boot partitions.

Prerequisites

Ensure your system meets these requirements before starting the installation:

  • System boots in UEFI mode
  • System is x86_64 architecture
  • Two identical SSDs (SATA or NVMe) for the mirrored OS pool
  • Network access for package installation

Live Environment Setup

Boot from Ubuntu Live USB and prepare the environment for installation.

Download and Boot

Download the latest Ubuntu Desktop Noble (24.04) Live image, write it to a USB drive and boot your system in EFI mode.

Optional: Enable Remote Access (Live Environment)

If you prefer to perform the installation remotely from another workstation, you can enable SSH access in the live environment:

# Set a password for the ubuntu user
passwd ubuntu

# Install OpenSSH server and tmux
sudo apt update
sudo apt install --yes openssh-server tmux

# Find your IP address
ip addr show

# You can now connect from another machine:
# ssh ubuntu@<ip-address>

Disable Automount

Disable automatic mounting to prevent the desktop environment from interfering with disk operations:

gsettings set org.gnome.desktop.media-handling automount false

Open Root Shell

sudo -i

Install Required Tools

apt update
apt install --yes debootstrap gdisk mdadm zfsutils-linux
systemctl stop zed

Generate Host ID

Generate a host ID in the live environment; it will be copied into the installed system later so ZFS imports the pool with a matching hostid:

zgenhostid -f

Disk Preparation

Identify disks using persistent paths and wipe all existing data to prepare for ZFS installation.

Identify Disks by ID

First, list all available disks by their persistent IDs:

# List all disk IDs
ls -la /dev/disk/by-id/ | grep -v part

# Show disk model and serial information
lsblk -o NAME,SIZE,MODEL,SERIAL

IMPORTANT: Always use /dev/disk/by-id/* paths for disk references. These paths are persistent across reboots, unlike /dev/sdX which can change.
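
If you want to double-check which kernel device a given by-id path currently points to, readlink resolves the symlink (the path below is a placeholder; use one from your own listing):

# Resolve a by-id symlink to its current kernel device name
readlink -f /dev/disk/by-id/nvme-Force_MP510_1919820500012769305E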

Set Disk Variables

Set your disk variables using the full /dev/disk/by-id/ paths found above:

# Set OS disk variables
# Replace these with YOUR actual disk IDs from the ls command above
export OS_DISK1="/dev/disk/by-id/nvme-Force_MP510_1919820500012769305E"
export OS_DISK2="/dev/disk/by-id/nvme-WD_BLACK_SN770_250GB_2346FX400125"

Note: The example paths above are placeholders. You must use the actual paths from your system.

Clear Disks

IMPORTANT: This will destroy all data on the selected disks. Verify your disk variables before proceeding.

# Clear any existing ZFS labels
zpool labelclear -f $OS_DISK1 2>/dev/null || true
zpool labelclear -f $OS_DISK2 2>/dev/null || true

# Stop any existing mdraid arrays
umount /boot/efi 2>/dev/null || true
mdadm --stop /dev/md127 2>/dev/null || true
mdadm --stop /dev/md/esp 2>/dev/null || true

# Clear mdraid metadata from partitions (if they exist)
mdadm --zero-superblock --force ${OS_DISK1}-part1 2>/dev/null || true
mdadm --zero-superblock --force ${OS_DISK2}-part1 2>/dev/null || true

# Wipe filesystem signatures
wipefs -a $OS_DISK1
wipefs -a $OS_DISK2

# TRIM/discard entire disk (SSD optimization)
# Note: blkdiscard may fail on some NVMe drives in the live environment; this is normal
blkdiscard -f $OS_DISK1 2>/dev/null || true
blkdiscard -f $OS_DISK2 2>/dev/null || true

# Zap GPT and MBR partition tables
sgdisk --zap-all $OS_DISK1
sgdisk --zap-all $OS_DISK2

# Verify disks are clean
lsblk -o NAME,SIZE,FSTYPE,LABEL $OS_DISK1 $OS_DISK2

Redundant ESP with mdraid

Create mirrored EFI System Partitions for boot redundancy using mdraid.

Create EFI System Partitions

Create 512MB EFI System Partitions on both SSDs:

OS_DISKS="$OS_DISK1 $OS_DISK2"
for disk in $OS_DISKS; do
    sgdisk -n "1:1m:+512m" -t "1:ef00" "$disk"
done

Create mdraid Array for ESP

mdadm --create --verbose --level 1 --metadata 1.0 --homehost any --raid-devices 2 /dev/md/esp \
  ${OS_DISK1}-part1 ${OS_DISK2}-part1
mdadm --assemble --scan
mdadm --detail --scan >> /etc/mdadm.conf

Note: Metadata version 1.0 is used so that mdraid metadata is written to the end of each partition, allowing firmware to recognize each component as a valid EFI system partition.
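
To confirm that the array members actually carry version 1.0 metadata, you can check with mdadm:

# Optional check: metadata version should report 1.0
mdadm --examine ${OS_DISK1}-part1 | grep -i version
mdadm --detail /dev/md/esp | grep -i version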

Create ZFS Partitions on SSDs

# Create ZFS partitions on SSDs (remaining space after ESP)
for disk in $OS_DISKS; do
    sgdisk -n "2:0:-8m" -t "2:bf00" "$disk"
done
partprobe

Storage Optimizations

Configure optimal I/O alignment for modern SSDs.

Understanding ashift

Modern drives (SSDs and HDDs manufactured after 2011) use 4096-byte (4K) physical sectors. ZFS uses the ashift parameter to align I/O operations with the disk’s physical sector size for optimal performance.

This guide uses ashift=12 (4096-byte sectors) for all pools, which is optimal for modern storage devices.
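
To check which sector sizes your disks actually report, and later confirm the ashift in use, something like this helps:

# Check the physical and logical sector sizes the disks report
lsblk -o NAME,PHY-SEC,LOG-SEC $OS_DISK1 $OS_DISK2

# After the pool is created, confirm the ashift actually in use:
# zpool get ashift zroot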

ZFS OS Pool Creation

Create an encrypted, mirrored ZFS pool for the operating system with optimized properties.

Prepare Encryption Key

Create a key file for pool encryption (replace 'password' in the command below with your actual passphrase):

echo 'password' > /etc/zfs/zroot.key
chmod 000 /etc/zfs/zroot.key

Create OS Pool

zpool create -f \
  -m none \
  -O acltype=posixacl \
  -o ashift=12 \
  -O atime=off \
  -o autotrim=on \
  -o cachefile=none \
  -O canmount=off \
  -o compatibility=openzfs-2.2-linux \
  -O compression=zstd \
  -O dnodesize=auto \
  -O encryption=aes-256-gcm \
  -O keyformat=passphrase \
  -O keylocation=file:///etc/zfs/zroot.key \
  -O normalization=formD \
  -O recordsize=16K \
  -O relatime=off \
  -O xattr=sa \
  zroot mirror ${OS_DISK1}-part2 ${OS_DISK2}-part2
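
Optionally, run a quick sanity check that the pool came up with the intended layout and properties:

# Verify pool layout and key properties
zpool status zroot
zfs get encryption,keylocation,compression,recordsize,atime zroot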

ZFS Filesystem Creation

Create only the essential filesystems needed to boot the system.

Root Filesystem Setup

Create the root container and boot environment:

zfs create -o canmount=off -o mountpoint=none zroot/ROOT
zfs create -o canmount=noauto -o mountpoint=/ zroot/ROOT/ubuntu

Keystore Setup

Create a dedicated dataset for encryption keys:

zfs create -o mountpoint=/etc/zfs/keys zroot/keystore

System Directories

Create optimized datasets for system directories:

zfs create -o mountpoint=/var zroot/ROOT/ubuntu/var
zfs create -o mountpoint=/var/cache -o recordsize=128K -o sync=disabled \
  zroot/ROOT/ubuntu/var/cache

zfs create -o mountpoint=/var/lib -o recordsize=8K zroot/ROOT/ubuntu/var/lib
zfs create -o mountpoint=/var/log -o recordsize=128K -o logbias=throughput \
  zroot/ROOT/ubuntu/var/log

zfs create -o mountpoint=/tmp -o recordsize=32K -o compression=lz4 -o devices=off -o exec=off \
  -o setuid=off -o sync=disabled zroot/ROOT/ubuntu/tmp

zfs create -o mountpoint=/var/tmp -o recordsize=32K -o compression=lz4 -o devices=off -o exec=off \
  -o setuid=off -o sync=disabled zroot/ROOT/ubuntu/var/tmp

User Data Setup

Create datasets for user home directories:

# User data container
zfs create -o mountpoint=/home -o recordsize=128K zroot/USERDATA
zfs create -o mountpoint=/root zroot/USERDATA/root
zfs create zroot/USERDATA/marc

Finalize and Mount

Configure ZFSBootMenu properties and mount all datasets:

# Set bootfs property for ZBM
zpool set bootfs=zroot/ROOT/ubuntu zroot
zfs set keylocation=file:///etc/zfs/keys/zroot.key zroot
zfs set org.zfsbootmenu:keysource=zroot/keystore zroot

# Export & reimport with different mount path
zpool export zroot
zpool import -N -R /mnt zroot
zfs load-key -L prompt zroot

# Mount datasets in the correct order
zfs mount zroot/ROOT/ubuntu
zfs mount zroot/keystore
zfs mount -a

# Update device symlinks
udevadm trigger
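
Before running debootstrap, it can be worth confirming that everything is mounted where expected under /mnt:

# Optional: verify dataset mountpoints before installing the base system
zfs list -o name,canmount,mountpoint -r zroot
findmnt -R /mnt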

ZFS Dataset Properties Overview

Dataset                      canmount     mountpoint      recordsize   compression   Additional Properties
zroot/ROOT                   off          none            (inherited)  (inherited)   -
zroot/ROOT/ubuntu            noauto       /               (inherited)  (inherited)   -
zroot/keystore               (inherited)  /etc/zfs/keys   (inherited)  (inherited)   readonly=on
zroot/ROOT/ubuntu/var        (inherited)  /var            (inherited)  (inherited)   -
zroot/ROOT/ubuntu/var/cache  (inherited)  /var/cache      128K         (inherited)   sync=disabled
zroot/ROOT/ubuntu/var/lib    (inherited)  /var/lib        8K           (inherited)   -
zroot/ROOT/ubuntu/var/log    (inherited)  /var/log        128K         (inherited)   logbias=throughput
zroot/ROOT/ubuntu/tmp        (inherited)  /tmp            32K          lz4           devices=off, exec=off, setuid=off, sync=disabled
zroot/ROOT/ubuntu/var/tmp    (inherited)  /var/tmp        32K          lz4           devices=off, exec=off, setuid=off, sync=disabled
zroot/USERDATA               (inherited)  /home           128K         (inherited)   -
zroot/USERDATA/root          (inherited)  /root           (inherited)  (inherited)   -
zroot/USERDATA/marc          (inherited)  /home/marc      (inherited)  (inherited)   -

Ubuntu Installation

Install Ubuntu base system using debootstrap and configure essential system settings.

Install Base System

debootstrap noble /mnt
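
If you want to pull packages from a specific mirror, debootstrap accepts it as an optional third argument; run this instead of the command above:

debootstrap noble /mnt http://archive.ubuntu.com/ubuntu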

Copy Configuration Files

cp /etc/hostid /mnt/etc/hostid
cp /etc/mdadm.conf /mnt/etc/
cp /etc/resolv.conf /mnt/etc/resolv.conf
cp /etc/zfs/zroot.key /mnt/etc/zfs/keys/zroot.key

Chroot into New System

mount -t proc proc /mnt/proc
mount -t sysfs sys /mnt/sys
mount -B /dev /mnt/dev
mount -t devpts pts /mnt/dev/pts
chroot /mnt /bin/bash

Basic Ubuntu Configuration

Set hostname:

echo 'bender' > /etc/hostname
echo -e '127.0.1.1\tbender' >> /etc/hosts

Set root password:

passwd

Configure APT sources:

cat <<EOF > /etc/apt/sources.list
deb http://archive.ubuntu.com/ubuntu/ noble main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ noble-updates main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ noble-security main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ noble-backports main restricted universe multiverse
EOF

Update system and install base packages:

apt update
apt upgrade --yes
apt install --yes --no-install-recommends \
  console-setup \
  keyboard-configuration \
  linux-generic-hwe-24.04 \
  locales

Configure locales and timezone:

dpkg-reconfigure locales tzdata keyboard-configuration console-setup

Even if you prefer a non-English system language, make sure that en_US.UTF-8 is generated and available.
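
If you prefer a non-interactive setup, something along these lines works; the locale and timezone values below are examples, adjust them to your system:

# Non-interactive locale and timezone configuration (example values)
locale-gen en_US.UTF-8
update-locale LANG=en_US.UTF-8
ln -sf /usr/share/zoneinfo/Europe/Zurich /etc/localtime
echo 'Europe/Zurich' > /etc/timezone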

Create User Account

Create a regular user account with sudo privileges:

# Create user (replace 'marc' with your desired username throughout)
adduser marc
cp -a /etc/skel/. /home/marc
chown -R marc:marc /home/marc

# Add user to supplementary groups
usermod -aG adm,plugdev,sudo marc

Note: You will be prompted for a password and user information during adduser. You can ignore warnings about the home directory already existing; it was pre-created as a ZFS dataset.

Configure Network

cat > /etc/netplan/01-enp4s0.yaml << 'EOF'
network:
  version: 2
  renderer: networkd
  ethernets:
    enp4s0:
      dhcp4: true
      dhcp6: true
      accept-ra: true
      ipv6-privacy: false
EOF

chmod 600 /etc/netplan/01-enp4s0.yaml
netplan apply

ZFS Configuration

Install ZFS packages, enable required services, and configure encryption settings.

Install ZFS and Essential System Packages

apt install --yes dosfstools mdadm zfs-initramfs zfsutils-linux

Enable ZFS Services

systemctl enable zfs.target
systemctl enable zfs-mount
systemctl enable zfs-import.target

Secure the Keystore

Set the keystore to read-only for additional security:

zfs set readonly=on zroot/keystore

Configure initramfs for Encryption

echo "UMASK=0077" > /etc/initramfs-tools/conf.d/umask.conf

Rebuild initramfs

update-initramfs -c -k all
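
After the rebuild, it is worth confirming that the generated images are not world-readable, since they embed the encryption key file:

# With the UMASK setting above, these should be mode 0600
ls -l /boot/initrd.img-*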

Set ZFSBootMenu Kernel Command Line

Set the kernel command line that ZFSBootMenu passes to the selected boot environment:

zfs set org.zfsbootmenu:commandline="quiet" zroot/ROOT

ZFSBootMenu Setup

Configure the EFI System Partition, install ZFSBootMenu, and create redundant UEFI boot entries.

Format and Mount ESP

mkfs.vfat -F32 -nBOOT /dev/md/esp
mkdir -p /boot/efi
mount -t vfat /dev/md/esp /boot/efi/

Add ESP to fstab

cat << EOF >> /etc/fstab
UUID=$( blkid -s UUID -o value /dev/md/esp ) /boot/efi vfat defaults 0 0
EOF
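
To confirm the new fstab entry works before relying on it at boot, you can remount the ESP through fstab:

# Remount /boot/efi via the fstab entry and verify it is mounted
umount /boot/efi
mount /boot/efi
findmnt /boot/efi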

Install ZFSBootMenu

Install curl and download ZFSBootMenu EFI image:

apt install --yes curl

mkdir -p /boot/efi/EFI/ZBM
curl -o /boot/efi/EFI/ZBM/VMLINUZ.EFI -L https://get.zfsbootmenu.org/efi
cp /boot/efi/EFI/ZBM/VMLINUZ.EFI /boot/efi/EFI/ZBM/VMLINUZ-BACKUP.EFI

Create UEFI Boot Entries

Mount EFI variables and install efibootmgr:

mount -t efivarfs efivarfs /sys/firmware/efi/efivars
apt install --yes efibootmgr

Create boot entries for both disks (redundancy):

# Disk 2 boot entries
efibootmgr -c -d "$OS_DISK2" -p 1 -L "ZBM 2 (Backup)" -l '\EFI\ZBM\VMLINUZ-BACKUP.EFI'
efibootmgr -c -d "$OS_DISK2" -p 1 -L "ZBM 2" -l '\EFI\ZBM\VMLINUZ.EFI'

# Disk 1 boot entries
efibootmgr -c -d "$OS_DISK1" -p 1 -L "ZBM 1 (Backup)" -l '\EFI\ZBM\VMLINUZ-BACKUP.EFI'
efibootmgr -c -d "$OS_DISK1" -p 1 -L "ZBM 1" -l '\EFI\ZBM\VMLINUZ.EFI'

Boot entry redundancy:

  • Each disk has two boot entries (primary and backup)
  • If one disk fails, system can boot from the other
  • If primary boot image fails, backup provides recovery option

Verify boot entries:

efibootmgr -v
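
If you want an explicit fallback order, efibootmgr can set it; the boot numbers below are placeholders, substitute the ones shown by efibootmgr -v on your system:

# Example only: prefer the primary entries, then the backups (placeholder numbers)
# efibootmgr -o 0004,0002,0003,0001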

Final Steps

Clean up the installation environment, reboot into the new system, and verify everything works correctly.

Exit Chroot and Cleanup

# Exit the chroot environment
exit

# Unmount all filesystems
umount -n -R /mnt

# Export the pool
zpool export zroot

Reboot into New System

Remove the installation media and reboot:

reboot

First boot process:

  1. UEFI firmware loads ZFSBootMenu
  2. ZFSBootMenu prompts for encryption password
  3. ZFSBootMenu displays available boot environments
  4. Select zroot/ROOT/ubuntu to boot

Post-Installation Verification

After first boot, verify the system:

# Check ZFS pool status
zpool status

# Verify all datasets are mounted
zfs list

# Check mdraid status for ESP
cat /proc/mdstat

# Verify network connectivity
ip addr show
ping -c3 google.com

# Check system services
systemctl status zfs-mount

Create Post-Installation Snapshot

Before creating snapshots, prevent GRUB packages from being installed when upgrading the kernel:

# Prevent GRUB updates that could interfere with ZFSBootMenu
tee /etc/apt/preferences.d/no-grub << 'EOF'
Package: grub* grub2*
Pin: release *
Pin-Priority: -1
EOF
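
You can confirm the pin is active with apt-cache policy; pinned GRUB packages should show a negative priority and no installation candidate:

# Verify the GRUB pin applies
apt-cache policy grub-efi-amd64 grub-pc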

Create a snapshot of the fresh installation for easy rollback:

# Snapshot essential system datasets
zfs snapshot zroot/ROOT/ubuntu@fresh-install
zfs snapshot zroot/ROOT/ubuntu/var@fresh-install
zfs snapshot zroot/ROOT/ubuntu/var/lib@fresh-install

# Verify snapshots
zfs list -t snapshot
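
If you later need to return to this state, a rollback is one option; note that zfs rollback -r discards all changes and snapshots made after the target snapshot:

# Example rollback to the fresh-install state (destructive for later changes)
# zfs rollback -r zroot/ROOT/ubuntu@fresh-install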

Appendix 1: Recovery Mode

If you need to recover your system or make changes from a live environment, use these commands to mount and chroot back into your installed system.

When to Use Recovery Mode

  • System won’t boot but pools are intact
  • Need to fix boot configuration
  • Want to update ZFSBootMenu
  • Need to recover from failed updates
  • Troubleshooting boot issues

Recovery Steps

Boot from the Ubuntu Live USB and open a root shell:

sudo -i

Install ZFS utilities if needed:

apt update
apt install --yes zfsutils-linux

Import the ZFS pool, mount its datasets, and mount the ESP:

# Find the md device for ESP (it might have a different number)
cat /proc/mdstat
# Look for the md device with the ESP partitions

# If the array is not running, start it first
# Replace md127 with the actual md number from /proc/mdstat
mdadm --run /dev/md127

# Export any auto-imported pools
zpool export -a

# Import the zroot pool
zpool import -f -N -R /mnt zroot

# Load encryption keys
zfs load-key -L prompt zroot

# Mount filesystems
zfs mount zroot/ROOT/ubuntu
zfs mount zroot/keystore
zfs mount -a

# Mount the ESP (after the root dataset, so it is not hidden under the new root mount)
mkdir -p /mnt/boot/efi/
mount -t vfat /dev/md/esp /mnt/boot/efi/

# Mount proc, sys, dev for chroot
mount -t proc proc /mnt/proc
mount -t sysfs sys /mnt/sys
mount -B /dev /mnt/dev
mount -t devpts pts /mnt/dev/pts

# Enter the chroot environment
chroot /mnt /bin/bash

Exit Recovery Mode

When finished, clean up and reboot:

# Exit chroot
exit

# Unmount everything
umount -n -R /mnt

# Export pool
zpool export zroot

# Reboot
reboot

Appendix 2: Replacing a Faulted Drive

This guide covers how to replace a failed drive in your mirrored ZFS pool and mdraid ESP array.

When to Replace a Drive

Replace a drive when:

  • ZFS pool shows DEGRADED status
  • A drive shows as FAULTED or UNAVAIL in zpool status
  • SMART errors indicate imminent drive failure (see the quick check after this list)
  • The mdraid array shows a failed component
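
For the SMART case, smartmontools provides a quick health check (the disk path below is an example from this guide; use the suspect drive's by-id path):

# Install smartmontools if needed, then check overall drive health
apt install --yes smartmontools
smartctl -H /dev/disk/by-id/nvme-WD_BLACK_SN770_250GB_2346FX400125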

Check Pool and Array Status

First, identify which drive has failed:

# Check ZFS pool status
zpool status zroot

# Check mdraid array status
cat /proc/mdstat

# If the array is not running, start it first
# Replace md127 with the actual md number from /proc/mdstat
mdadm --run /dev/md127

mdadm --detail /dev/md/esp

Example output showing a faulted drive:

  pool: zroot
 state: DEGRADED
status: One or more devices could not be used.
  scan: resilvered 245G in 01:23:45 with 0 errors
config:

    NAME                                             STATE     READ WRITE CKSUM
    zroot                                            DEGRADED     0     0     0
      mirror-0                                       DEGRADED     0     0     0
        nvme-Force_MP510_1919820500012769305E-part2  ONLINE       0     0     0
        nvme-WD_BLACK_SN770_250GB_2346FX400125-part2 FAULTED      0     0     0

Identify the Failed Physical Drive

Find the disk-by-id path for reference:

# List disk IDs to identify replacement
ls -la /dev/disk/by-id/ | grep -v part

# Note the disk path of the failed drive
# Example: /dev/disk/by-id/nvme-WD_BLACK_SN770_250GB_2346FX400125

Physical Drive Replacement

  1. Shutdown the system:
shutdown -h now
  2. Replace the physical drive with a new one of equal or larger size
  3. Boot the system
  4. Identify the new drive:
# List available disks to find the new drive ID
ls -la /dev/disk/by-id/ | grep -v part
lsblk -o NAME,SIZE,MODEL,SERIAL

# Set variable for new disk (replace with your actual disk ID)
export NEW_DISK="/dev/disk/by-id/nvme-NEW_DRIVE_SERIAL_HERE"

Partition the New Drive

Match the partition layout of the working drive:

# Copy partition table from working drive to new drive
# Replace OS_DISK1 with the path of your WORKING drive
export WORKING_DISK="/dev/disk/by-id/nvme-Force_MP510_1919820500012769305E"

sgdisk --replicate=$NEW_DISK $WORKING_DISK
sgdisk --randomize-guids $NEW_DISK

# Verify partitions were created
lsblk $NEW_DISK

Alternative manual partitioning:

# If sgdisk --replicate fails, partition manually
sgdisk -n "1:1m:+512m" -t "1:ef00" "$NEW_DISK"
sgdisk -n "2:0:-8m" -t "2:bf00" "$NEW_DISK"
partprobe

Replace Drive in mdraid ESP Array

Add the new ESP partition to the mdraid array:

# Add the new partition to the mdraid array
# (replace md127 with the actual md number from /proc/mdstat if it differs)
mdadm --run /dev/md127
mdadm --manage /dev/md127 --add ${NEW_DISK}-part1

# Monitor rebuild progress
watch cat /proc/mdstat

# Wait for rebuild to complete (shows [UU] when done)
# This usually takes a few minutes for a 512MB partition

Replace Drive in ZFS Pool

Replace the faulted drive in the ZFS pool:

# Replace the faulted drive in ZFS pool
# Use the FAULTED disk path from 'zpool status' output
export OLD_DISK="/dev/disk/by-id/nvme-WD_BLACK_SN770_250GB_2346FX400125"

zpool replace zroot ${OLD_DISK}-part2 ${NEW_DISK}-part2

# Monitor resilver progress
watch zpool status zroot

Resilver progress example:

  scan: resilver in progress since Sat Jan 01 12:34:56 2025
        245G scanned at 2.1G/s, 123G issued at 1.8G/s, 245G total
        123G resilvered, 50.2% done, 00:01:08 to go

Create UEFI Boot Entries for New Drive

Once resilvering is complete, add boot entries for the new drive:

# Mount EFI variables if not already mounted
mount -t efivarfs efivarfs /sys/firmware/efi/efivars 2>/dev/null || true

# Create boot entries for new disk
efibootmgr -c -d "$NEW_DISK" -p 1 \
  -L "ZBM NEW (Backup)" \
  -l '\EFI\ZBM\VMLINUZ-BACKUP.EFI'

efibootmgr -c -d "$NEW_DISK" -p 1 \
  -L "ZBM NEW" \
  -l '\EFI\ZBM\VMLINUZ.EFI'

# Verify boot entries
efibootmgr -v

Optional: Remove old boot entries for the failed drive:

# List boot entries to find old entries
efibootmgr -v

# Delete old entries by boot number (e.g., Boot0003)
# efibootmgr -b 0003 -B

Verify Replacement

After resilver completes, verify everything is working:

# Check pool status (should show ONLINE)
zpool status zroot

# Check mdraid status (should show [UU])
cat /proc/mdstat
mdadm --detail /dev/md/esp

# Verify all datasets are accessible
zfs list

# Check SMART status of new drive
smartctl -a $NEW_DISK

# Verify boot entries
efibootmgr -v

Expected pool status after successful replacement:

  pool: zroot
 state: ONLINE
  scan: resilvered 245G in 01:23:45 with 0 errors
config:

    NAME                                             STATE     READ WRITE CKSUM
    zroot                                            ONLINE       0     0     0
      mirror-0                                       ONLINE       0     0     0
        nvme-Force_MP510_1919820500012769305E-part2  ONLINE       0     0     0
        nvme-NEW_DRIVE_SERIAL_HERE-part2             ONLINE       0     0     0

Important Notes

  • Resilver time depends on pool size and data amount (typically 1-2 hours per TB)
  • Pool remains online during resilver - system can be used normally
  • Performance impact during resilver is noticeable but usually modest
  • Do not power off during resilver - let it complete
  • Test boot from the new drive to ensure its boot entries work correctly (see the example after this list)
  • Update your disk variables if you use them in scripts or documentation
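
To test booting from the new drive without changing the permanent boot order, efibootmgr can schedule a one-time BootNext entry; the boot number below is a placeholder, use the one shown for the new entry in efibootmgr -v:

# Example only: boot the new drive's ZBM entry once on the next reboot (placeholder number)
# efibootmgr -n 0005
# reboot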