Prerequisites
Ensure your system meets these requirements before starting the installation:
- System uses UEFI to boot
- System is x86_64 architecture
- Two SSDs (SATA or NVMe) of similar size for OS and containers (mirrored fast pool)
- Network access for package installation
Live Environment Setup
Boot from Ubuntu Live USB and prepare the environment for installation.
Download and Boot
Download the latest Ubuntu Desktop Noble (24.04) Live image, write it to a USB drive and boot your system in EFI mode.
Optional: Enable Remote Access (Live Environment)
If you prefer to perform the installation remotely from another workstation, you can enable SSH access in the live environment:
# Set a password for the ubuntu user
passwd ubuntu
# Install OpenSSH server and tmux (tmux keeps your session alive if the SSH connection drops)
sudo apt update
sudo apt install --yes openssh-server tmux
# Find your IP address
ip addr show
# You can now connect from another machine:
# ssh ubuntu@<ip-address>
Disable Automount
Disable automatic mounting to prevent the desktop environment from interfering with disk operations:
gsettings set org.gnome.desktop.media-handling automount false
Open Root Shell
sudo -i
Install Required Tools
apt update
apt install --yes debootstrap gdisk zfsutils-linux
systemctl stop zed
Generate Host ID
zgenhostid -f
Disk Preparation
Identify disks using persistent paths and wipe all existing data to prepare for ZFS installation.
Identify Disks by ID
First, list all available disks by their persistent IDs:
# List all disk IDs
ls -la /dev/disk/by-id/ | grep -v part
# Show disk model and serial information
lsblk -o NAME,SIZE,MODEL,SERIAL
IMPORTANT: Always use /dev/disk/by-id/* paths for disk references. These paths are persistent across reboots,
unlike /dev/sdX which can change.
Set Disk Variables
Set your disk variables using the full /dev/disk/by-id/ paths found above:
# Set OS disk variables
# Replace these with YOUR actual disk IDs from the ls command above
export OS_DISK1="/dev/disk/by-id/nvme-Force_MP510_1919820500012769305E"
export OS_DISK2="/dev/disk/by-id/nvme-WD_BLACK_SN770_250GB_2346FX400125"
Note: The example paths above are placeholders. You must use the actual paths from your system.
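Before running any of the destructive commands below, it is worth a quick sanity check that both variables resolve to the intended whole disks rather than partitions (a minimal sketch, assuming the variables are exported as above):
# Confirm the symlinks exist and point to whole disks
ls -l $OS_DISK1 $OS_DISK2
# Show size, model and serial of the resolved devices
lsblk -d -o NAME,SIZE,MODEL,SERIAL $(readlink -f $OS_DISK1) $(readlink -f $OS_DISK2)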
Clear Disks
IMPORTANT: This will destroy all data on the selected disks. Verify your disk variables before proceeding.
# Clear any existing ZFS labels
zpool labelclear -f $OS_DISK1 2>/dev/null || true
zpool labelclear -f $OS_DISK2 2>/dev/null || true
# Stop any existing mdraid arrays
umount /boot/efi 2>/dev/null || true
mdadm --stop /dev/md127 2>/dev/null || true
mdadm --stop /dev/md/esp 2>/dev/null || true
# Clear mdraid metadata from partitions (if they exist)
mdadm --zero-superblock --force ${OS_DISK1}-part1 2>/dev/null || true
mdadm --zero-superblock --force ${OS_DISK2}-part1 2>/dev/null || true
# Wipe filesystem signatures
wipefs -a $OS_DISK1
wipefs -a $OS_DISK2
# TRIM/discard entire disk (SSD optimization)
# Note: blkdiscard may fail on NVMe drives in live environment, this is normal
blkdiscard -f $OS_DISK1 2>/dev/null || true
blkdiscard -f $OS_DISK2 2>/dev/null || true
# Zap GPT and MBR partition tables
sgdisk --zap-all $OS_DISK1
sgdisk --zap-all $OS_DISK2
# Verify disks are clean
lsblk -o NAME,SIZE,FSTYPE,LABEL $OS_DISK1 $OS_DISK2
Redundant ESP with mdraid
Create mirrored EFI System Partitions for boot redundancy using mdraid.
Create EFI System Partitions
Create 512MB EFI System Partitions on both SSDs:
OS_DISKS="$OS_DISK1 $OS_DISK2"
for disk in $OS_DISKS; do
sgdisk -n "1:1m:+512m" -t "1:ef00" "$disk"
done
Create mdraid Array for ESP
mdadm --create --verbose --level 1 --metadata 1.0 --homehost any --raid-devices 2 /dev/md/esp \
${OS_DISK1}-part1 ${OS_DISK2}-part1
mdadm --assemble --scan
mdadm --detail --scan >> /etc/mdadm.conf
Note: Metadata version 1.0 is used so that mdraid metadata is written to the end of each partition, allowing firmware to recognize each component as a valid EFI system partition.
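If you want to confirm that the metadata really sits at the end of the partition, you can examine one of the array members; the exact output varies with the mdadm version, but it should report version 1.0 and a superblock offset near the end of the device:
# Check the metadata version and superblock location on one member
mdadm --examine ${OS_DISK1}-part1 | grep -E 'Version|Super Offset'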
Create ZFS Partitions on SSDs
# Create ZFS partitions on SSDs (remaining space after ESP)
for disk in $OS_DISKS; do
sgdisk -n "2:0:-8m" -t "2:bf00" "$disk"
done
partprobe
Storage Optimizations
Configure optimal I/O alignment for modern SSDs.
Understanding ashift
Modern drives (SSDs and HDDs manufactured after 2011) use 4096-byte (4K) physical sectors. ZFS uses the ashift
parameter to align I/O operations with the disk’s physical sector size for optimal performance.
This guide uses ashift=12 (4096 bytes) for all pools, which is optimal for modern storage devices.
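If you want to see what the drives themselves report before committing to ashift=12, you can query their sector sizes (many SSDs report 512-byte logical sectors for compatibility even though they are 4K or larger internally, which is why forcing ashift=12 is the safe choice):
# Report logical and physical sector sizes for both OS disks
lsblk -d -o NAME,LOG-SEC,PHY-SEC $(readlink -f $OS_DISK1) $(readlink -f $OS_DISK2)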
ZFS OS Pool Creation
Create an encrypted, mirrored ZFS pool for the operating system with optimized properties.
Prepare Encryption Key
Create a key file for pool encryption (replace 'password' with your own strong passphrase):
echo 'password' > /etc/zfs/zroot.key
chmod 000 /etc/zfs/zroot.key
Create OS Pool
zpool create -f \
-m none \
-O acltype=posixacl \
-o ashift=12 \
-O atime=off \
-o autotrim=on \
-o cachefile=none \
-O canmount=off \
-o compatibility=openzfs-2.2-linux \
-O compression=zstd \
-O dnodesize=auto \
-O encryption=aes-256-gcm \
-O keyformat=passphrase \
-O keylocation=file:///etc/zfs/zroot.key \
-O normalization=formD \
-O recordsize=16K \
-O relatime=off \
-O xattr=sa \
zroot mirror ${OS_DISK1}-part2 ${OS_DISK2}-part2
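A quick, optional check that the pool came up with the intended layout and properties (mirror vdev, ashift=12, encryption and zstd compression):
# Show the pool layout
zpool status zroot
# Confirm key pool and dataset properties
zpool get ashift,autotrim zroot
zfs get encryption,compression,recordsize zroot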
ZFS Filesystem Creation
Create only the essential filesystems needed to boot the system.
Root Filesystem Setup
Create the root container and boot environment:
zfs create -o canmount=off -o mountpoint=none zroot/ROOT
zfs create -o canmount=noauto -o mountpoint=/ zroot/ROOT/ubuntu
Keystore Setup
Create a dedicated dataset for encryption keys:
zfs create -o mountpoint=/etc/zfs/keys zroot/keystore
System Directories
Create optimized datasets for system directories:
zfs create -o mountpoint=/var zroot/ROOT/ubuntu/var
zfs create -o mountpoint=/var/cache -o recordsize=128K -o sync=disabled \
zroot/ROOT/ubuntu/var/cache
zfs create -o mountpoint=/var/lib -o recordsize=8K zroot/ROOT/ubuntu/var/lib
zfs create -o mountpoint=/var/log -o recordsize=128K -o logbias=throughput \
zroot/ROOT/ubuntu/var/log
zfs create -o mountpoint=/tmp -o recordsize=32K -o compression=lz4 -o devices=off -o exec=off \
-o setuid=off -o sync=disabled zroot/ROOT/ubuntu/tmp
zfs create -o mountpoint=/var/tmp -o recordsize=32K -o compression=lz4 -o devices=off -o exec=off \
-o setuid=off -o sync=disabled zroot/ROOT/ubuntu/var/tmp
User Data Setup
Create datasets for user home directories:
# User data container
zfs create -o mountpoint=/home -o recordsize=128K zroot/USERDATA
zfs create -o mountpoint=/root zroot/USERDATA/root
zfs create zroot/USERDATA/marc
Finalize and Mount
Configure ZFSBootMenu properties and mount all datasets:
# Set bootfs property for ZBM
zpool set bootfs=zroot/ROOT/ubuntu zroot
zfs set keylocation=file:///etc/zfs/keys/zroot.key zroot
zfs set org.zfsbootmenu:keysource=zroot/keystore zroot
# Export & reimport with different mount path
zpool export zroot
zpool import -N -R /mnt zroot
zfs load-key -L prompt zroot
# Mount datasets in correct order
zfs mount zroot/ROOT/ubuntu
zfs mount zroot/keystore
zfs mount -a
# Update device symlinks
udevadm trigger
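Before installing anything, confirm that every dataset is mounted where you expect it, i.e. under /mnt:
# List mounted ZFS datasets (all mountpoints should be under /mnt)
zfs mount
# Alternative view of the mount tree
findmnt -R /mnt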
ZFS Dataset Properties Overview
| Dataset | canmount | mountpoint | recordsize | compression | Additional Properties |
|---|---|---|---|---|---|
| zroot/ROOT | off | none | (inherited) | (inherited) | - |
| zroot/ROOT/ubuntu | noauto | / | (inherited) | (inherited) | - |
| zroot/keystore | (inherited) | /etc/zfs/keys | (inherited) | (inherited) | readonly=on |
| zroot/ROOT/ubuntu/var | (inherited) | /var | (inherited) | (inherited) | - |
| zroot/ROOT/ubuntu/var/cache | (inherited) | /var/cache | 128K | (inherited) | sync=disabled |
| zroot/ROOT/ubuntu/var/lib | (inherited) | /var/lib | 8K | (inherited) | - |
| zroot/ROOT/ubuntu/var/log | (inherited) | /var/log | 128K | (inherited) | logbias=throughput |
| zroot/ROOT/ubuntu/tmp | (inherited) | /tmp | 32K | lz4 | devices=off, exec=off, setuid=off, sync=disabled |
| zroot/ROOT/ubuntu/var/tmp | (inherited) | /var/tmp | 32K | lz4 | devices=off, exec=off, setuid=off, sync=disabled |
| zroot/USERDATA | (inherited) | /home | 128K | (inherited) | - |
| zroot/USERDATA/root | (inherited) | /root | (inherited) | (inherited) | - |
| zroot/USERDATA/marc | (inherited) | /home/marc | (inherited) | (inherited) | - |
Ubuntu Installation
Install Ubuntu base system using debootstrap and configure essential system settings.
Install Base System
debootstrap noble /mnt
Copy Configuration Files
cp /etc/hostid /mnt/etc/hostid
cp /etc/mdadm.conf /mnt/etc/
cp /etc/resolv.conf /mnt/etc/resolv.conf
cp /etc/zfs/zroot.key /mnt/etc/zfs/keys/zroot.key
Chroot into New System
mount -t proc proc /mnt/proc
mount -t sysfs sys /mnt/sys
mount -B /dev /mnt/dev
mount -t devpts pts /mnt/dev/pts
chroot /mnt /bin/bash
Basic Ubuntu Configuration
Set hostname:
echo 'bender' > /etc/hostname
echo -e '127.0.1.1\tbender' >> /etc/hosts
Set root password:
passwd
Configure APT sources:
cat <<EOF > /etc/apt/sources.list
deb http://archive.ubuntu.com/ubuntu/ noble main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ noble-updates main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ noble-security main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ noble-backports main restricted universe multiverse
EOF
Update system and install base packages:
apt update
apt upgrade --yes
apt install --yes --no-install-recommends \
console-setup \
keyboard-configuration \
linux-generic-hwe-24.04 \
locales
Configure locales and timezone:
dpkg-reconfigure locales tzdata keyboard-configuration console-setup
Even if you prefer a non-English system language, always ensure that en_US.UTF-8 is available.
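If en_US.UTF-8 was not selected during dpkg-reconfigure, it can be checked and generated manually:
# Check which locales are currently available
locale -a | grep -i en_US
# Generate en_US.UTF-8 if it is missing
locale-gen en_US.UTF-8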
Create User Account
Create a regular user account with sudo privileges:
# Create user (replace 'marc' with your desired username throughout)
adduser marc
cp -a /etc/skel/. /home/marc
chown -R marc:marc /home/marc
# Add user to supplementary groups
usermod -aG adm,plugdev,sudo marc
Note: You will be prompted to set a password and user information during adduser. Ignore any warnings about the home directory already existing; it is the ZFS dataset created earlier.
Configure Network
cat > /etc/netplan/01-enp4s0.yaml << 'EOF'
network:
version: 2
renderer: networkd
ethernets:
enp4s0:
dhcp4: true
dhcp6: true
accept-ra: true
ipv6-privacy: false
EOF
chmod 600 /etc/netplan/01-enp4s0.yaml
netplan apply
ZFS Configuration
Install ZFS packages, enable required services, and configure encryption settings.
Install ZFS and Essential System Packages
apt install --yes dosfstools mdadm zfs-initramfs zfsutils-linux
Enable ZFS Services
systemctl enable zfs.target
systemctl enable zfs-mount
systemctl enable zfs-import.target
Secure the Keystore
Set the keystore to read-only for additional security:
zfs set readonly=on zroot/keystore
Configure initramfs for Encryption
echo "UMASK=0077" > /etc/initramfs-tools/conf.d/umask.conf
Rebuild initramfs
update-initramfs -c -k all
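The UMASK setting matters because the zfs-initramfs hooks may copy the key file into the generated image. After rebuilding, you can verify that the images are readable by root only and check whether the key was embedded (a sketch; whether the key appears depends on your zfs-initramfs version):
# Initramfs images should be readable by root only (mode 0600)
ls -l /boot/initrd.img-*
# Check whether the key file was embedded in the image(s)
lsinitramfs /boot/initrd.img-* | grep zroot.key || echo "key not embedded"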
Set the kernel command line that ZFSBootMenu passes to the boot environment:
zfs set org.zfsbootmenu:commandline="quiet" zroot/ROOT
ZFSBootMenu Setup
Configure the EFI System Partition, install ZFSBootMenu, and create redundant UEFI boot entries.
Format and Mount ESP
mkfs.vfat -F32 -nBOOT /dev/md/esp
mkdir -p /boot/efi
mount -t vfat /dev/md/esp /boot/efi/
Add ESP to fstab
cat << EOF >> /etc/fstab
$( blkid | grep BOOT | cut -d ' ' -f 4 ) /boot/efi vfat defaults 0 0
EOF
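The blkid pipeline above depends on the field order of blkid's output, so confirm that a real UUID ended up in fstab before relying on it:
# The line should contain a UUID= entry for the ESP, not an empty field
grep /boot/efi /etc/fstab
# Optionally let findmnt sanity-check the whole fstab
findmnt --verify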
Install ZFSBootMenu
Install curl and download ZFSBootMenu EFI image:
apt install --yes curl
mkdir -p /boot/efi/EFI/ZBM
curl -o /boot/efi/EFI/ZBM/VMLINUZ.EFI -L https://get.zfsbootmenu.org/efi
cp /boot/efi/EFI/ZBM/VMLINUZ.EFI /boot/efi/EFI/ZBM/VMLINUZ-BACKUP.EFI
Create UEFI Boot Entries
Mount EFI variables and install efibootmgr:
mount -t efivarfs efivarfs /sys/firmware/efi/efivars
apt install --yes efibootmgr
Create boot entries for both disks for redundancy. The entries are added in reverse order because efibootmgr places each new entry at the front of the boot order, leaving "ZBM 1" as the default:
# Disk 2 boot entries
efibootmgr -c -d "$OS_DISK2" -p 1 -L "ZBM 2 (Backup)" -l '\EFI\ZBM\VMLINUZ-BACKUP.EFI'
efibootmgr -c -d "$OS_DISK2" -p 1 -L "ZBM 2" -l '\EFI\ZBM\VMLINUZ.EFI'
# Disk 1 boot entries
efibootmgr -c -d "$OS_DISK1" -p 1 -L "ZBM 1 (Backup)" -l '\EFI\ZBM\VMLINUZ-BACKUP.EFI'
efibootmgr -c -d "$OS_DISK1" -p 1 -L "ZBM 1" -l '\EFI\ZBM\VMLINUZ.EFI'
Boot entry redundancy:
- Each disk has two boot entries (primary and backup)
- If one disk fails, system can boot from the other
- If primary boot image fails, backup provides recovery option
Verify boot entries:
efibootmgr -v
Final Steps
Clean up the installation environment, reboot into the new system, and verify everything works correctly.
Exit Chroot and Cleanup
# Exit the chroot environment
exit
# Unmount all filesystems
umount -n -R /mnt
# Export the pool
zpool export zroot
Reboot into New System
Remove the installation media and reboot:
reboot
First boot process:
- UEFI firmware loads ZFSBootMenu
- ZFSBootMenu prompts for encryption password
- ZFSBootMenu displays available boot environments
- Select zroot/ROOT/ubuntu to boot
Post-Installation Verification
After first boot, verify the system:
# Check ZFS pool status
zpool status
# Verify all datasets are mounted
zfs list
# Check mdraid status for ESP
cat /proc/mdstat
# Verify network connectivity
ip addr show
ping -c3 google.com
# Check system services
systemctl status zfs-mount
Create Post-Installation Snapshot
Before creating snapshots, prevent GRUB packages from being installed when upgrading the kernel:
# Prevent GRUB updates that could interfere with ZFSBootMenu
tee /etc/apt/preferences.d/no-grub << 'EOF'
Package: grub* grub2*
Pin: release *
Pin-Priority: -1
EOF
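To confirm the pin is active, ask apt for its view of the GRUB packages; with a negative pin priority the candidate version should be reported as "(none)":
# Candidate should show (none) for pinned GRUB packages
apt-cache policy grub-efi-amd64 grub-pc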
Create a snapshot of the fresh installation for easy rollback:
# Snapshot essential system datasets
zfs snapshot zroot/ROOT/ubuntu@fresh-install
zfs snapshot zroot/ROOT/ubuntu/var@fresh-install
zfs snapshot zroot/ROOT/ubuntu/var/lib@fresh-install
# Verify snapshots
zfs list -t snapshot
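If a later change goes wrong, these snapshots give you a way back. A minimal rollback sketch (rollback discards everything written to the dataset after the snapshot; -r also destroys any newer snapshots):
# Roll the root dataset back to the fresh-install state
zfs rollback -r zroot/ROOT/ubuntu@fresh-install
# For the root filesystem itself, prefer doing this from ZFSBootMenu's
# recovery tools or a live environment rather than the running system.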
Appendix 1: Recovery Mode
If you need to recover your system or make changes from a live environment, use these commands to mount and chroot back into your installed system.
When to Use Recovery Mode
- System won’t boot but pools are intact
- Need to fix boot configuration
- Want to update ZFSBootMenu
- Need to recover from failed updates
- Troubleshooting boot issues
Recovery Steps
Boot from the Ubuntu Live USB and open a root shell:
sudo -i
Install ZFS utilities if needed:
apt update
apt install --yes zfsutils-linux
Mount the ESP and import ZFS pools:
# Find the md device for ESP (it might have a different number)
cat /proc/mdstat
# Look for the md device with the ESP partitions
# If the array is not running, start it first
# Replace md127 with the actual md number from /proc/mdstat
mdadm --run /dev/md127
# Mount the ESP
mkdir -p /mnt/boot/efi/
mount -t vfat /dev/md/esp /mnt/boot/efi/
# Export any auto-imported pools
zpool export -a
# Import the zroot pool
zpool import -f -N -R /mnt zroot
# Load encryption keys
zfs load-key -L prompt zroot
# Mount filesystems
zfs mount zroot/ROOT/ubuntu
zfs mount zroot/keystore
zfs mount -a
# Mount proc, sys, dev for chroot
mount -t proc proc /mnt/proc
mount -t sysfs sys /mnt/sys
mount -B /dev /mnt/dev
mount -t devpts pts /mnt/dev/pts
# Enter the chroot environment
chroot /mnt /bin/bash
Exit Recovery Mode
When finished, clean up and reboot:
# Exit chroot
exit
# Unmount everything
umount -n -R /mnt
# Export pool
zpool export zroot
# Reboot
reboot
Appendix 2: Replacing a Faulted Drive
This guide covers how to replace a failed drive in your mirrored ZFS pool and mdraid ESP array.
When to Replace a Drive
Replace a drive when:
- ZFS pool shows DEGRADED status
- A drive shows as FAULTED or UNAVAIL in zpool status
- SMART errors indicate imminent drive failure
- The mdraid array shows a failed component
Check Pool and Array Status
First, identify which drive has failed:
# Check ZFS pool status
zpool status zroot
# Check mdraid array status
cat /proc/mdstat
# If the array is not running, start it first
# Replace md127 with the actual md number from /proc/mdstat
mdadm --run /dev/md127
mdadm --detail /dev/md/esp
Example output showing a faulted drive:
pool: zroot
state: DEGRADED
status: One or more devices could not be used.
scan: resilvered 245G in 01:23:45 with 0 errors
config:
NAME STATE READ WRITE CKSUM
zroot DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
nvme-Force_MP510_1919820500012769305E-part2 ONLINE 0 0 0
nvme-WD_BLACK_SN770_250GB_2346FX400125-part2 FAULTED 0 0 0
Identify the Failed Physical Drive
Find the disk-by-id path for reference:
# List disk IDs to identify replacement
ls -la /dev/disk/by-id/ | grep -v part
# Note the disk path of the failed drive
# Example: /dev/disk/by-id/nvme-WD_BLACK_SN770_250GB_2346FX400125
Physical Drive Replacement
- Shut down the system:
shutdown -h now
- Replace the physical drive with a new one of equal or larger size
- Boot the system
- Identify the new drive:
# List available disks to find the new drive ID
ls -la /dev/disk/by-id/ | grep -v part
lsblk -o NAME,SIZE,MODEL,SERIAL
# Set variable for new disk (replace with your actual disk ID)
export NEW_DISK="/dev/disk/by-id/nvme-NEW_DRIVE_SERIAL_HERE"
Partition the New Drive
Match the partition layout of the working drive:
# Copy the partition table from the working drive to the new drive
# Set WORKING_DISK to the by-id path of your WORKING drive (the example below is a placeholder)
export WORKING_DISK="/dev/disk/by-id/nvme-Force_MP510_1919820500012769305E"
sgdisk --replicate=$NEW_DISK $WORKING_DISK
sgdisk --randomize-guids $NEW_DISK
# Verify partitions were created
lsblk $NEW_DISK
Alternative manual partitioning:
# If sgdisk --replicate fails, partition manually
sgdisk -n "1:1m:+512m" -t "1:ef00" "$NEW_DISK"
sgdisk -n "2:0:-8m" -t "2:bf00" "$NEW_DISK"
partprobe
Replace Drive in mdraid ESP Array
Add the new ESP partition to the mdraid array:
# Add new partition to mdraid array
mdadm --run /dev/md127
mdadm --manage /dev/md127 --add ${NEW_DISK}-part1
# Monitor rebuild progress
watch cat /proc/mdstat
# Wait for rebuild to complete (shows [UU] when done)
# This usually takes a few minutes for a 512MB partition
Replace Drive in ZFS Pool
Replace the faulted drive in the ZFS pool:
# Replace the faulted drive in ZFS pool
# Use the FAULTED disk path from 'zpool status' output
export OLD_DISK="/dev/disk/by-id/nvme-WD_BLACK_SN770_250GB_2346FX400125"
zpool replace zroot ${OLD_DISK}-part2 ${NEW_DISK}-part2
# Monitor resilver progress
watch zpool status zroot
Resilver progress example:
scan: resilver in progress since Sat Jan 01 12:34:56 2025
245G scanned at 2.1G/s, 123G issued at 1.8G/s, 245G total
123G resilvered, 50.2% done, 00:01:08 to go
Create UEFI Boot Entries for New Drive
Once resilvering is complete, add boot entries for the new drive:
# Mount EFI variables if not already mounted
mount -t efivarfs efivarfs /sys/firmware/efi/efivars 2>/dev/null || true
# Create boot entries for new disk
efibootmgr -c -d "$NEW_DISK" -p 1 \
-L "ZBM NEW (Backup)" \
-l '\EFI\ZBM\VMLINUZ-BACKUP.EFI'
efibootmgr -c -d "$NEW_DISK" -p 1 \
-L "ZBM NEW" \
-l '\EFI\ZBM\VMLINUZ.EFI'
# Verify boot entries
efibootmgr -v
Optional: Remove old boot entries for the failed drive:
# List boot entries to find old entries
efibootmgr -v
# Delete old entries by boot number (e.g., Boot0003)
# efibootmgr -b 0003 -B
Verify Replacement
After resilver completes, verify everything is working:
# Check pool status (should show ONLINE)
zpool status zroot
# Check mdraid status (should show [UU])
cat /proc/mdstat
mdadm --detail /dev/md/esp
# Verify all datasets are accessible
zfs list
# Check SMART status of new drive
smartctl -a $NEW_DISK
# Verify boot entries
efibootmgr -v
Expected pool status after successful replacement:
pool: zroot
state: ONLINE
scan: resilvered 245G in 01:23:45 with 0 errors
config:
NAME STATE READ WRITE CKSUM
zroot ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
nvme-Force_MP510_1919820500012769305E-part2 ONLINE 0 0 0
nvme-NEW_DRIVE_SERIAL_HERE-part2 ONLINE 0 0 0
Important Notes
- Resilver time depends on pool size and data amount (typically 1-2 hours per TB)
- Pool remains online during resilver - system can be used normally
- Expect a modest but noticeable performance impact while the resilver runs
- Do not power off during resilver - let it complete
- Test booting from the new drive to ensure its boot entries work correctly (see the sketch below)
- Update your disk variables if you use them in scripts or documentation
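One way to test booting from the new drive without changing the permanent boot order is to set BootNext for a single boot. The boot number below is a placeholder; use the number shown for the new entry in efibootmgr -v:
# Boot the selected entry exactly once on the next reboot
efibootmgr --bootnext 0003
reboot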