ZFS Disaster Recovery: Rebuilding and Mirroring a Pool After Top-Level Vdev Error

I recently learned a hard lesson about ZFS Vdev architecture while trying to convert a single-disk pool into a mirror. By mistake, I added the new disk as a second top-level Vdev (zpool add) instead of attaching it to the existing disk as a mirror (zpool attach). Since zpool remove and zpool detach both failed on the top-level Vdev, I had to destroy the pool and restore the data from a snapshot.

This post outlines how I recovered the data and then built a proper mirror configuration.

Part I: Identifying the Error and Creating the Snapshot

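The critical error stemmed from a ZFS limitation I had overlooked: on this pool, a top-level Vdev could not be removed once it had been added. OpenZFS can remove single-disk and mirror top-level Vdevs only when the device_removal feature is enabled and the pool meets further restrictions, which was not the case here: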

root@debian:~# zpool remove zboot /dev/disk/by-id/ata-ADATA_SP900_xxx-part3
cannot remove /dev/disk/by-id/ata-ADATA_SP900_xxx-part3: operation not supported on this type of pool
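
In hindsight, a dry run would have caught the mistake. The sketch below wraps zpool add with its -n option, which prints the layout the pool would have without changing anything. The function name preview_add and the ZPOOL override variable are my own illustrative choices, not part of ZFS.

```shell
#!/bin/sh
# ZPOOL is a hypothetical override so the command can be inspected
# (e.g. ZPOOL=echo) without touching a real pool.
ZPOOL="${ZPOOL:-zpool}"

# Dry run: -n makes "zpool add" print the resulting configuration
# instead of modifying the pool.
preview_add() {
    pool="$1"
    shift
    "$ZPOOL" add -n "$pool" "$@"
}
```

For example, preview_add zboot /dev/disk/by-id/ata-ADATA_SP900_xxx-part3 shows where the new disk would land. If it appears at the same level as the existing disk rather than under a mirror entry, zpool attach is the command to use instead.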

The pool was now an unintended two-disk stripe, and there was no way to undo the change in place. The only path forward was to snapshot the data and migrate it.

Creating and Securing the Backup

I created a recursive snapshot of the pool (zboot), then piped the zfs send stream through gzip into a file.

CRITICAL: Store the backup file on a different pool or an external disk, because everything on zboot is lost when the pool is destroyed.

# Snapshot the pool recursively
zfs snapshot -r zboot@backup

# Send the recursive snapshot and compress it (Example output)
root@debian:/home/jean# zfs send -cvR zboot@backup | gzip > zboot.backup.gz
full send of zboot/BOOT/debian@backup estimated size is 271M
00:11:13 258M zboot/BOOT/debian@backup 
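
Before destroying anything, it is worth checking that the backup file is at least intact. The following is a minimal sketch, assuming the backup was written with gzip as above. The function name verify_backup is hypothetical. It only validates the gzip container, not that the ZFS stream inside can be received.

```shell
#!/bin/sh
# Check a gzip-compressed backup file for truncation or corruption.
verify_backup() {
    file="$1"
    # gzip -t decompresses without writing output and fails on
    # truncated or damaged data.
    if gzip -t "$file" 2>/dev/null; then
        # cksum prints a CRC and byte count that can be compared
        # again after copying the file elsewhere.
        cksum "$file"
    else
        echo "backup failed gzip integrity test: $file" >&2
        return 1
    fi
}
```

Running verify_backup zboot.backup.gz should print a checksum line. A non-zero exit status means the file should not be trusted as the only copy of the data.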

Part II: Pool Destruction, Rebuilding, and Restoration

Pool Recreation

After confirming the backup, I destroyed the misconfigured pool (zpool destroy zboot) and immediately rebuilt it with options suited to a boot pool that GRUB must be able to read.

# Example ZFS Pool Creation (Note: I avoided compression here due to GRUB compatibility issues)
root@debian:/home/jean# zpool create \
-o ashift=12 \
-o autotrim=on \
-o compatibility=grub2 \
-O xattr=sa \
-O compression=off \
-O normalization=formD \
-O canmount=off \
-O mountpoint=/boot \
zboot /dev/disk/by-id/ata-ADATA_SP600_xxx-part3
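
A note on ashift=12: the value is the base-2 logarithm of the sector size the pool assumes, so 12 corresponds to 4096-byte sectors. The tiny helper below (the name ashift_to_bytes is mine, for illustration only) does the conversion.

```shell
#!/bin/sh
# Convert an ashift value to the sector size in bytes (2^ashift).
ashift_to_bytes() {
    echo $((1 << $1))
}

ashift_to_bytes 12   # prints 4096
ashift_to_bytes 9    # prints 512
```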

Data Restoration

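The data is restored by decompressing the stream and piping it into the new pool. The -d flag matters here: it makes zfs recv drop the first element of each sent dataset name (the source pool name) and recreate the rest of the path under the target pool (zboot). The -F flag lets the stream overwrite the root dataset that zpool create already made.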

root@debian:/home/jean# gzip -dc zboot.backup.gz | zfs recv -v -d -F -o compression=off zboot
receiving full stream of zboot/BOOT/debian@backup into zboot/BOOT/debian@backup
received 272M stream in 14.81 seconds (18.3M/sec)
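
To make the -d name mapping concrete, here is a small sketch that mimics it with plain string handling. It is my own approximation of the documented behaviour, not ZFS code, and the function name recv_d_target is hypothetical.

```shell
#!/bin/sh
# Approximate how "zfs recv -d TARGET" names the received snapshot:
# drop the first path element (the source pool) and append the rest
# to TARGET.
recv_d_target() {
    target="$1"
    sent="$2"
    fs="${sent%@*}"      # dataset part, e.g. zboot/BOOT/debian
    snap="${sent#*@}"    # snapshot name, e.g. backup
    case "$fs" in
        */*)
            rest="${fs#*/}"   # everything after the pool name
            printf '%s/%s@%s\n' "$target" "$rest" "$snap"
            ;;
        *)
            # The pool's root dataset maps onto the target itself.
            printf '%s@%s\n' "$target" "$snap"
            ;;
    esac
}

recv_d_target zboot zboot/BOOT/debian@backup   # prints zboot/BOOT/debian@backup
```

Because the source and target pools share the name zboot in my case, the received path looks unchanged. With a differently named target, such as a hypothetical pool called tank, the same stream would land at tank/BOOT/debian@backup.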

Part III: Converting to Mirror and Final Hardening

Attaching the Mirror Disk

The new pool is online with a single disk. I now add the second disk with zpool attach, which takes the pool name, then the device already in the pool, then the new device.

root@debian:/home/jean# zpool attach zboot /dev/disk/by-id/ata-ADATA_SP600_xxx-part3 /dev/disk/by-id/ata-ADATA_SP900_xxx-part3

A quick zpool status confirms the second disk is now part of the mirror and the resilver (data synchronization) is complete.

pool: zboot
state: ONLINE
scan: resilvered 274M in 00:00:04 with 0 errors on Tue Jun 11 00:34:00 2024
config:
NAME                                    STATE     READ WRITE CKSUM
zboot                                   ONLINE       0     0     0
  mirror-0                              ONLINE       0     0     0
    ata-ADATA_SP600_xxx-part3           ONLINE       0     0     0
    ata-ADATA_SP900_xxx-part3           ONLINE       0     0     0
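
On a larger pool the resilver takes longer, and it is useful to check its state from a script. The sketch below classifies the scan line of zpool status output read from standard input. The function name resilver_state and the three result words are my own choices, and the matching relies on the wording zpool status printed on my system.

```shell
#!/bin/sh
# Read "zpool status" text on stdin and report the resilver state:
# running, done, or none.
resilver_state() {
    status="$(cat)"
    case "$status" in
        *"resilver in progress"*) echo running ;;
        *"resilvered "*)          echo done ;;
        *)                        echo none ;;
    esac
}
```

Usage would be zpool status zboot | resilver_state. For the output shown above, it reports done.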

Fixing Device Naming (Persistence)

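A common pitfall is a pool that references disks by their /dev/sdX names, which the kernel can reassign between boots, instead of the persistent /dev/disk/by-id/ paths. If the letters shift, the pool may fail to import at boot.

If a mirror member is listed under such a name (e.g., /dev/sdd5), one fix is to detach it and re-attach it using the correct, persistent ID. Keep in mind that the pool runs without redundancy until the resilver finishes: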

root@debian:/etc/zfs# zpool detach zroot /dev/sdd5
root@debian:/etc/zfs# zpool attach zroot /dev/disk/by-id/ata-ADATA_SP900_xxx-part5 /dev/disk/by-id/ata-ADATA_SP600_xxx-part5

A resilver will start. Once it completes, the pool references the disk by its by-id path, which stays stable across reboots.
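
To spot such names quickly, the status output can be filtered for devices that look like kernel-assigned names. This is a rough sketch: the function name find_ephemeral_devs is hypothetical, and the pattern only covers sdX and nvmeXnY style names, which is an assumption about what appears on a typical Linux system.

```shell
#!/bin/sh
# Read "zpool status" text on stdin and print device entries that use
# kernel-assigned names (sdX, nvmeXnY) rather than persistent IDs.
find_ephemeral_devs() {
    awk '$1 ~ /^(\/dev\/)?(sd[a-z]+[0-9]*|nvme[0-9]+n[0-9]+(p[0-9]+)?)$/ { print $1 }'
}
```

Usage would be zpool status zroot | find_ephemeral_devs. An empty result suggests every member is referenced by a persistent name.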

