Aug 9, 2025 · 14 min read

How I Built a 40TB ZFS NAS and Finally Got Backups Right

Building a dedicated NAS with ZFS RAIDZ2, Proxmox Backup Server, and a proper 3-2-1 backup strategy. Because RAID is not a backup, and I learned that the hard way.

ZFS · NAS · Backup · Proxmox · Linux · Homelab · Storage

Let me tell you the story of how I lost a weekend project that I'd spent 30 hours on.

It was a Saturday morning. I was working on a Terraform module for Kumari.ai's staging environment, and I accidentally ran `terraform destroy` on the wrong workspace. The state file was gone, the infrastructure was torn down, and the code only existed on the Proxmox VM I was working in. A VM that I had never backed up, because I kept telling myself I'd "set up backups this weekend."

I recovered most of the code from my shell history and editor swap files. But "most" isn't "all," and I spent the next two days rewriting what I'd lost from memory.

That was the last time I lost data. The next week, I built a NAS and implemented a real backup strategy.

The Hardware

I wanted a dedicated machine for storage — not another VM on the Proxmox cluster, and not a USB drive hanging off one of the OptiPlexes. A dedicated NAS means:

  • Separate failure domain from compute (a Proxmox crash doesn't take backups with it)
  • Proper storage controllers and ECC RAM
  • Always-on, low-power operation

I found a Dell OptiPlex 7040 Tower on eBay for $85. The tower form factor (not SFF) is important because it has room for additional drives:

```text
Machine:    Dell OptiPlex 7040 Tower
CPU:        Intel Xeon E3-1230 v3 (4C/8T, 3.3GHz)
RAM:        32GB DDR3 ECC (upgraded from 8GB for $25)
Boot drive: 256GB SATA SSD (came with the machine)
OS:         Debian 12 (Bookworm)
HBA:        Dell H310 IT mode (flashed) — $25 on eBay
Cost:       $85 machine + $25 RAM + $25 HBA = $135
```
> [!IMPORTANT]
> ECC RAM is non-negotiable for a ZFS NAS. ZFS keeps a lot of data in the ARC (Adaptive Replacement Cache) in RAM. If a bit flip corrupts data in the ARC, ZFS might write that corruption to disk and checksum it as valid. ECC memory prevents this. I specifically chose a Xeon E3 + C226 chipset combination because it supports ECC, unlike the consumer i5/i7 counterparts in the same Dell chassis.
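ECC only helps if it's actually enabled, and it's easy to end up with ECC DIMMs silently running in non-ECC mode on the wrong board. A quick sanity check I'd suggest (a sketch; assumes `dmidecode` and the `edac-utils` package are available):

```shell
# Confirm the memory controller is actually running in ECC mode,
# not just that ECC DIMMs are installed
dmidecode -t memory | grep -i 'error correction'
# should report something like: Error Correction Type: Single-bit ECC

# EDAC exposes per-controller corrected/uncorrected error counters;
# a visible mc0 entry means the kernel is tracking ECC events
apt install edac-utils
edac-util -v
```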

The Drives

I went with 6x 8TB WD Red Plus (WD80EFPX) drives:

```text
6x WD Red Plus 8TB (WD80EFPX) — $120 each = $720
   CMR (not SMR — critical for ZFS)
   5640 RPM, 256MB cache
   Rated for 24/7 NAS use
   3-year warranty
```
> [!WARNING]
> Do NOT buy SMR (Shingled Magnetic Recording) drives for a ZFS pool. WD infamously shipped SMR drives under the "Red" branding (the 2-6TB EFAX models) without disclosing it. SMR drives have horrific random write performance that causes ZFS resilver times to balloon from hours to days. Always verify you're getting CMR drives. The WD Red Plus line is always CMR. Check the model number carefully.

Total NAS cost: $135 (machine) + $720 (drives) = $855

ZFS: Why Not Hardware RAID?

I've used hardware RAID controllers before (the PERC H710P in my R720, for example). They work fine for Proxmox with Ceph. But for a NAS, ZFS is in a completely different league:

  • End-to-end checksumming: ZFS checksums every block and verifies on read. If a bit rots on disk, ZFS detects it and self-heals from the redundancy.
  • Copy-on-write: Data is never overwritten in place. This means snapshots are instant and free.
  • No write hole: RAIDZ doesn't suffer from the RAID-5 write hole problem.
  • Compression: Transparent lz4 compression saves space with near-zero CPU overhead.
  • Send/receive: Incremental replication of datasets — critical for off-site backups.

The Dell H310 HBA (flashed to IT mode) presents the drives directly to the OS without any RAID layer, which is exactly what ZFS needs.

Creating the Pool

```bash
# Identify the drives by serial number (not /dev/sdX — those can change)
root@nas:~# ls -la /dev/disk/by-id/ | grep -v part
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX01 -> ../../sda
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX02 -> ../../sdb
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03 -> ../../sdc
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX04 -> ../../sdd
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX05 -> ../../sde
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX06 -> ../../sdf

# Create RAIDZ2 pool (can lose any 2 drives)
zpool create -f \
  -o ashift=12 \
  -O compression=lz4 \
  -O atime=off \
  -O xattr=sa \
  -O acltype=posixacl \
  -O mountpoint=/tank \
  tank raidz2 \
  /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX01 \
  /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX02 \
  /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03 \
  /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX04 \
  /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX05 \
  /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX06
```
> [!TIP]
> Always use `/dev/disk/by-id/` paths for ZFS. The `/dev/sdX` names can change between reboots depending on the order drives are detected. If the pool config references `/dev/sda` but that drive becomes `/dev/sdc` after a reboot, ZFS can get confused. The by-id paths are stable.
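If you already created a pool against `/dev/sdX` names, you don't have to rebuild it. Exporting and re-importing by id rewires the labels in place. A sketch, using the `tank` pool from above:

```shell
# Re-import an existing pool using stable by-id device paths.
# No data is touched; only the device labels in the pool config change.
zpool export tank
zpool import -d /dev/disk/by-id tank
zpool status tank   # devices should now show the long ata-WDC_... names
```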
```bash
# Verify the pool
root@nas:~# zpool status tank
  pool: tank
 state: ONLINE
config:

        NAME                                        STATE     READ WRITE CKSUM
        tank                                        ONLINE       0     0     0
          raidz2-0                                  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX01  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX02  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX04  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX05  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX06  ONLINE       0     0     0

root@nas:~# zpool list tank
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank  43.6T  14.2T  29.4T        -         -     6%    32%  1.00x  ONLINE  -

# Usable after RAIDZ2 overhead: ~32TB
```

RAIDZ2 means I can lose any two drives simultaneously and not lose data. With 6 drives in the pool, that's a very comfortable margin.

Creating Datasets

ZFS datasets are like lightweight filesystems within the pool. Each can have its own properties (compression, quotas, snapshot schedules):

```bash
# Create datasets for different purposes
zfs create tank/backups          # Proxmox Backup Server
zfs create tank/media            # Movies, music, photos
zfs create tank/isos             # ISO images for Proxmox
zfs create tank/docker-volumes   # Persistent Docker data
zfs create tank/timemachine      # macOS Time Machine backups

# Set quotas where appropriate
zfs set quota=20T tank/media
zfs set quota=8T tank/backups
zfs set quota=500G tank/isos

# Disable compression for media (already compressed)
zfs set compression=off tank/media

# Check it all
root@nas:~# zfs list -o name,used,avail,refer,compressratio
NAME                  USED  AVAIL  REFER  RATIO
tank                 14.2T  17.8T   256K  1.42x
tank/backups          2.1T   5.9T   2.1T  1.68x
tank/docker-volumes   180G  17.8T   180G  1.45x
tank/isos              89G   411G    89G  1.02x
tank/media           11.4T   8.6T  11.4T  1.00x
tank/timemachine      340G  17.8T   340G  1.52x
```

That 1.42x overall compression ratio means I'm getting about 42% more effective storage than the raw capacity, just from lz4 compression. For text-heavy data like backups and logs, the ratio is even better (1.68x on the backups dataset).

Proxmox Backup Server

ZFS NAS architecture with PBS, network shares, and 3-2-1 backup strategy

Here's where it all comes together. Proxmox Backup Server (PBS) is a dedicated backup solution that integrates seamlessly with Proxmox VE. It supports:

  • Full VM and container backups (vzdump)
  • Incremental backups (only changed blocks are transferred)
  • Client-side deduplication (massive space savings)
  • Encryption (for off-site storage)
  • Integrity verification (periodic verify jobs)

I run PBS as an LXC container on the NAS itself, with the `/tank/backups` dataset mounted inside:

```bash
# On the Proxmox cluster, add the PBS storage
# Datacenter → Storage → Add → Proxmox Backup Server

Server:      10.10.10.20
Username:    backup@pbs
Datastore:   main
Fingerprint: (from PBS web UI)
```
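On the PBS side there's a matching one-time setup. A sketch of roughly what mine looked like (the names match the config above; the `DatastoreBackup` role is my choice of minimal permission, adjust to taste):

```shell
# On the PBS container: back the "main" datastore with the ZFS dataset
proxmox-backup-manager datastore create main /tank/backups

# Create the backup user and grant it access to the datastore
proxmox-backup-manager user create backup@pbs
proxmox-backup-manager acl update /datastore/main DatastoreBackup --auth-id backup@pbs

# Print the TLS fingerprint the Proxmox storage dialog asks for
proxmox-backup-manager cert info | grep -i fingerprint
```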

Backup Schedule

```yaml
# vzdump backup jobs configured in Proxmox
# Datacenter → Backup → Add

# Critical VMs - daily at 01:00
Schedule: daily 01:00
Selection: pfsense, nginx-proxy, gitea, docker-host, openclaw
Mode: snapshot
Retention: daily=7, weekly=4, monthly=6
Storage: pbs-nas

# Development VMs - daily at 01:30
Schedule: daily 01:30
Selection: dev-ubuntu, jenkins
Mode: snapshot
Retention: daily=3, weekly=2
Storage: pbs-nas

# Lab VMs - weekly on Sunday
Schedule: sun 02:00
Selection: kali (not metasploitable/dvwa - they're disposable)
Mode: snapshot
Retention: weekly=2
Storage: pbs-nas
```

The deduplication in PBS is impressive. My 23 VMs and 8 LXC containers have a combined raw disk footprint of about 800GB. After a month of daily backups with the retention policy above, PBS uses only 2.1TB. That's all the daily/weekly/monthly snapshots for every VM, deduplicated down to 2.1TB.

```bash
# PBS storage usage
root@pbs:~# proxmox-backup-manager datastore list
┌──────┬───────────────┬─────────┬───────────────┬───────────┐
│ name │ path          │ comment │ used          │ available │
├──────┼───────────────┼─────────┼───────────────┼───────────┤
│ main │ /tank/backups │         │ 2.1 TiB (26%) │ 5.9 TiB   │
└──────┴───────────────┴─────────┴───────────────┴───────────┘
```

The 3-2-1 Rule

The 3-2-1 backup rule says:

  • 3 copies of your data
  • On 2 different types of media
  • With 1 copy off-site

Here's how my implementation maps to this:

| Copy | What | Where | Media |
|------|------|-------|-------|
| 1 | Live data | Proxmox cluster (local-lvm + Ceph) | SSD |
| 2 | PBS backups | ZFS NAS (RAIDZ2) | HDD |
| 3 | Off-site | Backblaze B2 (encrypted) | Cloud |

Off-site Backups with rclone

For the off-site copy, I use rclone to sync encrypted backups to Backblaze B2. B2 is dirt cheap: $0.005/GB/month for storage, $0.01/GB for downloads. My 2.1TB of PBS backups cost about $10.50/month to store off-site.

```bash
# Configure rclone with Backblaze B2
rclone config
# → New remote → Backblaze B2 → Enter app key ID and app key
# → Encrypt with crypt remote on top

# The rclone config looks like:
# [b2]
# type = b2
# account = <app-key-id>
# key = <app-key>
#
# [b2-crypt]
# type = crypt
# remote = b2:resham-homelab-backups
# password = <encrypted>
# password2 = <encrypted>
```

The encryption is critical. These backups contain VM disk images, which have database files, SSH keys, and other sensitive data inside them. Everything is encrypted before it leaves my network.
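It's worth proving the crypt layer is doing its job rather than trusting the config. Listing through the raw `b2` remote should show only scrambled names, and `rclone cryptcheck` verifies local files against the encrypted remote. A sketch, with the remote and bucket names from the config above:

```shell
# Through the crypt remote: readable paths
rclone ls b2-crypt:pbs-backups | head -3

# Through the raw remote: obfuscated file names, nothing readable
rclone ls b2:resham-homelab-backups | head -3

# Hash-check local data against the encrypted copies without downloading plaintext
rclone cryptcheck /tank/backups b2-crypt:pbs-backups
```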

```bash
#!/bin/bash
# /usr/local/bin/offsite-backup.sh — weekly off-site sync

set -euo pipefail

LOG="/var/log/offsite-backup.log"
# Fail fast if the webhook isn't defined (set -u would otherwise kill us mid-run)
SLACK_WEBHOOK="${SLACK_WEBHOOK:?SLACK_WEBHOOK must be set in the environment}"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

echo "[$TIMESTAMP] Starting off-site backup sync" >> "$LOG"

# Sync PBS datastore to B2 (encrypted)
rclone sync /tank/backups b2-crypt:pbs-backups \
  --transfers 4 \
  --checkers 8 \
  --b2-hard-delete \
  --log-file "$LOG" \
  --log-level INFO \
  --stats 1m

TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$TIMESTAMP] Off-site backup sync complete" >> "$LOG"

# Send notification
curl -s -o /dev/null -X POST "$SLACK_WEBHOOK" \
  -H "Content-Type: application/json" \
  -d "{\"text\": \"Off-site backup sync completed at $TIMESTAMP\"}"
```
```bash
# Cron job: every Sunday at 04:00
0 4 * * 0 /usr/local/bin/offsite-backup.sh
```

ZFS Maintenance

ZFS needs regular maintenance to stay healthy:

Scrubs

A scrub reads every block on every disk and verifies the checksums. If it finds a bad block, it automatically repairs it from the RAIDZ2 redundancy. I run scrubs weekly:

```bash
# /etc/cron.d/zfs-scrub
0 2 * * 0 root zpool scrub tank

# Check scrub status
root@nas:~# zpool status tank | grep -A 3 scan
  scan: scrub repaired 0B in 08:42:15 with 0 errors on Sun Mar  9 10:42:15 2025
```

Just under nine hours to scrub 14TB across six drives. Not fast, but it runs at 2am when nothing else is happening.
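A scrub that finds errors is only useful if you hear about it. ZED, the ZFS Event Daemon that ships with zfsutils-linux, can email you on scrub results and drive faults. A sketch of the relevant `/etc/zfs/zed.d/zed.rc` settings (requires a working local MTA; the address is a placeholder):

```shell
# /etc/zfs/zed.d/zed.rc — mail on ZFS events
ZED_EMAIL_ADDR="root"            # placeholder: point this at a real mailbox
ZED_EMAIL_PROG="mail"
ZED_NOTIFY_INTERVAL_SECS=3600    # rate-limit repeat notifications
ZED_NOTIFY_VERBOSE=1             # also mail when a scrub finishes cleanly

# Apply with: systemctl restart zfs-zed
```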

SMART Monitoring

```bash
# Install smartmontools
apt install smartmontools

# Enable SMART on all drives
for d in /dev/sd{a..f}; do smartctl -s on $d; done

# Check health
root@nas:~# for d in /dev/sd{a..f}; do
    echo "=== $d ==="
    smartctl -H $d | grep "SMART overall"
    smartctl -A $d | grep -E "(Reallocated|Current_Pending|Offline_Uncorrectable)"
done

=== /dev/sda ===
SMART overall-health self-assessment test result: PASSED
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
# ... (same for all drives — all zeros. Healthy.)
```

The three SMART attributes I watch: Reallocated Sector Count, Current Pending Sector, and Offline Uncorrectable. If any of these go above zero, the drive is starting to fail and it's time to order a replacement.

I have a Prometheus exporter for SMART data and a Grafana alert that fires if any of these values increment. So far: zero alerts in a year of operation.

ZFS Snapshots

Beyond PBS backups, I use ZFS snapshots for quick local rollback:

```bash
# Automated snapshot schedule using zfs-auto-snapshot
apt install zfs-auto-snapshot

# Default schedule:
# - Frequent (every 15 min): keep 4
# - Hourly: keep 24
# - Daily: keep 30
# - Weekly: keep 8
# - Monthly: keep 12

root@nas:~# zfs list -t snapshot -o name,used,creation | head -10
NAME                                     USED  CREATION
tank/backups@zfs-auto-snap_hourly-...    128K  Thu Mar  6  9:00 2025
tank/backups@zfs-auto-snap_hourly-...    256K  Thu Mar  6 10:00 2025
tank/media@zfs-auto-snap_daily-...         0B  Thu Mar  6  0:00 2025
tank/docker-volumes@zfs-auto-snap...     4.2M  Thu Mar  6  9:00 2025
```

Snapshots are nearly free in ZFS because of copy-on-write. A snapshot doesn't duplicate data — it just prevents the blocks from being freed when they're modified. The "USED" column shows how much data has changed since the snapshot was taken.
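Getting data back out of a snapshot is the part worth practicing. Every snapshot is browsable read-only under the hidden `.zfs` directory, and `zfs rollback` reverts an entire dataset. A sketch (the truncated snapshot names will differ on your system):

```shell
# Per-file recovery: snapshots are mounted read-only under .zfs/snapshot
ls /tank/docker-volumes/.zfs/snapshot/
cp /tank/docker-volumes/.zfs/snapshot/zfs-auto-snap_hourly-.../some-file ./

# Whole-dataset recovery: revert to the most recent snapshot.
# Rolling back past newer snapshots requires -r, which destroys them.
zfs rollback tank/docker-volumes@zfs-auto-snap_hourly-...
```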

The Restore Test That Saved Me

A month after setting everything up, I decided to test a full restore. I picked a random VM (the Gitea instance), deleted it from Proxmox, and restored it from PBS.

```bash
# On the Proxmox web UI:
# Datacenter → Storage → pbs-nas → Content → Select Gitea backup → Restore

# Or via CLI:
qmrestore pbs:backup/vm/102/2025-03-01T01:00:22Z 102

# Time to restore a 32GB VM: 4 minutes 12 seconds
# Gitea was back online with all repositories intact.
```

Four minutes from "VM deleted" to "fully running again." That's the difference between "I have backups" and "my backups actually work." If you take nothing else from this post: test your restores regularly.

I now test a random VM restore on the first of every month. It's in my calendar. Non-negotiable.

Network Shares

The NAS also serves as a general-purpose file server:

```bash
# /etc/exports (NFS for Proxmox and Linux clients)
/tank/isos            10.10.10.0/24(ro,sync,no_subtree_check)
/tank/docker-volumes  10.10.50.10/32(rw,sync,no_subtree_check,no_root_squash)

# /etc/samba/smb.conf (SMB for the media library)
[media]
    path = /tank/media
    browseable = yes
    read only = yes
    valid users = resham

[timemachine]
    path = /tank/timemachine
    browseable = yes
    read only = no
    fruit:time machine = yes
    fruit:time machine max size = 500G
    vfs objects = catia fruit streams_xattr
```

The NFS share for ISOs is mounted on all Proxmox nodes, so I can install any OS on any VM without uploading the ISO to each node individually.
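Registering the share is one `pvesm` command run on any node, since storage config is cluster-wide. A sketch: the storage ID `nas-isos` is my own naming, and I'm assuming the NAS answers on 10.10.10.20.

```shell
# Registers the NFS export cluster-wide as ISO storage
pvesm add nfs nas-isos \
    --server 10.10.10.20 \
    --export /tank/isos \
    --content iso

pvesm status   # nas-isos should show up as active on every node
```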

Power Consumption and Reliability

The NAS draws about 45W at idle (drives spinning) and 65W under heavy I/O (scrub or backup job). Monthly power cost: about $3.90 at $0.12/kWh.

Uptime as of today:

```bash
root@nas:~# uptime
 10:23:45 up 247 days, 14:03,  2 users,  load average: 0.42, 0.38, 0.35
```

247 days. Zero unplanned reboots. Zero disk failures. Zero data corruption events (verified by weekly scrubs). ZFS on Debian is boring in the best possible way.

What This Cost Me

| Item | Cost |
|------|------|
| Dell OptiPlex 7040 Tower | $85 |
| RAM upgrade (32GB ECC) | $25 |
| Dell H310 HBA (IT mode) | $25 |
| 6x 8TB WD Red Plus | $720 |
| SATA cables + caddy | $15 |
| **Hardware total** | **$870** |
| Backblaze B2 (monthly) | ~$10.50 |
| Power (monthly) | ~$3.90 |

For $870 upfront and $14.40/month ongoing, I have:

  • 32TB of usable, redundant, checksummed storage
  • Automated daily backups of every VM in my homelab
  • Off-site encrypted backups in the cloud
  • Network file shares for the entire homelab
  • Peace of mind that my data is safe

If you're running a homelab without proper backups — and I know some of you are, because I was one of you — please set this up. It doesn't have to be this exact configuration. A single external USB drive with a weekly rsync script is infinitely better than nothing. But if you have the budget, a dedicated ZFS NAS with PBS is one of the best investments you can make.

RAID is not a backup. Snapshots are not a backup. The only backup that counts is the one on a separate machine, preferably with a copy off-site, that you've tested restoring from. Build that, and you'll never have to rewrite 30 hours of lost work from memory again.