Let me tell you the story of how I lost a weekend project that I'd spent 30 hours on.
It was a Saturday morning. I was working on a Terraform module for Kumari.ai's staging environment, and I accidentally ran `terraform destroy`. I recovered most of the code from my shell history and editor swap files. But "most" isn't "all," and I spent the next two days rewriting what I'd lost from memory.
That was the last time I lost data. The next week, I built a NAS and implemented a real backup strategy.
## The Hardware
I wanted a dedicated machine for storage — not another VM on the Proxmox cluster, and not a USB drive hanging off one of the OptiPlexes. A dedicated NAS means:
- Separate failure domain from compute (a Proxmox crash doesn't take backups with it)
- Proper storage controllers and ECC RAM
- Always-on, low-power operation
I found a Dell OptiPlex 7040 Tower on eBay for $85. The tower form factor (not SFF) is important because it has room for additional drives:
```
Machine:    Dell OptiPlex 7040 Tower
CPU:        Intel Xeon E3-1230 v3 (4C/8T, 3.3GHz)
RAM:        32GB DDR3 ECC (upgraded from 8GB for $25)
Boot drive: 256GB SATA SSD (came with the machine)
OS:         Debian 12 (Bookworm)
HBA:        Dell H310 IT mode (flashed) — $25 on eBay
Cost:       $85 machine + $25 RAM + $25 HBA = $135
```
## The Drives
I went with 6x 8TB WD Red Plus (WD80EFPX) drives:
```
6x WD Red Plus 8TB (WD80EFPX) — $120 each = $720
  CMR (not SMR — critical for ZFS)
  5640 RPM, 256MB cache
  Rated for 24/7 NAS use
  3-year warranty
```
Total NAS cost: $135 (machine) + $720 (drives) = $855
## ZFS: Why Not Hardware RAID?
I've used hardware RAID controllers before (the PERC H710P in my R720, for example). They work fine for Proxmox with Ceph. But for a NAS, ZFS is in a completely different league:
- End-to-end checksumming: ZFS checksums every block and verifies on read. If a bit rots on disk, ZFS detects it and self-heals from the redundancy.
- Copy-on-write: Data is never overwritten in place. This means snapshots are instant and free.
- No write hole: RAIDZ doesn't suffer from the RAID-5 write hole problem.
- Compression: Transparent lz4 compression saves space with near-zero CPU overhead.
- Send/receive: Incremental replication of datasets — critical for off-site backups.
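That last point is worth a quick sketch. ZFS replicates a dataset by streaming snapshots, and an incremental send ships only the blocks that changed between two snapshots. A minimal example, where `backup-host` and `backuppool` are hypothetical (my actual off-site copy goes through rclone, covered below):

```bash
# Day 1: full send of a snapshot to a second machine
zfs snapshot tank/docker-volumes@monday
zfs send tank/docker-volumes@monday | \
    ssh backup-host zfs receive backuppool/docker-volumes

# Day 2: incremental send ships only the blocks changed since @monday
zfs snapshot tank/docker-volumes@tuesday
zfs send -i tank/docker-volumes@monday tank/docker-volumes@tuesday | \
    ssh backup-host zfs receive backuppool/docker-volumes
```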
The Dell H310 HBA (flashed to IT mode) presents the drives directly to the OS without any RAID layer, which is exactly what ZFS needs.
## Creating the Pool
```bash
# Identify the drives by serial number (not /dev/sdX — those can change)
root@nas:~# ls -la /dev/disk/by-id/ | grep -v part
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX01 -> ../../sda
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX02 -> ../../sdb
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03 -> ../../sdc
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX04 -> ../../sdd
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX05 -> ../../sde
ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX06 -> ../../sdf

# Create RAIDZ2 pool (can lose any 2 drives)
zpool create -f \
    -o ashift=12 \
    -O compression=lz4 \
    -O atime=off \
    -O xattr=sa \
    -O acltype=posixacl \
    -O mountpoint=/tank \
    tank raidz2 \
    /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX01 \
    /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX02 \
    /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03 \
    /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX04 \
    /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX05 \
    /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX06
```
```bash
# Verify the pool
root@nas:~# zpool status tank
  pool: tank
 state: ONLINE
config:

        NAME                                        STATE     READ WRITE CKSUM
        tank                                        ONLINE       0     0     0
          raidz2-0                                  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX01  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX02  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX04  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX05  ONLINE       0     0     0
            ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX06  ONLINE       0     0     0

root@nas:~# zpool list tank
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
tank  43.6T  14.2T  29.4T        -         -     6%    32%  1.00x    ONLINE  -

# Usable after RAIDZ2 overhead: ~32TB
```
RAIDZ2 means I can lose any two drives simultaneously and not lose data. With 6 drives in the pool, that's a very comfortable margin.
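And when a drive does eventually fail, recovery is a single command. A sketch, where the failed and replacement device names are placeholders:

```bash
# Take the failing disk out of service (ZFS usually faults it on its own)
zpool offline tank ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03

# Swap the physical drive, then resilver onto the new one
zpool replace tank \
    ata-WDC_WD80EFPX-68C4ZN0_WD-CA0XXXXX03 \
    /dev/disk/by-id/ata-WDC_WD80EFPX-68C4ZN0_WD-NEWDRIVE

# Watch the resilver progress
zpool status tank
```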
## Creating Datasets
ZFS datasets are like lightweight filesystems within the pool. Each can have its own properties (compression, quotas, snapshot schedules):
```bash
# Create datasets for different purposes
zfs create tank/backups         # Proxmox Backup Server
zfs create tank/media           # Movies, music, photos
zfs create tank/isos            # ISO images for Proxmox
zfs create tank/docker-volumes  # Persistent Docker data
zfs create tank/timemachine     # macOS Time Machine backups

# Set quotas where appropriate
zfs set quota=20T tank/media
zfs set quota=8T tank/backups
zfs set quota=500G tank/isos

# Disable compression for media (already compressed)
zfs set compression=off tank/media

# Check it all
root@nas:~# zfs list -o name,used,avail,refer,compressratio
NAME                  USED  AVAIL  REFER  RATIO
tank                 14.2T  17.8T   256K  1.42x
tank/backups          2.1T   5.9T   2.1T  1.68x
tank/docker-volumes   180G  17.8T   180G  1.45x
tank/isos              89G   411G    89G  1.02x
tank/media           11.4T   8.6T  11.4T  1.00x
tank/timemachine      340G  17.8T   340G  1.52x
```
That 1.42x overall compression ratio means I'm getting about 42% more effective storage than the raw capacity, just from lz4 compression. For text-heavy data like backups and logs, the ratio is even better (1.68x on the backups dataset).
## Proxmox Backup Server
Here's where it all comes together. Proxmox Backup Server (PBS) is a dedicated backup solution that integrates seamlessly with Proxmox VE. It supports:
- Full VM and container backups (vzdump)
- Incremental backups (only changed blocks are transferred)
- Client-side deduplication (massive space savings)
- Encryption (for off-site storage)
- Integrity verification (periodic verify jobs)
I run PBS as an LXC container on the NAS itself, with the `/tank/backups` dataset bind-mounted in as its datastore. On the Proxmox side, it's added as a storage backend:

```bash
# On the Proxmox cluster, add the PBS storage
# Datacenter → Storage → Add → Proxmox Backup Server

Server:      10.10.10.20
Username:    backup@pbs
Datastore:   main
Fingerprint: (from PBS web UI)
```
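If you'd rather skip the web UI, the same storage can be registered from any cluster node's shell. A sketch using `pvesm`; the storage ID `pbs-nas` matches the backup jobs below, and the fingerprint is a placeholder for the one shown in the PBS dashboard:

```bash
# Register the PBS datastore from the CLI (applies cluster-wide)
pvesm add pbs pbs-nas \
    --server 10.10.10.20 \
    --username backup@pbs \
    --datastore main \
    --fingerprint 'aa:bb:cc:...'   # paste the real fingerprint here
# (the backup@pbs password can be supplied with --password)
```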
### Backup Schedule
```yaml
# vzdump backup jobs configured in Proxmox
# Datacenter → Backup → Add

# Critical VMs - daily at 01:00
Schedule:  daily 01:00
Selection: pfsense, nginx-proxy, gitea, docker-host, openclaw
Mode:      snapshot
Retention: daily=7, weekly=4, monthly=6
Storage:   pbs-nas

# Development VMs - daily at 01:30
Schedule:  daily 01:30
Selection: dev-ubuntu, jenkins
Mode:      snapshot
Retention: daily=3, weekly=2
Storage:   pbs-nas

# Lab VMs - weekly on Sunday
Schedule:  sun 02:00
Selection: kali (not metasploitable/dvwa - they're disposable)
Mode:      snapshot
Retention: weekly=2
Storage:   pbs-nas
```
The deduplication in PBS is impressive. My 23 VMs and 8 LXC containers have a combined raw disk footprint of about 800GB. After a month of daily backups with the retention policy above, PBS uses only 2.1TB. That's all the daily/weekly/monthly snapshots for every VM, deduplicated down to 2.1TB.
```bash
# PBS storage usage
root@pbs:~# proxmox-backup-manager datastore list
┌──────┬───────────────┬─────────┬───────────────┬───────────┐
│ name │ path          │ comment │ used          │ available │
├──────┼───────────────┼─────────┼───────────────┼───────────┤
│ main │ /tank/backups │         │ 2.1 TiB (26%) │ 5.9 TiB   │
└──────┴───────────────┴─────────┴───────────────┴───────────┘
```
## The 3-2-1 Rule
The 3-2-1 backup rule says:
- 3 copies of your data
- On 2 different types of media
- With 1 copy off-site
Here's how my implementation maps to this:
| Copy | What | Where | Media |
|---|---|---|---|
| 1 | Live data | Proxmox cluster (local-lvm + Ceph) | SSD |
| 2 | PBS backups | ZFS NAS (RAIDZ2) | HDD |
| 3 | Off-site | Backblaze B2 (encrypted) | Cloud |
## Off-site Backups with rclone
For the off-site copy, I use rclone to sync encrypted backups to Backblaze B2. B2 is dirt cheap: $0.005/GB/month for storage, $0.01/GB for downloads. My 2.1TB of PBS backups cost about $10.50/month to store off-site.
```bash
# Configure rclone with Backblaze B2
rclone config
# → New remote → Backblaze B2 → Enter app key ID and app key
# → Encrypt with a crypt remote on top

# The rclone config looks like:
# [b2]
# type = b2
# account = <app-key-id>
# key = <app-key>
#
# [b2-crypt]
# type = crypt
# remote = b2:resham-homelab-backups
# password = <encrypted>
# password2 = <encrypted>
```
The encryption is critical. These backups contain VM disk images, which have database files, SSH keys, and other sensitive data inside them. Everything is encrypted before it leaves my network.
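It's worth verifying the crypt layer actually does its job. A quick sketch using the remote names from the config above; `rclone cryptcheck` exists specifically for comparing plaintext files against a crypt remote:

```bash
# Through the crypt remote: real file names and sizes
rclone ls b2-crypt:pbs-backups | head

# Raw bucket view: file names (and contents) are opaque ciphertext
rclone ls b2:resham-homelab-backups | head

# Verify the encrypted copy matches the local datastore
rclone cryptcheck /tank/backups b2-crypt:pbs-backups
```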
```bash
#!/bin/bash
# /usr/local/bin/offsite-backup.sh: weekly off-site sync

set -euo pipefail

LOG="/var/log/offsite-backup.log"
SLACK_WEBHOOK="${SLACK_WEBHOOK:-}"  # set in the cron environment, or hardcode here
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

echo "[$TIMESTAMP] Starting off-site backup sync" >> "$LOG"

# Sync PBS datastore to B2 (encrypted)
rclone sync /tank/backups b2-crypt:pbs-backups \
    --transfers 4 \
    --checkers 8 \
    --b2-hard-delete \
    --log-file "$LOG" \
    --log-level INFO \
    --stats 1m

TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$TIMESTAMP] Off-site backup sync complete" >> "$LOG"

# Send notification (skipped if no webhook is configured)
if [ -n "$SLACK_WEBHOOK" ]; then
    curl -s -o /dev/null -X POST "$SLACK_WEBHOOK" \
        -H "Content-Type: application/json" \
        -d "{\"text\": \"Off-site backup sync completed at $TIMESTAMP\"}"
fi
```
```bash
# Cron job: every Sunday at 04:00
0 4 * * 0 /usr/local/bin/offsite-backup.sh
```
## ZFS Maintenance
ZFS needs regular maintenance to stay healthy:
### Scrubs
A scrub reads every block on every disk and verifies the checksums. If it finds a bad block, it automatically repairs it from the RAIDZ2 redundancy. I run scrubs weekly:
```bash
# /etc/cron.d/zfs-scrub
0 2 * * 0 root zpool scrub tank

# Check scrub status
root@nas:~# zpool status tank | grep -A 3 scan
  scan: scrub repaired 0B in 08:42:15 with 0 errors on Sun Mar  9 10:42:15 2025
```
Eight and a half hours for a 14TB scrub across six drives. Not fast, but it runs at 2am on Sundays when nothing else is happening.
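To make sure a bad scrub never goes unnoticed, the ZFS event daemon (ZED) can send mail on pool events. A minimal sketch of the relevant settings in its stock config file, assuming a working local MTA:

```bash
# /etc/zfs/zed.d/zed.rc
ZED_EMAIL_ADDR="root"           # where pool-event notifications go
ZED_NOTIFY_INTERVAL_SECS=3600   # rate-limit repeat notifications
ZED_NOTIFY_VERBOSE=1            # also mail on clean scrub completions

# Then: systemctl restart zfs-zed
```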
### SMART Monitoring
```bash
# Install smartmontools
apt install smartmontools

# Enable SMART on all drives
for d in /dev/sd{a..f}; do smartctl -s on "$d"; done

# Check health
root@nas:~# for d in /dev/sd{a..f}; do
    echo "=== $d ==="
    smartctl -H "$d" | grep "SMART overall"
    smartctl -A "$d" | grep -E "(Reallocated|Current_Pending|Offline_Uncorrectable)"
done

=== /dev/sda ===
SMART overall-health self-assessment test result: PASSED
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
# ... (same for all drives — all zeros. Healthy.)
```
The three SMART attributes I watch: Reallocated Sector Count, Current Pending Sector, and Offline Uncorrectable. If any of these go above zero, the drive is starting to fail and it's time to order a replacement.
I have a Prometheus exporter for SMART data and a Grafana alert that fires if any of these values increment. So far: zero alerts in a year of operation.
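If you don't want to stand up the full Prometheus/Grafana stack, the same check works as a dumb cron script. A sketch (the script path is hypothetical, and `mail` assumes a working local MTA):

```bash
#!/bin/bash
# /usr/local/bin/smart-check.sh: alert if any watched SMART attribute
# goes above zero ($2 is the attribute name, $10 the raw value)
set -euo pipefail

for d in /dev/sd{a..f}; do
    bad=$(smartctl -A "$d" | awk '
        /Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable/ {
            if ($10 > 0) print $2 "=" $10
        }')
    if [ -n "$bad" ]; then
        echo "SMART warning on $d: $bad" | mail -s "NAS drive alert: $d" root
    fi
done
```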
## ZFS Snapshots
Beyond PBS backups, I use ZFS snapshots for quick local rollback:
```bash
# Automated snapshot schedule using zfs-auto-snapshot
apt install zfs-auto-snapshot

# Default schedule:
# - Frequent (every 15 min): keep 4
# - Hourly:  keep 24
# - Daily:   keep 30
# - Weekly:  keep 8
# - Monthly: keep 12

root@nas:~# zfs list -t snapshot -o name,used,creation | head -10
NAME                                    USED  CREATION
tank/backups@zfs-auto-snap_hourly-...   128K  Thu Mar  6  9:00 2025
tank/backups@zfs-auto-snap_hourly-...   256K  Thu Mar  6 10:00 2025
tank/media@zfs-auto-snap_daily-...        0B  Thu Mar  6  0:00 2025
tank/docker-volumes@zfs-auto-snap...    4.2M  Thu Mar  6  9:00 2025
```
Snapshots are nearly free in ZFS because of copy-on-write. A snapshot doesn't duplicate data — it just prevents the blocks from being freed when they're modified. The "USED" column shows how much data has changed since the snapshot was taken.
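Getting data back out is just as cheap. Two sketches; the snapshot and file names here are placeholders:

```bash
# Option 1: every snapshot is browsable read-only under the hidden .zfs
# directory, so a single file can simply be copied back
cp /tank/docker-volumes/.zfs/snapshot/zfs-auto-snap_hourly-XXXX/app/config.json \
   /tank/docker-volumes/app/

# Option 2: roll the entire dataset back to the snapshot
# (discards everything written since, so use with care)
zfs rollback tank/docker-volumes@zfs-auto-snap_hourly-XXXX
```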
## The Restore Test That Saved Me
A month after setting everything up, I decided to test a full restore. I picked a random VM (the Gitea instance), deleted it from Proxmox, and restored it from PBS.
```bash
# On the Proxmox web UI:
# Datacenter → Storage → pbs-nas → Content → Select Gitea backup → Restore

# Or via CLI:
qmrestore pbs-nas:backup/vm/102/2025-03-01T01:00:22Z 102

# Time to restore a 32GB VM: 4 minutes 12 seconds
# Gitea was back online with all repositories intact.
```
Four minutes from "VM deleted" to "fully running again." That's the difference between "I have backups" and "my backups actually work." If you take nothing else from this post: test your restores regularly.
I now test a random VM restore on the first of every month. It's in my calendar. Non-negotiable.
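The mechanical part of that test is easy to script. A sketch, not my exact procedure: the backup volume ID would be picked from `pvesm list pbs-nas`, and 9999 is a scratch VMID:

```bash
#!/bin/bash
# Hypothetical monthly restore drill: restore a backup to a scratch VMID,
# confirm it boots, then delete it
set -euo pipefail

BACKUP="pbs-nas:backup/vm/102/2025-03-01T01:00:22Z"  # pick from: pvesm list pbs-nas
SCRATCH=9999

qmrestore "$BACKUP" "$SCRATCH" --unique 1  # randomize MACs to avoid collisions
qm start "$SCRATCH"
sleep 60
qm status "$SCRATCH"   # expect "status: running"
qm stop "$SCRATCH"
qm destroy "$SCRATCH" --purge
```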
## Network Shares
The NAS also serves as a general-purpose file server:
```bash
# /etc/exports (NFS for Proxmox and Linux clients)
/tank/isos            10.10.10.0/24(ro,sync,no_subtree_check)
/tank/docker-volumes  10.10.50.10/32(rw,sync,no_subtree_check,no_root_squash)

# /etc/samba/smb.conf (SMB for the media library)
[media]
    path = /tank/media
    browseable = yes
    read only = yes
    valid users = resham

[timemachine]
    path = /tank/timemachine
    browseable = yes
    read only = no
    fruit:time machine = yes
    fruit:time machine max size = 500G
    vfs objects = catia fruit streams_xattr
```
The NFS share for ISOs is mounted on all Proxmox nodes, so I can install any OS on any VM without uploading the ISO to each node individually.
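Since Proxmox storage definitions are shared across the cluster, attaching the export once covers every node. A sketch, with `nas-isos` as a hypothetical storage ID:

```bash
# Register the NFS export as ISO storage for the whole cluster
pvesm add nfs nas-isos \
    --server 10.10.10.20 \
    --export /tank/isos \
    --content iso
```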
## Power Consumption and Reliability
The NAS draws about 45W at idle (drives spinning) and 65W under heavy I/O (scrub or backup job). Monthly power cost: about $3.90 at $0.12/kWh.
Uptime as of today:
```bash
root@nas:~# uptime
 10:23:45 up 247 days, 14:03,  2 users,  load average: 0.42, 0.38, 0.35
```
247 days. Zero unplanned reboots. Zero disk failures. Zero data corruption events (verified by weekly scrubs). ZFS on Debian is boring in the best possible way.
## What This Cost Me
| Item | Cost |
|---|---|
| Dell OptiPlex 7040 Tower | $85 |
| RAM upgrade (32GB ECC) | $25 |
| Dell H310 HBA (IT mode) | $25 |
| 6x 8TB WD Red Plus | $720 |
| SATA cables + caddy | $15 |
| Hardware total | $870 |
| Backblaze B2 (monthly) | ~$10.50 |
| Power (monthly) | ~$3.90 |
For $870 upfront and $14.40/month ongoing, I have:
- 32TB of usable, redundant, checksummed storage
- Automated daily backups of every VM in my homelab
- Off-site encrypted backups in the cloud
- Network file shares for the entire homelab
- Peace of mind that my data is safe
If you're running a homelab without proper backups — and I know some of you are, because I was one of you — please set this up. It doesn't have to be this exact configuration. A single external USB drive with a weekly rsync script is infinitely better than nothing. But if you have the budget, a dedicated ZFS NAS with PBS is one of the best investments you can make.
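If the USB drive is where you start, the weekly script really can be this small. A minimal sketch, with placeholder paths:

```bash
#!/bin/bash
# Minimal weekly USB backup (paths are placeholders for your own)
set -euo pipefail

MOUNT=/mnt/usb-backup
mountpoint -q "$MOUNT" || { echo "backup drive not mounted" >&2; exit 1; }

# -a preserves permissions and timestamps; --delete mirrors deletions
rsync -a --delete /home/ "$MOUNT/home/"
rsync -a --delete /etc/  "$MOUNT/etc/"

sync  # flush writes before the drive is unplugged
```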
RAID is not a backup. Snapshots are not a backup. The only backup that counts is the one on a separate machine, preferably with a copy off-site, that you've tested restoring from. Build that, and you'll never have to rewrite 30 hours of lost work from memory again.