TrueNAS snapshots and ZFS replication

ZFS snapshots are the native recovery layer for TrueNAS — instant to create, atomic per-dataset, and the right tool for accidental delete, ransomware in-place encryption, and pre-upgrade rollback. They're not a backup (they live on the same pool), but combined with Replication Tasks they become the off-box recovery layer that sits in front of cloud backup. This page walks through the snapshot + replication pair end-to-end.

Best for: TrueNAS Scale operators with a working ZFS pool who haven't set up periodic snapshot or replication tasks yet, and want a defendable recovery model before adding more workload.

ZFS hierarchy diagram + TrueNAS Scale UI

Reference images and diagrams.

ZFS storage hierarchy diagram showing a pool of two RAIDZ1 vdevs plus an optional SLOG/L2ARC vdev, datasets like tank/Documents and tank/Media living in the pool, and snapshots taken per-dataset that can be replicated to a second ZFS host.
Original concept diagram (not vendor copyright). Snapshots are per-dataset point-in-time references; Replication Tasks copy them off the source pool to a second ZFS host with independent destination retention.
TrueNAS Scale Storage Dashboard showing pool 'tank' with a 1x MIRROR data vdev (2 wide / 20 GiB), pool status Online, ZFS Health Online, total ZFS errors 0, and scheduled scrub task set. Usable capacity 18.89 GiB, used 2.63 MiB.
Real screenshot captured 2026-05-18 from TrueNAS Scale 24.10.2 running in a VirtualBox VM. The Storage Dashboard shows pool topology (mirror), ZFS Health (Online / 0 errors), and Disk Health at a glance — pre-replacement triage starts here.
TrueNAS Scale Data Protection page with a periodic snapshot task on tank/Documents — schedule every hour, 14-day retention, state Pending. Also visible: TrueCloud Backup Tasks, Scrub Tasks, Cloud Sync Tasks, Rsync Tasks, and Replication Tasks panels.
Real screenshot: the Data Protection page shows an hourly periodic snapshot task on tank/Documents with 14-day retention. The Replication Tasks panel sits next to it; the snapshot task feeds the replication task in the standard flow.
TrueNAS Scale Datasets page showing the tank pool with two child datasets (Documents and Media). Tank pool details: 3.05 MiB used / 18.89 GiB available, unencrypted, FILESYSTEM type, LZ4 compression. Documents and Media show 96 KiB each.
Real screenshot: per-dataset organization inside the tank pool. Each dataset can have its own compression, encryption, and snapshot policy — the Documents dataset has the hourly schedule, while Media keeps a longer daily cadence.

Snapshot vs replication vs cloud backup

  • **Snapshots** live on the source pool. Pool fails → snapshots gone. They protect against accidental delete, ransomware encrypting in place, and bad config changes you want to roll back. They don't protect against pool failure, hardware loss, or theft.
  • **Replication Tasks** copy snapshots to a second ZFS host. The destination keeps its own retention; you can have hourly snapshots on the source and weekly on the destination. This is the off-box layer that protects against pool loss.
  • **Cloud Sync / Cloud Backup** (covered in `/nas/truenas-scale-first-backup-setup`) writes to a non-ZFS destination — survives the loss of both TrueNAS hosts but doesn't preserve ZFS properties or snapshot history.
  • Layer all three for proper data protection: snapshots for instant recovery → replication for pool-loss survival → cloud backup for home-loss survival.

Configuring a Periodic Snapshot Task

  • Data Protection > Periodic Snapshot Tasks > Add. Pick the dataset(s); tick Recursive if you want children snapshotted too (typical for `/mnt/<pool>/data`).
  • Naming Schema: TrueNAS's default is fine (`auto-%Y-%m-%d_%H-%M`). The `auto-` prefix matters because Replication Tasks can match on it to know which snapshots to replicate vs ignore.
  • Snapshot Lifetime: pick a duration that matches the share's churn — 2 weeks for `Documents`, 1 month for media libraries, longer for archive datasets.
  • Schedule: Hourly for active edit shares; Daily for media; pick off-peak times to avoid IO contention.
  • Tick Allow taking empty snapshots if you want a snapshot taken even when there's been no change since the last one — useful for keeping a reliable timeline; off by default to save space.
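
The naming schema uses strftime-style placeholders, so its behavior can be illustrated with Python's `datetime`. This is a minimal sketch, not how TrueNAS generates names internally; the helper names `snapshot_name` and `matches_schema` are hypothetical and exist only for this example.

```python
from datetime import datetime

# Default TrueNAS naming schema quoted above (strftime-style placeholders).
NAMING_SCHEMA = "auto-%Y-%m-%d_%H-%M"


def snapshot_name(taken_at: datetime, schema: str = NAMING_SCHEMA) -> str:
    """Render the snapshot name the schema would produce for a timestamp."""
    return taken_at.strftime(schema)


def matches_schema(name: str, schema: str = NAMING_SCHEMA) -> bool:
    """Return True if a snapshot name parses cleanly against the schema."""
    try:
        datetime.strptime(name, schema)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    name = snapshot_name(datetime(2026, 5, 18, 14, 0))
    print(name)                                  # auto-2026-05-18_14-00
    print(matches_schema(name))                  # True
    print(matches_schema("manual-pre-upgrade"))  # False
```

A manually named snapshot such as `manual-pre-upgrade` does not match the schema, which is why a replication filter keyed on the schema would skip it.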

Retention math (the part that fills pools)

  • Snapshot storage cost is proportional to the change rate, not the dataset size. A 10 TB media library that rarely changes uses almost no snapshot space; a 100 GB database dataset that churns hourly can fill a pool quickly.
  • Storage > pool > Datasets shows USED BY SNAPSHOTS per dataset. Watch this for the first week after enabling snapshots — if any dataset's snapshot-used is climbing toward 20% of the dataset's logical size, the retention is too aggressive for the churn.
  • OpenZFS docs recommend keeping pool occupancy below 80% for performance. Combined with snapshot growth, this means: leave headroom. Don't enable hourly snapshots with month-long retention on a pool that's already 70% full.
  • Pruning happens automatically based on Snapshot Lifetime + scheduled run. Don't delete snapshots manually unless you understand which ones are referenced by active Replication Tasks (deleting a referenced snapshot breaks replication).
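
The retention math above can be approximated with a rough model. The sketch below assumes every changed block stays pinned until the last snapshot referencing it expires, so snapshot space is roughly daily churn multiplied by lifetime. That is a deliberate simplification (it ignores compression and blocks rewritten several times), and the function name, field names, and example figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RetentionEstimate:
    snapshots_kept: int        # snapshots alive at steady state
    snapshot_space_gib: float  # rough upper bound on space pinned by snapshots
    share_of_dataset: float    # snapshot space divided by dataset size


def estimate_retention(
    dataset_gib: float,
    churn_gib_per_day: float,
    snapshots_per_day: int,
    lifetime_days: int,
) -> RetentionEstimate:
    """Rough steady-state estimate: space ~ churn per day x lifetime in days."""
    kept = snapshots_per_day * lifetime_days
    space = churn_gib_per_day * lifetime_days
    return RetentionEstimate(kept, space, space / dataset_gib)


if __name__ == "__main__":
    # Hypothetical figures: a large, quiet media library vs a small, busy database.
    media = estimate_retention(10240, churn_gib_per_day=1, snapshots_per_day=1, lifetime_days=30)
    database = estimate_retention(100, churn_gib_per_day=5, snapshots_per_day=24, lifetime_days=14)
    for label, est in (("media", media), ("database", database)):
        flag = "too aggressive" if est.share_of_dataset > 0.20 else "ok"
        print(f"{label}: {est.snapshots_kept} snapshots, "
              f"~{est.snapshot_space_gib:.0f} GiB pinned, "
              f"{est.share_of_dataset:.1%} of dataset ({flag})")
```

With these assumed inputs the 10 TiB library pins about 30 GiB, while the 100 GiB database pins about 70 GiB, well past the 20% warning level mentioned above.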

Replication to a second ZFS host

  • Set up the destination first: another TrueNAS Scale instance (preferred) or any ZFS host with SSH access. Create the receiving pool/dataset.
  • On the source TrueNAS, Credentials > Backup Credentials > Add > SSH Connection. Either use semi-automatic setup (TrueNAS-to-TrueNAS, exchanges keys via the destination's API) or manual SSH key setup. Test the connection before continuing.
  • Data Protection > Replication Tasks > Add. Source dataset, destination SSH connection + destination dataset path. Set Replicate Specific Snapshots = `auto-%Y-%m-%d_%H-%M` (or matching your snapshot schema) so only intended snapshots replicate.
  • Encryption ON for over-internet replication (LAN-only can be off, but on is safer). Schedule replication after the snapshot schedule so each cycle replicates the latest.
  • Test manually: trigger snapshot task, then trigger replication task. On the destination, run `zfs list -t snapshot` on the receive dataset and confirm the new snapshot appears.
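
The manual test in the last bullet can be turned into a comparison of the two snapshot lists. The sketch below assumes you have captured the output of `zfs list -H -t snapshot -o name` on each host as text; it does not run ZFS itself. The function names and the `backup/Documents` destination dataset are hypothetical.

```python
def snapshot_names(zfs_list_output: str) -> list[str]:
    """Extract the part after '@' from each line of the snapshot listing."""
    names = []
    for line in zfs_list_output.splitlines():
        line = line.strip()
        if "@" in line:
            names.append(line.split("@", 1)[1])
    return names


def missing_on_destination(source_out: str, dest_out: str, prefix: str = "auto-") -> list[str]:
    """Snapshots with the prefix that exist on the source but not the destination."""
    dest = set(snapshot_names(dest_out))
    return [
        name
        for name in snapshot_names(source_out)
        if name.startswith(prefix) and name not in dest
    ]


if __name__ == "__main__":
    # Sample listings; real output would come from each host's shell.
    source = (
        "tank/Documents@auto-2026-05-18_13-00\n"
        "tank/Documents@auto-2026-05-18_14-00\n"
        "tank/Documents@manual-test\n"
    )
    destination = "backup/Documents@auto-2026-05-18_13-00\n"
    print(missing_on_destination(source, destination))  # ['auto-2026-05-18_14-00']
```

An empty result after a replication run means every `auto-` snapshot on the source has a counterpart on the destination.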

Snapshot-based recovery — the actual restore flow

  • Storage > pool > dataset > Snapshots. Find the snapshot from before the accidental change.
  • Two options: Clone (creates a writable clone you can mount and copy from) or Rollback (rolls the dataset back to the snapshot state — destroys newer snapshots; use carefully).
  • For self-service file recovery, share the `.zfs/snapshot/` hidden directory (or expose snapshots via SMB previous-versions if configured). Users browse old versions and copy individual files back.
  • Practice this once during calm weather: take a snapshot, modify a file, clone the snapshot, copy the file back from the clone, delete the clone. Five minutes total; saves you in a real incident.
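
The single-file restore through the hidden snapshot directory can be sketched as a path lookup plus a copy. The example below simulates the `.zfs/snapshot/<snapname>/` layout in a temporary directory so it runs without ZFS; on a real system the dataset mount point would sit under `/mnt/<pool>/`. The function names and file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_path(dataset_mount: Path, snapshot: str, relative_file: str) -> Path:
    """Path of a file as it existed in a given snapshot of the dataset."""
    return dataset_mount / ".zfs" / "snapshot" / snapshot / relative_file


def restore_file(dataset_mount: Path, snapshot: str, relative_file: str) -> Path:
    """Copy one file from the snapshot view back into the live dataset."""
    source = snapshot_path(dataset_mount, snapshot, relative_file)
    target = dataset_mount / relative_file
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target


def demo() -> str:
    """Simulate the layout in a temp dir (no ZFS needed) and run a restore."""
    with tempfile.TemporaryDirectory() as tmp:
        mount = Path(tmp) / "Documents"
        old = snapshot_path(mount, "auto-2026-05-18_14-00", "report.txt")
        old.parent.mkdir(parents=True)
        old.write_text("good version")
        (mount / "report.txt").write_text("accidentally overwritten")
        restored = restore_file(mount, "auto-2026-05-18_14-00", "report.txt")
        return restored.read_text()


if __name__ == "__main__":
    print(demo())  # good version
```

Copying out of the snapshot view leaves the snapshot itself untouched, which is what makes this safer than a Rollback for recovering a single file.
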

Operator snapshot

Evidence first

First proof

Pool is ONLINE with at least 20% free capacity.

Screen to open

Data Protection > Periodic Snapshot Tasks > Add > dataset > Recursive (if children matter) > Naming Schema > Snapshot Lifetime > Schedule

Expected signal

Storage > pool overview. ONLINE; capacity below 80%.

Stop boundary

Stop before treating replication as a replacement for Cloud Sync / Cloud Backup — destination still shares ZFS-side failure modes with source.

Layer path

1. ZFS snapshots are TrueNAS's native point-in-time recovery layer: atomic per-dataset, near-instant to create, low storage cost initially, and the right tool for accidental delete, ransomware in-place encryption, and pre-upgrade rollback.
2. Snapshots live on the source pool, so pool loss takes the snapshots too. Replication Tasks are the off-box layer that protects against pool failure.
3. Snapshot storage cost scales with change rate, not dataset size. High-churn datasets with aggressive retention can wedge a pool above 80% occupancy.
4. Replication is asynchronous and incremental after the first sync; destination retention is independent of source retention.
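
Independent retention on the two sides can be illustrated by applying two different lifetimes to the same set of snapshot names. This is a sketch under assumptions: it derives each snapshot's age from the default `auto-%Y-%m-%d_%H-%M` name, the 14-day and 90-day lifetimes are example values, and the `expired` helper is hypothetical rather than TrueNAS's own pruning logic.

```python
from datetime import datetime, timedelta

SCHEMA = "auto-%Y-%m-%d_%H-%M"


def expired(names: list[str], now: datetime, lifetime: timedelta) -> list[str]:
    """Names whose timestamp (parsed from the schema) is older than the lifetime."""
    return [n for n in names if now - datetime.strptime(n, SCHEMA) > lifetime]


if __name__ == "__main__":
    names = [
        "auto-2026-04-01_00-00",
        "auto-2026-05-01_00-00",
        "auto-2026-05-17_00-00",
    ]
    now = datetime(2026, 5, 18)
    # Assumed lifetimes: 14 days on the source, 90 days on the destination.
    print(expired(names, now, timedelta(days=14)))  # the two oldest names
    print(expired(names, now, timedelta(days=90)))  # []
```

With these example values the source prunes the two oldest snapshots while the destination still holds all three, so the destination ends up with history the source no longer has.
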

Runbook

Step-by-step runbook

Start here. Do each check in order, compare it to the expected result, and stop when the evidence explains the failure or the safe stop point applies.

1. Verify pool is healthy and has capacity headroom

Check: Storage > pool > ONLINE; scrub recent and clean; capacity below 80%.

Expected result: All three conditions are true.

If not: Free space or fix pool BEFORE enabling aggressive snapshot retention.

2. Configure the first periodic snapshot task on one dataset

Check: Data Protection > Periodic Snapshot Tasks > Add. Pick the most-edited dataset. Hourly + 14-day lifetime; default naming schema.

Expected result: Task created; first snapshot appears after next schedule fires.

If not: Use Run Now to trigger the task and recheck the dataset path. Start with one dataset to learn capacity behavior before extending.

3. Watch USED BY SNAPSHOTS for a week

Check: Storage > pool > datasets table > USED BY SNAPSHOTS column. Note baseline; recheck daily.

Expected result: Growth is stable or slow relative to dataset size.

If not: If aggressive growth, trim Snapshot Lifetime; consider excluding noisy paths.
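
The daily recheck in this step can be reduced to one number. The sketch below assumes you note the USED BY SNAPSHOTS value once a day and compares the weekly growth against dataset size, using the 5%-per-week level from the evidence table as the default threshold. The function names and sample readings are hypothetical.

```python
def weekly_growth_percent(readings_gib: list[float], dataset_gib: float) -> float:
    """Growth of USED BY SNAPSHOTS, scaled to one week, as % of dataset size.

    readings_gib: one reading per day, oldest first (8 readings span 7 days).
    """
    if len(readings_gib) < 2:
        raise ValueError("need at least two readings")
    days = len(readings_gib) - 1
    per_day = (readings_gib[-1] - readings_gib[0]) / days
    return per_day * 7 / dataset_gib * 100


def needs_trim(readings_gib: list[float], dataset_gib: float, threshold: float = 5.0) -> bool:
    """Flag datasets whose snapshot usage grows by the threshold %/week or more."""
    return weekly_growth_percent(readings_gib, dataset_gib) >= threshold


if __name__ == "__main__":
    # Hypothetical week of readings on a 100 GiB dataset, growing 1 GiB per day.
    busy = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    print(needs_trim(busy, dataset_gib=100))          # True
    # Two readings a day apart on a 1000 GiB dataset, growing 0.5 GiB per day.
    print(needs_trim([10.0, 10.5], dataset_gib=1000))  # False
```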

4. Set up replication to a second ZFS host

Check: Configure SSH Connection in Credentials > Backup Credentials. Create Replication Task pointing at the destination dataset. Match Replicate Specific Snapshots to the snapshot task's schema. Encrypt for over-internet paths.

Expected result: Replication runs successfully; destination shows replicated snapshots.

If not: Debug SSH path independently before troubleshooting the replication task itself.

Safe stop: Stop before treating replication as a replacement for Cloud Sync / Cloud Backup — destination still shares ZFS-side failure modes with source.

5. Practice a snapshot restore drill

Check: Take a snapshot, modify a known file, restore the file via Clone, verify content matches the snapshot version, delete the clone.

Expected result: Restore drill succeeds; you understand the Clone-based restore flow.

If not: Repeat the drill until the restore works. Practicing during calm weather prevents fumbling during real incidents.

6. Document retention policy and replication schedule

Check: External record (operations doc, calendar): snapshot schedule per dataset, retention windows, replication destination, last successful replication.

Expected result: Recurring monthly review on calendar; written record outside the NAS.

If not: Without documentation, retention drift accumulates silently.

Decision tree

If: Active edit dataset (Documents, homes, project folders).

Then: Hourly snapshots catch most accidental-delete cases, with at most an hour of edits lost.

Action: Data Protection > Periodic Snapshot Tasks > Add > hourly schedule + 14-day lifetime; layer a daily-for-a-month policy via a second task.

If: Low-churn dataset (media library, archives).

Then: Daily snapshots are sufficient; hourly would burn pool space without benefit.

Action: Daily schedule + 30-day lifetime.

If: Second ZFS host available for replication.

Then: Off-box protection against pool failure.

Action: Replication Task targeting the destination's SSH connection; encrypted; schedule after the snapshot task.

Safe stop: Stop before treating replication as Cloud-Backup-equivalent — destination is still ZFS-format and shares some failure modes with the source.

If: Pool occupancy already above 70% and growing.

Then: Snapshot retention risk is real.

Action: Conservative retention (daily snapshots only, shorter lifetime); or expand pool before enabling more aggressive snapshot policy.

Safe stop: Stop before enabling hourly snapshots on a pool above 75% — high probability of write-wedge during retention growth.

If: Need self-service file recovery for end users.

Then: ZFS snapshot directory `.zfs/snapshot/<snapname>/` is browsable per-dataset.

Action: Enable SMB previous-versions on the share, OR expose `.zfs/snapshot/` directly; users can browse old versions and copy files back.
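
The snapshot-policy branches of the decision tree can be written as one function. This is a sketch, not a recommendation engine: the 70% and 75% thresholds and the hourly/14-day and daily/30-day policies come from the text above, while the 14-day and 7-day lifetimes for constrained pools are assumptions, since the text only says "shorter lifetime". The names `SnapshotPolicy` and `choose_policy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotPolicy:
    schedule: str       # "hourly" or "daily"
    lifetime_days: int
    note: str


def choose_policy(high_churn: bool, pool_used_percent: float) -> SnapshotPolicy:
    """Encode the branches above; capacity checks come before churn checks."""
    if pool_used_percent > 75:
        # Assumed lifetime: the text gives no number for this branch.
        return SnapshotPolicy("daily", 7, "stop: expand pool before hourly snapshots")
    if pool_used_percent > 70:
        # Assumed lifetime: the text only says "shorter lifetime".
        return SnapshotPolicy("daily", 14, "conservative retention until pool is expanded")
    if high_churn:
        return SnapshotPolicy("hourly", 14, "add a second daily task kept for a month")
    return SnapshotPolicy("daily", 30, "low churn: hourly adds no benefit")


if __name__ == "__main__":
    print(choose_policy(high_churn=True, pool_used_percent=40))
    print(choose_policy(high_churn=False, pool_used_percent=40))
    print(choose_policy(high_churn=True, pool_used_percent=80))
```

Checking capacity first reflects the safe-stop rule: a busy dataset on a nearly full pool still gets the conservative policy.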

Evidence

Evidence table

| Symptom | Evidence to collect | Likely layer | Next action |
| --- | --- | --- | --- |
| USED BY SNAPSHOTS climbing 5%+/week on a dataset. | Storage > pool > datasets table USED BY SNAPSHOTS column. | Retention too aggressive for change rate | Trim Snapshot Lifetime; consider excluding noisy subdatasets; watch for app-data churn (databases, logs). |
| Replication task fails with 'no snapshots to replicate'. | Replication Task Logs. | Snapshot naming schema mismatch | Match the Replicate Specific Snapshots filter to the snapshot task's schema (default: `auto-%Y-%m-%d_%H-%M`). |
| Snapshot rollback destroyed snapshots newer than the rollback point. | Storage > pool > dataset > Snapshots: fewer rows after rollback. | Rollback semantics surprise | Always Clone first (preserves the current state as a writable clone), then decide whether to Rollback. Documented behavior; not a bug. |
| Replication destination shows snapshots that no longer exist on source. | On destination: `zfs list -t snapshot <dataset>`; on source: same command. | Destination retention longer than source | Intentional: the destination's independent retention is the design. Adjust destination retention if it's growing too large; otherwise leave the longer history on the destination as the value-add. |

Reference

Commands and settings paths

Create a Periodic Snapshot Task

Data Protection > Periodic Snapshot Tasks > Add > dataset > Recursive (if children matter) > Naming Schema > Snapshot Lifetime > Schedule

Where: In the TrueNAS web UI.

Expected: Task appears in the list; first snapshot appears under Storage > pool > dataset > Snapshots after the next schedule fires.

Failure means: If no snapshot appears, the schedule didn't fire or the dataset path is wrong.

Safe next step: Trigger Run Now; verify the snapshot appears in the dataset's snapshot list.

Restore a file from a snapshot via Clone

Storage > pool > dataset > Snapshots > the snapshot row > Clone to New Dataset. Mount the clone via SMB or browse in File Manager. Copy the file back to the live dataset.

Where: In the TrueNAS web UI.

Expected: File restores cleanly from the clone.

Failure means: If the clone path isn't reachable, check the clone's mount status.

Safe next step: Delete the clone after restore: Storage > pool > dataset > clone > Delete.

Rollback a dataset to a snapshot (with safety clone first)

1) Clone the current state: dataset > Clone to New Dataset (preserves current state). 2) Storage > dataset > Snapshots > target snapshot > Rollback. 3) Confirm rollback (destroys newer snapshots).

Where: In the TrueNAS web UI.

Expected: Dataset rolls back to the snapshot state; safety clone preserves the pre-rollback state.

Failure means: Failure to clone first means a wrong rollback is unrecoverable.

Safe next step: Always Clone before Rollback.

Verify a Replication Task end-to-end

Trigger snapshot task: Run Now. Then trigger replication task: Run Now. On destination: `zfs list -t snapshot <dest-dataset>` should show the new snapshot.

Where: In the TrueNAS web UI on source; CLI or web UI on destination.

Expected: New snapshot appears on destination with matching name.

Failure means: If not, replication path is broken — check SSH credentials, destination dataset path, and Replicate Specific Snapshots filter.

Safe next step: Debug each layer independently: SSH first, then snapshot match, then replication.

Hardware boundary

Hardware and platform boundary

Change only when

  • A second replication destination (off-site rather than just LAN) is the right next step only after on-LAN replication has been clean for a month.

Evidence that matters

  • Pool capacity headroom (20%+ free), retention matched to change rate, and replication-destination independence matter most.

Evidence that does not matter

  • More aggressive snapshot schedules don't help if pool is already capacity-tight; less aggressive doesn't help if active datasets lose hours of work between snapshots.

Avoid

  • Avoid treating snapshots as a substitute for Cloud Sync / Cloud Backup, enabling hourly snapshots on a near-full pool, or deleting snapshots manually that are referenced by active replication.

Last reviewed

2026-05-18 · Reviewed by HomeTechOps. Reviewed against TrueNAS Scale's Periodic Snapshot Tasks, Replication Tasks, and pool management documentation, plus OpenZFS upstream guidance on snapshot-stream replication semantics.

Source-backed checks

HomeTechOps turns official docs and conservative safety rules into a shorter runbook. These links are the source trail for the page direction.