Backups & Storage

Snapshot retention: GFS and tiers

'Keep the last N snapshots' quietly fails: a burst of hourly snapshots can age out months of history in a day, and a long-dwell ransomware infection can push every clean snapshot out of the window. Tiered (GFS) retention keeps restore points across timescales so both 'I deleted a file an hour ago' and 'the corruption started months ago' are recoverable.

Who this is for

Home operators running NAS snapshots or a deduplicating backup tool who want a retention policy that survives both a recent mistake and a long-dwell corruption, rather than a fixed 'keep the last N snapshots' count.

Outcome

A grandfather-father-son retention policy (daily/weekly/monthly, often plus yearly) expressed in your tools: restic forget --keep-* flags, Synology Snapshot Replication Smart Retention, or TrueNAS snapshot lifetime plus an independent replication retention. The policy is sized to your change rate and free space, with dense recent history on the source and a long GFS ladder on the offsite copy.

Required inputs

Your snapshot/backup tool and where it runs (restic, Synology Snapshot Replication, TrueNAS periodic snapshot + replication tasks, etc.).
An estimate of your data's change rate (how much diverges per day) and the free space available on the pool/repository.
A target recovery profile: how recent you need fine-grained restore points and how far back you need any restore point at all.
A separate offsite/replicated target if you want the long tail of the GFS ladder to live off the box.

Guide: follow in order

Step-by-step procedure

Define the GFS tiers you need

Do: Decide daily (sons), weekly (fathers), monthly (grandfathers), and optionally yearly retention counts based on your recovery profile — e.g. 7 daily, 5 weekly, 12 monthly, a few yearly.

Expected result: You have explicit counts per tier rather than a single 'keep N' number.

If not: If you only have a fixed count, a snapshot burst or a long dwell can wipe your real history — convert to tiers.
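
To make the burst problem concrete, here is a minimal sketch that measures how much time a fixed 'keep the last N' count actually covers. The schedule (90 daily snapshots followed by 30 hourly ones) and the function name keep_last_span are illustrative assumptions, not taken from any tool.

```python
from datetime import datetime, timedelta

def keep_last_span(timestamps, n):
    """Return the time span covered by the newest n snapshots."""
    kept = sorted(timestamps)[-n:]
    return kept[-1] - kept[0]

now = datetime(2024, 6, 1, 12, 0)

# Assumed schedule: 90 daily snapshots, then a burst of 30 hourly snapshots.
daily = [now - timedelta(days=d) for d in range(2, 92)]
burst = [now - timedelta(hours=h) for h in range(30)]

print(keep_last_span(daily, 30))          # 29 days, 0:00:00
print(keep_last_span(daily + burst, 30))  # 1 day, 5:00:00
```

With the same count of 30, the covered window shrinks from 29 days to 29 hours once the burst lands, which is the gap that per-tier counts are meant to close.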

Express the policy in your tool

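Do: Set restic forget --keep-daily, --keep-weekly, --keep-monthly, and --keep-yearly; or configure the Synology Snapshot Replication Smart Retention tiers; or set the Snapshot Lifetime on your TrueNAS periodic snapshot tasks (typically one task per tier). With restic, each keep-* flag keeps the most recent snapshot in each slot.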

Expected result: Running the retention/forget operation results in snapshots spread across the tiers, not just the most recent ones.

If not: If everything older than a few days is gone after pruning, the tiered flags aren't applied — re-check the policy.
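
The 'most recent snapshot in each slot' rule can be sketched as follows. This is a simplified model for intuition only, not restic's or any NAS vendor's actual implementation; the function names bucket_key and gfs_keep are hypothetical, and real tools also handle grouping, tags, and time-window options.

```python
from datetime import datetime, timedelta

def bucket_key(ts, tier):
    """Map a timestamp to the calendar slot it falls into for a tier."""
    if tier == "daily":
        return (ts.year, ts.month, ts.day)
    if tier == "weekly":
        iso = ts.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if tier == "monthly":
        return (ts.year, ts.month)
    if tier == "yearly":
        return (ts.year,)
    raise ValueError(f"unknown tier: {tier}")

def gfs_keep(timestamps, counts):
    """Keep the newest snapshot in each of the last n occupied slots per tier."""
    keep = set()
    newest_first = sorted(timestamps, reverse=True)
    for tier, n in counts.items():
        seen = []
        for ts in newest_first:
            key = bucket_key(ts, tier)
            if key in seen:
                continue  # a newer snapshot already holds this slot
            if len(seen) == n:
                break  # this tier has filled all of its slots
            seen.append(key)
            keep.add(ts)
    return sorted(keep)

# Assumed input: 400 days of one snapshot per day at noon.
now = datetime(2024, 6, 1, 12, 0)
snapshots = [now - timedelta(days=d) for d in range(400)]
kept = gfs_keep(snapshots, {"daily": 7, "weekly": 5, "monthly": 12, "yearly": 3})
print(len(kept), "kept; oldest:", kept[0].date())
```

A snapshot can satisfy several tiers at once, so the number kept is at most the sum of the per-tier counts and is usually lower.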

Split source vs offsite retention

Do: Keep short, dense history on the source (fast pool) and configure the replication/offsite target's retention independently for the long GFS ladder.

Expected result: The source holds recent restore points; the offsite/replicated copy holds the deep history.

If not: If the source carries the entire long ladder, it bloats the fast pool — move the long tail to the target.
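
As one way to keep the two policies independent, the sketch below builds separate restic forget command lines for a local and an offsite repository. The repository locations and the source-side counts (24 hourly, 7 daily) are placeholder assumptions; the offsite counts reuse the example ladder from the first step. The helper only assembles arguments and does not run restic.

```python
def forget_args(repo, keep, dry_run=True):
    """Build a `restic forget` argument list from per-tier keep counts."""
    args = ["restic", "-r", repo, "forget"]
    for tier in ("hourly", "daily", "weekly", "monthly", "yearly"):
        if keep.get(tier):
            args += [f"--keep-{tier}", str(keep[tier])]
    if dry_run:
        args.append("--dry-run")  # preview the plan without removing anything
    return args

# Assumed policies: dense and short on the source, long ladder offsite.
SOURCE_KEEP = {"hourly": 24, "daily": 7}
OFFSITE_KEEP = {"daily": 7, "weekly": 5, "monthly": 12, "yearly": 3}

# Hypothetical repository locations; substitute your own.
print(" ".join(forget_args("/srv/restic-local", SOURCE_KEEP)))
print(" ".join(forget_args("sftp:backup@offsite.example:/restic", OFFSITE_KEEP)))
```

Keeping the two dictionaries side by side also gives you a single place to record the split for the evidence list.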

Size it against change rate and free space

Do: Estimate snapshot growth from your change rate (not dataset size), and keep copy-on-write pools out of the near-full zone where fragmentation bites; trim the longest tiers if space is tight.

Expected result: Projected snapshot usage fits within free space with a planning buffer.

If not: If the pool trends toward full, reduce the deepest tiers or move them offsite before performance degrades.
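
A rough sizing check might look like the sketch below. It assumes a deliberately pessimistic model in which every day's changed data stays pinned until the oldest retained snapshot expires, and it treats the 80% fill ceiling as a planning assumption rather than a documented limit. All figures and function names are illustrative.

```python
def projected_snapshot_gib(daily_change_gib, oldest_kept_days):
    """Upper-bound estimate: each day's changes stay pinned for the full window."""
    return daily_change_gib * oldest_kept_days

def fits(pool_size_gib, live_data_gib, snapshot_gib, max_fill=0.80):
    """True if live data plus snapshots stay under the assumed fill ceiling."""
    return live_data_gib + snapshot_gib <= pool_size_gib * max_fill

# Assumed example: 8000 GiB pool, 4000 GiB live data, one-year ladder.
for daily_change in (5, 10):
    snap = projected_snapshot_gib(daily_change, 365)
    print(daily_change, "GiB/day ->", snap, "GiB; fits:", fits(8000, 4000, snap))
```

In this example a 5 GiB/day change rate fits a one-year ladder and 10 GiB/day does not, so doubling the change rate is enough to push the deepest tiers offsite. Sparse weekly and monthly tiers usually pin less than this bound, so treat the result as a ceiling rather than a forecast.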

Pair the deep history with immutability

Do: Put the long GFS tail on an immutable/offsite tier so a long-dwell attack can't delete the very restore points you'd need.

Expected result: The deep restore points live somewhere ransomware can't reach.

If not: If the long history is on the same writable box, treat that as the gap to close next.

Verify old restore points still restore

Do: Periodically restore from an older tier (not just the latest), since older versions are exactly what you need after a long-dwell problem.

Expected result: An older monthly/weekly restore point restores and opens correctly.

If not: If only the latest restores, your retention is effectively shallow — investigate the older tiers.

Commands and settings paths

Apply and inspect a restic GFS policy

restic forget --keep-daily 7 --keep-weekly 5 --keep-monthly 12 --keep-yearly 3 --dry-run

Where: On the backup host (dry-run first)

Expected: The plan keeps the most recent snapshot in each daily/weekly/monthly/yearly slot.

Failure means: If it would drop all but the last few, the keep-* flags aren't taking effect.

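Safe next step: Fix the flags; run without --dry-run only once the plan looks right. Note that forget only removes the snapshot records; repository space is reclaimed once you also run restic prune (or pass --prune to forget).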

Check snapshot space vs free space

zfs list -t snapshot (TrueNAS) / Storage Manager → Volume usage (Synology)

Where: On the NAS

Expected: Snapshot usage is a tracked, bounded fraction of free space.

Failure means: Runaway snapshot growth means the change rate is higher than the retention assumes.

Safe next step: Trim the deepest tiers or move them offsite; keep the pool out of the near-full zone.
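
To track snapshot usage per dataset over time, the following sketch totals the output of zfs list -Hp -t snapshot -o name,used, which prints tab-separated names and exact byte counts. The sample text and dataset names are made up for illustration.

```python
from collections import defaultdict

def snapshot_used_by_dataset(zfs_output):
    """Sum the per-snapshot `used` bytes for each dataset."""
    totals = defaultdict(int)
    for line in zfs_output.splitlines():
        if not line.strip():
            continue
        name, used = line.split("\t")
        dataset = name.split("@", 1)[0]  # snapshot names are dataset@snapname
        totals[dataset] += int(used)
    return dict(totals)

# Made-up sample in the shape of `zfs list -Hp -t snapshot -o name,used`.
sample = (
    "tank/docs@auto-2024-05-01\t1048576\n"
    "tank/docs@auto-2024-06-01\t2097152\n"
    "tank/vm@auto-2024-06-01\t5368709120\n"
)
for dataset, used in snapshot_used_by_dataset(sample).items():
    print(dataset, round(used / 2**20, 1), "MiB")
```

A snapshot's used value counts only the space unique to that snapshot, so this sum can understate the total; the dataset's usedbysnapshots property reports everything the snapshots hold together and is the safer number to compare against free space.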

Restore from an older tier

Restore a file from a monthly/weekly restore point (not the latest) to a scratch folder.

Where: On a clean machine / scratch location

Expected: The older restore point restores and the file opens.

Failure means: If only the latest works, your effective retention is shallow.

Safe next step: Investigate why older tiers aren't usable before trusting the policy.
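
Choosing which restore point to test can be sketched as below: pick the newest snapshot that is at least a given age, so the test exercises an older tier rather than the latest backup. The 30-day threshold, the snapshot IDs, and the function name are hypothetical.

```python
from datetime import datetime, timedelta

def pick_older_restore_point(snapshots, now, min_age_days=30):
    """Return the newest (id, time) pair at least min_age_days old, or None."""
    cutoff = now - timedelta(days=min_age_days)
    old_enough = [s for s in snapshots if s[1] <= cutoff]
    return max(old_enough, key=lambda s: s[1]) if old_enough else None

now = datetime(2024, 6, 1, 12, 0)
# Hypothetical snapshot IDs paired with their timestamps.
snapshots = [
    ("a1", now - timedelta(days=1)),
    ("b2", now - timedelta(days=8)),
    ("c3", now - timedelta(days=35)),
    ("d4", now - timedelta(days=200)),
]
print(pick_older_restore_point(snapshots, now))
```

A None result means nothing that old survives, which is itself a sign that the retention is shallower than intended.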

Evidence to record

The per-tier retention counts (daily/weekly/monthly/yearly) and where they're configured.
The split between source retention and offsite/replication retention.
Projected and actual snapshot space usage vs free space on the pool/repo.
The date and tier of the last successful older-restore-point test.

Common mistakes

Using 'keep the last N snapshots' so a burst of frequent snapshots evicts months of history.
Assuming snapshots are free — they grow with change rate and can fill a copy-on-write pool.
Keeping the entire long ladder on the source's fast pool instead of the offsite copy.
Never testing an older restore point, so a shallow-in-practice policy goes unnoticed.

Stop points

Stop adding deeper tiers once projected snapshot usage would push a copy-on-write pool toward full.
Stop trusting the policy until an older (not latest) restore point has actually been restored.

Last reviewed

2026-06-03

Source-backed checks

HomeTechOps turns official docs and conservative safety rules into a shorter runbook. The sources below are the trail behind this page's recommendations.

restic docs: Removing snapshots (forget) and append-only. Used for the GFS keep-* flags (--keep-daily/-weekly/-monthly/-yearly), keeping the most-recent-in-slot, and the append-only repo + --keep-within pruning caveat.
Synology KB: Smart Retention policy in Snapshot Replication. Used for the GFS-equivalent hourly/daily/weekly/monthly/yearly snapshot retention tiers (distinct from Hyper Backup's Smart Recycle).
TrueNAS SCALE docs: Local replication and retention. Used for the independent source Snapshot Lifetime vs replication-task retention (short on source, long GFS ladder on the target).
Backblaze: 3-2-1 vs 3-2-1-1-0 vs 4-3-2. Used for why 3-2-1 alone is insufficient in the ransomware era and what the extended rules add.

FAQ

Why is 'keep the last 30 snapshots' a bad policy?

Because the count, not the timespan, is fixed. If something triggers a burst of hourly snapshots, 30 of them might only cover a day — and you've lost weeks of history. And if ransomware dwells for a month before triggering, a short fixed count means every retained snapshot is already poisoned. GFS tiers (daily/weekly/monthly/yearly) guarantee coverage at each timescale instead. See restic's forget docs.

How much space will tiered snapshots use?

It tracks your change rate, not your total data. A snapshot holds on to the blocks that have since been changed or deleted in the live data, so a static media library costs almost nothing while a busy database or VM dataset costs a lot. Plan retention against free space, keep copy-on-write pools out of the near-full zone where fragmentation bites, and put the long tail of the GFS ladder on the replicated/offsite copy rather than the fast pool.

Should source and offsite copies have the same retention?

Usually not. Keep short, dense history on the source (fast recovery for recent mistakes) and a longer GFS ladder on the offsite/immutable copy (deep history for long-dwell problems). TrueNAS and Synology both let the replicated/target retention differ from the source. Pair this with an immutable or air-gapped copy so the deep history can't be deleted.