Infrastructure

Proxmox Backup Server: define a testable restore policy, not only a backup policy

Build a restore-first Proxmox Backup Server strategy with workload criticality, periodic tests, restore evidence, datastore monitoring and an operable recovery procedure.

22 May 2026 · Tags: proxmox, proxmox-backup-server, restore, backup, runbook

A Proxmox backup only has value if it can be restored when needed. Proxmox Backup Server provides deduplication, retention, verification and clean Proxmox VE integration, but it does not define a recovery strategy by itself. A policy that only says how many backups to keep is incomplete.

The scenario here is a small Proxmox VE cluster used for internal services: a tooling controller, Linux application VMs, a reverse proxy, monitoring, a bastion and a few test environments. The goal is to build a policy that starts from the restores the team expects to perform, then derives the backup schedule from them.

Classify workloads by expected restore

Not all VMs deserve the same backup frequency or restore test. A VM that can be rebuilt with Ansible does not have the same criticality as a service with local data. A test VM can tolerate more loss than an operations component.

text workload-classes.txt
Class A - Critical service with data
Examples: monitoring, operations tools, internal repository
Restore tested regularly
Longer retention

Class B - Important but rebuildable service
Examples: reverse proxy, bastion, standard Linux tools
Restore tested by sample
Configuration versioned as much as possible

Class C - Temporary environment
Examples: test VM, lab, sandbox
Short retention
Restore not a priority

This classification avoids treating the whole cluster as one block. It also documents choices. A VM without a backup is not always a mistake if it is rebuildable and classified that way. A critical VM without restore testing is a real risk.
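The classification can also be checked mechanically. The following is a minimal Python sketch, not a PBS feature: the `POLICY` table, the `Workload` record and the day limits are hypothetical placeholders to adapt, and the inventory data is assumed to come from wherever the team already tracks its VMs.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical policy table derived from the three classes above.
# The limits are illustrative placeholders, not recommendations.
POLICY = {
    "A": {"backup_required": True, "max_days_since_restore_test": 30},
    "B": {"backup_required": True, "max_days_since_restore_test": 90},
    "C": {"backup_required": False, "max_days_since_restore_test": None},
}


@dataclass
class Workload:
    name: str
    klass: str                               # "A", "B" or "C"
    has_backup: bool
    days_since_restore_test: Optional[int]   # None means never tested


def policy_findings(w: Workload) -> List[str]:
    """Return the gaps between a workload and the policy of its class."""
    findings = []
    rule = POLICY[w.klass]
    if rule["backup_required"] and not w.has_backup:
        findings.append(f"{w.name}: class {w.klass} workload has no backup")
    limit = rule["max_days_since_restore_test"]
    if limit is not None:
        if w.days_since_restore_test is None:
            findings.append(f"{w.name}: restore never tested")
        elif w.days_since_restore_test > limit:
            findings.append(
                f"{w.name}: last restore test is "
                f"{w.days_since_restore_test} days old (limit {limit})"
            )
    return findings
```

With this table, a class C lab VM without a backup produces no finding, while a class A VM whose restore was never tested does. That matches the distinction made above between a documented choice and a real risk.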

Define restore evidence

A restore test should not stop at booting a VM somewhere. It must prove that the expected service works. For a monitoring VM, that can mean UI access and recent data. For a reverse proxy, it can mean configuration validation and access to a test route.

text restore-evidence.txt
Restore test
VM: monitoring01
Source: backup from 2026-05-22 02:00
Target: isolated restore-test network
Validation: boot OK, service active, UI reachable, data present
Gap found: none
Decision: restore is usable

Evidence can be short, but it must be written. Without a trace, a restore test becomes an impression. With a trace, the team knows when restoration was last validated and what was actually tested.
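One way to keep that trace consistent is to store each test as a small structured record. The sketch below is only an illustration in Python: the `RestoreEvidence` class and its field names are hypothetical and mirror the text example above, not a PBS format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class RestoreEvidence:
    vm: str
    source_backup: str            # which backup was restored
    target: str                   # where it was restored
    checks: Dict[str, bool]       # validation name -> passed?
    gaps: List[str] = field(default_factory=list)

    @property
    def usable(self) -> bool:
        # A test with no recorded checks proves nothing, so it is not usable.
        return bool(self.checks) and all(self.checks.values()) and not self.gaps

    def to_json(self) -> str:
        record = asdict(self)
        record["decision"] = (
            "restore is usable" if self.usable else "restore is NOT proven"
        )
        return json.dumps(record, indent=2, sort_keys=True)


evidence = RestoreEvidence(
    vm="monitoring01",
    source_backup="2026-05-22 02:00",
    target="isolated restore-test network",
    checks={"boot": True, "service_active": True,
            "ui_reachable": True, "data_present": True},
)
print(evidence.to_json())
```

The decision is derived from the checks rather than typed by hand, so a record cannot claim a usable restore while listing a failed validation or an open gap.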

Separate full restore and file recovery

Proxmox Backup Server can support several scenarios: restore a full VM, recover a disk, extract a file, or rebuild a service from a VM restored in an isolated network. The policy should state which scenarios are expected for each class.

text restore-scenarios.txt
Scenario 1: VM accidentally deleted
Restore the full VM from the last valid backup

Scenario 2: configuration file lost
Mount or browse the backup to recover the file

Scenario 3: application corruption
Restore in an isolated network
Extract data or compare with current state

Scenario 4: Proxmox node lost
Restore priority VMs on an available node

These scenarios guide tests. A backup that can restore a full VM but that nobody knows how to use for file extraction does not cover all operations needs.
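To see which scenarios still lack a test, the expectations per class can be written down and compared with what has actually been exercised. This is a hedged Python sketch: the scenario identifiers and the assignment of scenarios to classes are assumptions made for illustration, and each team should set its own.

```python
from typing import Iterable, List

# Hypothetical mapping: which restore scenarios each class is expected to
# support. The assignment is an example, not a rule from PBS.
EXPECTED_SCENARIOS = {
    "A": {"full_vm_restore", "file_recovery", "isolated_restore", "node_loss"},
    "B": {"full_vm_restore", "file_recovery"},
    "C": set(),
}


def untested_scenarios(klass: str, tested: Iterable[str]) -> List[str]:
    """Return the expected scenarios for a class that no test has covered yet."""
    return sorted(EXPECTED_SCENARIOS[klass] - set(tested))


# A class A VM where only the full restore has been exercised so far.
print(untested_scenarios("A", ["full_vm_restore"]))
```

For that class A example the function reports file recovery, isolated restore and node loss as still unproven, which is exactly the kind of gap a full-VM-only test hides.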

Monitor the datastore as production

Once every restore depends on it, the PBS datastore is itself a critical component. Capacity, verification, pruning, garbage collection and errors must be visible. A backup policy that fails silently can stay broken for a long time before anyone notices.

text pbs-monitoring.txt
Controls to monitor
Last backup per critical VM
Failed backup job
Verify task status
Datastore usage
Garbage collection duration
Disk or filesystem errors
Date of last restore test

The last restore test matters. Many dashboards show that backups finish. Fewer show that restoration is still proven.
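A freshness check can put both dates side by side. The Python sketch below assumes the timestamps have already been collected into dictionaries by some existing export or monitoring job; the function name, the thresholds and the data shape are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def freshness_alerts(
    now: datetime,
    critical_vms: List[str],
    last_backup: Dict[str, datetime],
    last_restore_test: Dict[str, datetime],
    max_backup_age: timedelta = timedelta(hours=26),   # placeholder threshold
    max_test_age: timedelta = timedelta(days=30),      # placeholder threshold
) -> List[Tuple[str, str]]:
    """Return (vm, problem) pairs for missing or stale backups and restore tests."""
    alerts = []
    for vm in sorted(critical_vms):
        backup = last_backup.get(vm)
        if backup is None:
            alerts.append((vm, "backup missing"))
        elif now - backup > max_backup_age:
            alerts.append((vm, "backup stale"))

        tested = last_restore_test.get(vm)
        if tested is None:
            alerts.append((vm, "restore test missing"))
        elif now - tested > max_test_age:
            alerts.append((vm, "restore test stale"))
    return alerts
```

A VM whose backups are green every night still raises an alert here if its last restore test is older than the threshold or was never recorded.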

Keep a minimal recovery procedure

During an incident, the team should not rebuild the process from memory. A short procedure is enough: where to connect, which datastore to use, how to choose the snapshot, where to restore, which network to apply, which validations to run and who decides to replace the original VM.

text restore-runbook.txt
Minimal procedure
1. Identify the VM and criticality class
2. Choose the backup to restore
3. Restore first into an isolated network if application state is uncertain
4. Validate boot, network, service and data
5. Decide replacement, extraction or abandonment
6. Document result and return-to-service time

This procedure should be tested by someone who did not write it. That is where gaps often appear: missing PBS access, no test network, insufficient storage or a forgotten DNS dependency.
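During a test run, the six steps can be logged with timestamps so that the return-to-service time in step 6 is measured rather than estimated. This is a small Python sketch with hypothetical names (`STEPS`, `RunbookLog`); it assumes return-to-service is counted from the start of the incident to the completion of the last step.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# The steps of the minimal procedure above, in order.
STEPS = [
    "identify VM and criticality class",
    "choose backup to restore",
    "restore (isolated network if state is uncertain)",
    "validate boot, network, service and data",
    "decide replacement, extraction or abandonment",
    "document result and return-to-service time",
]


class RunbookLog:
    def __init__(self, incident_start: datetime) -> None:
        self.incident_start = incident_start
        self.completed: List[Tuple[str, datetime]] = []

    def complete(self, step: str, at: datetime) -> None:
        """Record a finished step; steps must be completed in runbook order."""
        if len(self.completed) == len(STEPS):
            raise ValueError("all steps are already completed")
        expected = STEPS[len(self.completed)]
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.completed.append((step, at))

    def return_to_service(self) -> Optional[timedelta]:
        """Elapsed time from incident start to the last step, once all are done."""
        if len(self.completed) < len(STEPS):
            return None
        return self.completed[-1][1] - self.incident_start
```

Refusing out-of-order steps is a deliberate choice in this sketch: it makes a skipped validation visible in the log instead of letting it pass unnoticed under incident pressure.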

Conclusion

Proxmox Backup Server is a strong foundation, but the strategy must be restore-first. The useful question is not only how many backups are kept. It is which services can be restored, within what time, with which proof and through which procedure.

A healthy policy classifies workloads, defines scenarios, monitors the datastore, schedules tests and saves evidence. With that, PBS becomes an operable recovery base rather than a calendar of green backup jobs.