Proxmox cluster design before you call it production
A runbook-style article on Proxmox VE cluster design, quorum, storage, networking, backups, and the checks that matter before presenting a cluster as a production platform.
Installing Proxmox VE is easy. Calling the result a production platform is the part that needs discipline. The UI makes cluster creation look quick, but quorum, storage, network design, backups, and operational recovery decide whether the platform remains stable after the first incident.
This article is not a one-node lab tutorial. The target is a small but credible Proxmox VE cluster design that can survive normal maintenance, a node reboot, and predictable operator mistakes.
Scope used in the examples
Nodes: pve01, pve02, pve03
Management: 10.20.0.0/24
Cluster comm: 10.21.0.0/24
VM traffic: VLAN backed bridges
Storage: shared storage or planned local-only model
Backups: Proxmox Backup Server or equivalent controlled target

Three nodes is the sane minimum for a cluster that relies on quorum: with one vote per node, a three-node cluster stays quorate while one node is down. Proxmox uses Corosync for cluster communication, and quorum is part of the design, not an optional add-on. The Proxmox documentation is explicit that cluster behavior and possible node count depend on the communication design and host performance.
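The quorum arithmetic behind the three-node recommendation can be sketched in a few lines of shell. This is an illustration only: it assumes every node carries exactly one vote and no external vote is present, and the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Votes needed for quorum with N equal-weight votes: floor(N/2) + 1.
quorum_needed() {
    local total_votes=$1
    echo $(( total_votes / 2 + 1 ))
}

# Votes that can be lost while the remaining nodes stay quorate.
failures_tolerated() {
    local total_votes=$1
    echo $(( total_votes - (total_votes / 2 + 1) ))
}

# Print a small table for common cluster sizes.
for n in 2 3 4 5; do
    printf 'nodes=%d quorum=%d tolerated_failures=%d\n' \
        "$n" "$(quorum_needed "$n")" "$(failures_tolerated "$n")"
done
```

Two nodes tolerate zero failures and four nodes tolerate no more than three do, which is why three is the practical floor.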
Start with host level checks
Before clustering anything, verify that each node is healthy enough to avoid blaming the cluster for a host problem.
hostnamectl
ip -br a
pveversion -v
pveperf
chronyc sources -v
lsblk
cat /etc/network/interfaces

pveperf is not a cosmetic command. It gives a quick signal on storage latency and CPU behavior before VMs start competing for resources.
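To turn the pveperf signal into a repeatable check, the FSYNCS/SECOND line can be compared against a floor. The sketch below is a hypothetical helper, not a Proxmox tool, and the default threshold of 200 is an illustrative assumption rather than an official recommendation. Pick a value that matches your own storage class.

```shell
#!/usr/bin/env bash
# Reads pveperf output on stdin and succeeds if FSYNCS/SECOND >= threshold.
# The default threshold (200) is an assumption for illustration only.
fsyncs_ok() {
    local min=${1:-200}
    awk -v min="$min" '
        /^FSYNCS\/SECOND:/ { found = 1; value = $2 }
        END {
            if (!found) {
                print "FSYNCS/SECOND not found in input" > "/dev/stderr"
                exit 2
            }
            printf "fsyncs_per_second=%s threshold=%s\n", value, min
            exit ((value + 0 >= min + 0) ? 0 : 1)
        }'
}
```

On a node it might be used as `pveperf /var/lib/vz | fsyncs_ok 200`, run once per node before the cluster is created.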
Build the first node cleanly
Create the cluster on a single node first.
CLUSTER_NAME="pve-prod-01"
CLUSTER_IP="10.21.0.11"
# On Proxmox VE 6 and later, --link0 replaces the older --bindnet0_addr option.
pvecm create "$CLUSTER_NAME" --link0 "$CLUSTER_IP"

Then confirm the cluster state and Corosync view.
pvecm status
corosync-cfgtool -s
journalctl -u corosync --no-pager -n 50

If these checks already look unstable on the first node, stop there. Adding nodes will only hide the real issue for a while.
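Reading the status output by eye is easy to get wrong when you are in a hurry. A minimal sketch of an automated check follows. It assumes one vote per node and the usual `pvecm status` field labels (Quorate, Expected votes, Total votes), and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Reads `pvecm status` output on stdin.
# Succeeds only if the node is quorate and both vote counters equal the
# number of nodes expected at this stage (assumes one vote per node).
cluster_votes_ok() {
    local want_nodes=$1
    awk -v want="$want_nodes" '
        $1 == "Quorate:"                   { quorate = $2 }
        $1 == "Expected" && $2 == "votes:" { expected = $3 }
        $1 == "Total" && $2 == "votes:"    { total = $3 }
        END {
            printf "quorate=%s expected=%s total=%s\n", quorate, expected, total
            exit ((quorate == "Yes" && expected == want && total == want) ? 0 : 1)
        }'
}
```

On the first node this would be `pvecm status | cluster_votes_ok 1`. The same function can be reused with 2 and then 3 as further nodes join.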
Join the second and third nodes
On each additional node, use the cluster IP that belongs to the first node.
pvecm add 10.21.0.11

Validate after each join instead of adding everything in one sequence.
pvecm nodes
pvecm status
journalctl -u corosync --no-pager -n 100

Separate management, cluster traffic, and guest traffic when it matters
One flat network can work in a lab. It becomes harder to reason about under load, especially during storage replication, backup windows, and node recovery. The Proxmox documentation discusses separate cluster networks and redundancy as part of cluster design.
A minimal host network example:
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto vmbr0
iface vmbr0 inet static
    address 10.20.0.11/24
    gateway 10.20.0.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

auto vmbr1
iface vmbr1 inet static
    address 10.21.0.11/24
    bridge-ports eno2
    bridge-stp off
    bridge-fd 0
    mtu 9000

This is not the only valid design, but it is easier to troubleshoot than mixing everything on one interface and hoping VLAN separation will remain obvious to every operator.
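A 9000-byte MTU only helps if every hop on that network honors it. One way to check, sketched below under the assumption of IPv4 without header options and the Linux iputils `ping`, is to send pings that are not allowed to fragment. The function names and the peer address in the usage note are illustrative.

```shell
#!/usr/bin/env bash
# ICMP payload that exactly fills an IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Sends three pings with the "do not fragment" flag set (-M do).
# Fails if the path to the peer cannot carry a full-size frame.
check_path_mtu() {
    local peer=$1 mtu=${2:-9000}
    ping -M do -c 3 -s "$(icmp_payload_for_mtu "$mtu")" "$peer"
}
```

For example, `check_path_mtu 10.21.0.12 9000` from pve01, where 10.21.0.12 stands in for a second node's cluster address. A failure here usually points at a switch port or NIC that was left at 1500.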
Decide your storage model before creating important VMs
This is where many clusters become misleading.
A Proxmox cluster does not magically provide shared storage. You still need a deliberate model:
shared storage for mobility and HA
or
local storage with clearly accepted operational limits
Quick inventory:
pvesm status
cat /etc/pve/storage.cfg

If you use local storage only, say it clearly. Live migration, HA expectations, and recovery procedures will not look the same as on shared storage.
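The inventory can also be checked mechanically. The sketch below assumes the usual `pvesm status` column layout (Name, Type, Status first, after a header row) and flags anything whose status is not `active`. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Reads `pvesm status` output on stdin, skips the header row, and prints
# every storage whose Status column is not "active".
# Returns non-zero if at least one such storage was found.
inactive_storages() {
    awk '
        NR > 1 && $3 != "active" { print $1 " (" $2 "): " $3; bad = 1 }
        END { exit (bad ? 1 : 0) }'
}
```

Run as `pvesm status | inactive_storages` on each node. A storage that is active on one node and inactive on another is worth resolving before any VM depends on it.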
Example storage configuration snippet for an NFS target:
dir: local
    path /var/lib/vz
    content iso,vztmpl,backup

nfs: nfs-prod-01
    export /srv/pve
    path /mnt/pve/nfs-prod-01
    server 10.20.30.50
    content images,iso,backup,vztmpl
    options vers=4.1

Backups are part of the platform, not an afterthought
A cluster without an explicit backup and restore path is still a lab. Proxmox documents backup and restore separately because it is a first-class operational concern, not an accessory.
Basic scheduled backup example:
pvesh create /cluster/backup --storage pbs-prod-01 --mode snapshot --compress zstd --node pve01 --schedule 'mon..sat 23:00' --all 1

Validation is just as important as scheduling.
pvesh get /cluster/backup
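mkdir -p /tmp/vzdump-test
# --compress zstd makes the archive name match the .vma.zst pattern used below.
vzdump 101 --dumpdir /tmp/vzdump-test --mode snapshot --compress zstd
qmrestore /tmp/vzdump-test/vzdump-qemu-101-*.vma.zst 901 --storage nfs-prod-01 --unique 1

If restore has never been tested, backup is still an assumption.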
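When the dump directory holds more than one archive for the same VM, a shell wildcard expands to several files and the restore command no longer says which one is being tested. The hypothetical helper below picks the newest archive, assuming the default vzdump file naming in which the timestamp embedded in the name sorts chronologically.

```shell
#!/usr/bin/env bash
# Prints the newest vzdump archive for a VM ID in a directory.
# Relies on glob expansion being sorted and on the timestamp in the
# default file name (YYYY_MM_DD-HH_MM_SS) sorting chronologically.
# Returns non-zero if no matching archive exists.
latest_dump() {
    local dir=$1 vmid=$2 newest="" f
    for f in "$dir"/vzdump-qemu-"$vmid"-*.vma.zst; do
        [ -e "$f" ] || continue   # unmatched glob stays literal; skip it
        newest=$f                 # last match is the newest
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

A restore test could then read `qmrestore "$(latest_dump /tmp/vzdump-test 101)" 901 --storage nfs-prod-01 --unique 1`.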
HA is not a checkbox
Do not present a cluster as highly available only because the HA panel exists. HA requires a consistent quorum model, storage decisions that support restart or relocation, and service dependency awareness. The Proxmox documentation separates HA, storage replication, and cluster design for a reason.
Useful checks before enabling HA on a VM:
ha-manager status
pvesh get /cluster/resources --type vm
pvecm status
pvesm status

Failure patterns that appear quickly in real environments
Two node clusters presented as production
Two nodes can work for specific constrained use cases, but with only two votes, losing either node loses quorum unless an external vote such as a Corosync QDevice is added. Many teams build a two-node cluster and then act surprised when maintenance becomes awkward.
Storage capability assumed from cluster membership
Cluster membership does not turn local disks into shared storage.
One noisy network for everything
If management, cluster communication, replication, and guest traffic all share one path, diagnosis under stress becomes harder than it needs to be.
Backup target reachable but restore never tested
This is one of the most common false positives in virtualization operations.
Overusing local storage with migration expectations
Local storage is not wrong. It is wrong only when the operational model pretends it behaves like shared storage.
Validation runbook before calling it production
pveversion -v
pvecm status
pvecm nodes
corosync-cfgtool -s
pvesm status
ha-manager status
pvesh get /nodes
pvesh get /cluster/resources
pvesh get /cluster/backup
journalctl -u corosync --no-pager -n 100
journalctl -u pvedaemon --no-pager -n 100

Then test the things that operators usually skip.
Live migration or expected equivalent behavior.
A controlled node reboot.
A restore into a non-production VM ID.
A backup window with normal traffic still present.
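The command list in this runbook can be wrapped in a small pass/fail harness so the same checks run identically on every node. This is a sketch with a hypothetical function name. It only reports whether each command exited cleanly, so it complements reading the output and does not replace it, and it does not cover the manual tests listed above.

```shell
#!/usr/bin/env bash
# Runs each argument as a shell command, prints PASS or FAIL per command,
# and returns non-zero if any command failed.
run_checks() {
    local failed=0 cmd
    for cmd in "$@"; do
        if bash -c "$cmd" >/dev/null 2>&1; then
            printf 'PASS  %s\n' "$cmd"
        else
            printf 'FAIL  %s\n' "$cmd"
            failed=1
        fi
    done
    return "$failed"
}
```

A typical call would be `run_checks "pvecm status" "pvesm status" "ha-manager status" "pvesh get /cluster/backup"`, with the result kept alongside the change record for the build.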
Design rules worth keeping
Three nodes is the sensible baseline when quorum matters.
Storage model must be explicit before service expectations are discussed.
Backups and restore testing belong in the platform build phase.
Network separation is not mandatory everywhere, but ambiguity is expensive.
If the platform is really a lab, call it a lab. That makes the design honest and the expectations manageable.