Introduction:
Vault requires a quorum of unsealed nodes to operate in High Availability (HA) mode. In Kubernetes, Vault is often deployed using the Helm chart with integrated storage like Raft. If multiple pods go down or restart, Vault may lose quorum, preventing leadership election and effectively bringing the service to a halt.
This article walks you through the steps to recover Vault quorum in Kubernetes when running with Raft Integrated Storage.
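The quorum requirement is simple majority math: for n voting Raft nodes, floor(n/2) + 1 of them must be healthy before a leader can be elected. A quick illustration:

```shell
# Raft quorum: a majority of voting nodes must be up for leader election.
# For n voters, quorum = n/2 + 1 (integer division).
for n in 1 3 5; do
  echo "nodes=$n quorum=$(( n / 2 + 1 ))"
done
```

This is also why the recovery below scales the StatefulSet down to a single replica: a one-node cluster has a quorum of 1, so one recovered node can elect itself leader.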
Prerequisites:
- Administrative access to your Kubernetes cluster (kubectl configured).
- Access to the Vault Helm deployment (helm list, kubectl get pods).
- Access to at least one previously unsealed Vault pod.
- Vault initialized with Raft integrated storage.
- The required number of Vault unseal keys (needed if no auto-unseal mechanism is configured for Vault).
Steps to recover:
1: Set the replica count to 1.
kubectl scale statefulsets -n vault vault --replicas=1
2: Check the kubectl get pods output to get the right pod name, then exec into it.
kubectl exec -ti -n vault vault-0 -- /bin/sh
3: Find the storage directory path from cat /tmp/storageconfig.hcl inside the pod, or by reading the Vault config map.
4: Find the node ID before creating the peers.json file. In the StatefulSet, if setNodeId is true, take the value from the VAULT_RAFT_NODE_ID environment variable. If setNodeId is set to false, read the /vault/data/node-id file (cat /vault/data/node-id). Make sure to put the same value in the peers.json file.
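The lookup logic of step 4 can be sketched as a small shell snippet (meant to run inside the pod; the mktemp fallback exists only so the sketch can be dry-run outside a pod, where /vault/data/node-id does not exist):

```shell
# Resolve the Raft node ID, mirroring step 4.
DATA_PATH=${DATA_PATH:-/vault/data}
if [ ! -r "$DATA_PATH/node-id" ] && [ -z "$VAULT_RAFT_NODE_ID" ]; then
  # Local dry-run fallback only: fabricate a node-id file (not needed in the pod).
  DATA_PATH=$(mktemp -d)
  printf 'vault-0' > "$DATA_PATH/node-id"
fi

if [ -n "$VAULT_RAFT_NODE_ID" ]; then
  NODE_ID=$VAULT_RAFT_NODE_ID          # setNodeId: true -> chart injects this env var
else
  NODE_ID=$(cat "$DATA_PATH/node-id")  # setNodeId: false -> ID persisted on disk
fi
echo "node id: $NODE_ID"
```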
5: Create the peers.json file in the Raft storage path. Check the pod's environment variables to get the cluster address (VAULT_CLUSTER_ADDR).
cd /storage/data/raft
vi peers.json
[
  {
    "id": "<node_id>",
    "address": "<VAULT_CLUSTER_ADDR>:8201",
    "non_voter": false
  }
]
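If you prefer generating the file over editing it with vi, a heredoc works too. This is a sketch: RAFT_PATH, NODE_ID and CLUSTER_ADDR default to placeholder values for a local dry run and must be replaced with the real values found in steps 3 and 4.

```shell
# Write a single-entry peers.json for recovery.
RAFT_PATH=${RAFT_PATH:-$(mktemp -d)}                  # in the pod: the Raft storage path
NODE_ID=${NODE_ID:-vault-0}                           # from step 4
CLUSTER_ADDR=${CLUSTER_ADDR:-vault-0.vault-internal}  # host part of VAULT_CLUSTER_ADDR

cat > "$RAFT_PATH/peers.json" <<EOF
[
  {
    "id": "$NODE_ID",
    "address": "$CLUSTER_ADDR:8201",
    "non_voter": false
  }
]
EOF
cat "$RAFT_PATH/peers.json"
```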
6: Save the file and exit the pod, then delete the pod so the StatefulSet recreates it. The pod will be scheduled on any available node. Once the pod is in Running state, exec into it, unseal it if no auto-unseal mechanism is configured, and validate.
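The validation could look like the following sketch, assuming the vault namespace and vault-0 pod name used throughout this article; the kubectl guard simply makes the snippet a no-op where no cluster tooling is available.

```shell
NAMESPACE=vault
POD=vault-0

# Skip silently when kubectl is not on PATH (e.g. dry-running this sketch locally).
if command -v kubectl >/dev/null 2>&1; then
  # Should report Initialized true and, after unsealing, HA Mode: active.
  kubectl exec -n "$NAMESPACE" "$POD" -- vault status
  # Should list exactly the one peer from peers.json, marked as leader
  # (this subcommand needs a valid VAULT_TOKEN inside the pod).
  kubectl exec -n "$NAMESPACE" "$POD" -- vault operator raft list-peers
fi
```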
7: Once step 6 is completed and the Vault pod is in active state, scale the StatefulSet back up to the desired number of replicas.
kubectl scale statefulsets -n vault vault --replicas=<desired number of pods>
8: Exec into each new pod and delete the raft data directory and vault.db, then recreate the pod. Provided the Vault config has a retry_join stanza, each new pod will join the leader node.
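Step 8's cleanup, as a sketch to run inside each new standby pod. DATA_PATH is the storage path found in step 3; the two setup lines only fabricate the files for a local dry run and are harmless in a real pod, since those files are deleted immediately afterwards.

```shell
# Wipe stale Raft state inside each NEW standby pod so it can rejoin cleanly.
DATA_PATH=${DATA_PATH:-$(mktemp -d)}   # in the pod: the storage path from step 3

# Dry-run setup: ensure the files a real pod would have exist.
mkdir -p "$DATA_PATH/raft"
: > "$DATA_PATH/vault.db"

rm -rf "$DATA_PATH/raft" "$DATA_PATH/vault.db"   # delete raft data and vault.db
echo "cleaned: $DATA_PATH"
```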