DISCLAIMER: This article should be used only when the quorum in the Raft cluster is lost PERMANENTLY, and there is no way to recover enough nodes to reach quorum (elect a leader).
Introduction :
The Integrated Storage option (also known as Raft) was introduced in Vault 1.4, and more and more practitioners are adopting it as their main storage backend for Vault.
Until now, many practitioners have used Consul as their storage backend for Vault. Because of this, maintaining quorum within the cluster was a Consul task. With Integrated Storage within Vault, maintaining quorum must now be considered as part of your Vault environment.
The cluster quorum is updated dynamically as more nodes are joined to the cluster. The quorum needed for a cluster to be able to perform read and write operations can be calculated using the formula (n+1)/2, rounded up, where n is the total number of nodes in the cluster. For example, if the cluster has 3 nodes, then (3+1)/2 = 2, so 2 nodes need to be operational in order for the cluster to function properly.
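Applying the same formula to other common cluster sizes (assuming all nodes are voters): a 5-node cluster needs (5+1)/2 = 3 operational nodes, and a 7-node cluster needs (7+1)/2 = 4.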
Note: There is an exception to this rule if the `-non-voter` option is used when joining the cluster; this option is only available in Vault Enterprise. Voter status can be checked by referencing the `Voter` column in the output of `vault operator raft list-peers`.
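For illustration, the peer listing of a healthy three-node cluster might look like the following (node names and addresses are examples):

```
$ vault operator raft list-peers
Node      Address             State       Voter
----      -------             -----       -----
vault1    192.168.0.1:8201    leader      true
vault2    192.168.0.2:8201    follower    true
vault3    192.168.0.3:8201    follower    true
```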
Use Case :
The typical use case is a three-node Vault Raft cluster in which two of the nodes are permanently lost with no method of recovery, while the third node is in perfect working condition. One functioning node is not enough to reach the required quorum.
When a quorum is not reached (no leader in the cluster), no operations like reads and writes can be performed within the cluster, as at least two nodes in this three-node cluster need to be functioning in order to reach a quorum.
The names and statuses of the nodes in the cluster are reflected below :
vault1 - up and healthy
vault2 - down, not recoverable
vault3 - down, not recoverable
Procedure :
- Login to the healthy node, in this case `vault1`.
- Stop the Vault service on `vault1`. If you are using systemd you can execute `systemctl stop vault`. At this point the Vault service should be stopped on all nodes within the cluster, even if you have other healthy nodes.
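  A minimal check, assuming a systemd unit named `vault` (adjust to your environment):

  ```
  # Stop Vault on this node and confirm the unit is no longer active
  $ sudo systemctl stop vault
  $ systemctl is-active vault
  inactive
  ```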
- Locate the Raft storage directory. It is defined in the configuration file (typically `vault.hcl`) that Vault is using. The stanza looks like this:

  ```
  storage "raft" {
    path    = "/opt/vault/data"
    node_id = "vault1"
  }
  ```

  We will follow the example above of `/opt/vault/data`.
- Inside the data directory, you should see a folder named `raft`:

  ```
  $ tree /opt/vault/data
  /opt/vault/data
  ├── raft
  │   ├── raft.db
  │   └── snapshots
  └── vault.db

  2 directories, 2 files
  ```

- Within the `raft` directory create a file named `peers.json`. In this example, the full file path would be `/opt/vault/data/raft/peers.json`:

  ```
  $ tree /opt/vault/data
  /opt/vault/data
  ├── raft
  │   ├── raft.db
  │   ├── peers.json
  │   └── snapshots
  └── vault.db

  2 directories, 3 files
  ```

- Edit the file with the following content:
[ { "id": "vault1", "address": "192.168.0.1:8201", "non_voter": false } ]
  Update the `id` value so it matches the value of the `node_id` parameter specified in the Vault configuration file, and set the `address` value to match the `cluster_addr` parameter in the Vault configuration file. This value must be either an IP address or an FQDN, appended with the cluster port (by default `8201`). The value entered here must be reachable from all other Vault nodes.

  Note that the `node_id` may also be specified in an automatically generated file called `node_id` under the raft directory.
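  As a quick sanity check, assuming both parameters are set in `vault.hcl` rather than via environment variables (adjust the file path to your environment), you can pull the values to mirror in `peers.json` straight from the configuration file:

  ```
  $ grep -E 'node_id|cluster_addr' vault.hcl
    node_id = "vault1"
  cluster_addr = "https://192.168.0.1:8201"
  ```

  Note that `cluster_addr` carries an `https://` scheme, while the `address` field in `peers.json` takes only the host and cluster port.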
- As the recovery process will read and then delete the `peers.json` file, we suggest making a copy of the `peers.json` file and saving it elsewhere on the filesystem, should another attempt at recovery be necessary.
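  For example, assuming the data path used above:

  ```
  $ cp /opt/vault/data/raft/peers.json /opt/vault/peers.json.bak
  ```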
- Start the Vault service. If you are using systemd you can execute `systemctl start vault`. Sending the SIGHUP signal to the Vault process will not work.
- Unseal Vault. If an auto-unseal method is being used, this step is not necessary. Confirm that Vault is unsealed by running `vault status` and checking that the value for `Sealed` is `false`.
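  For a Shamir-sealed cluster, a minimal sketch of this step might look like the following:

  ```
  $ vault operator unseal     # repeat with additional key shares until the unseal threshold is met
  $ vault status | grep Sealed
  Sealed          false
  ```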
- If the procedure worked so far, you should see messages like the following in the system logs while starting Vault:

  ```
  2020-06-18T09:55:05.014Z [INFO] storage.raft: raft recovery initiated: recovery_file=peers.json
  2020-06-18T09:55:05.019Z [INFO] storage.raft: raft recovery found new config: config="{[{Voter vault1 192.168.0.1:8201}]}"
  2020-06-18T09:55:05.024Z [INFO] storage.raft: raft recovery deleted peers.json
  ```

  Running `journalctl -u vault --no-pager` on the server is a common method of obtaining these logs. The `peers.json` file should no longer be present at this point, as Vault has consumed its content.

- Check the output of `vault status` to confirm the value for `HA Mode` is `active`.
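  The output will vary with your Vault version and seal type; the fields to check are `Sealed` and `HA Mode` (values below are illustrative):

  ```
  $ vault status
  Key             Value
  ---             -----
  Seal Type       shamir
  Initialized     true
  Sealed          false
  ...
  HA Enabled      true
  HA Mode         active
  ```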
- Now you should have a cluster consisting of one node (`vault1`) that is the active/leader node; a quorum of one is reached, and reads and writes to the storage are allowed. You can verify that there is only one node in the cluster with `vault operator raft list-peers`.
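  In this recovered state the output should show only the one voter (the address shown is the example from above):

  ```
  $ vault operator raft list-peers
  Node      Address             State     Voter
  ----      -------             -----     -----
  vault1    192.168.0.1:8201    leader    true
  ```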
- Once additional nodes are ready to be joined to the Vault cluster, they can be joined using either their existing `retry_join` configuration or by using the `vault operator raft join` command (a sketch of both options is shown below).
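A minimal sketch of both options, assuming `vault1` is reachable at the API address `https://192.168.0.1:8200` and that the rebuilt node's storage path and TLS settings match your environment:

```
# Option 1: declaratively, in the rebuilt node's vault.hcl (node name is an example)
storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault2"

  retry_join {
    leader_api_addr = "https://192.168.0.1:8200"
  }
}
```

```
# Option 2: manually, from the rebuilt node once its Vault service is running
$ vault operator raft join https://192.168.0.1:8200
```

After joining, run `vault operator raft list-peers` on the leader again to confirm the new node appears in the peer list.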