The information contained in this article has been verified as up-to-date on the date of the original publication of the article. HashiCorp endeavors to keep this information up-to-date and correct, but it makes no representations or warranties of any kind, express or implied, about the ongoing completeness, accuracy, reliability, or suitability of the information provided.
All information contained in this article is for general information purposes only. Any reliance you place on such information as it applies to your use of your HashiCorp product is therefore strictly at your own risk.
Introduction
This document outlines the manual procedure for refreshing the Persistent Volume Claim (PVC) data directory for HashiCorp Consul server pods running on Kubernetes. This process can be necessary in certain recovery scenarios or when addressing persistent data corruption issues. The steps involve identifying Consul server pods, including the Raft leader, and sequentially deleting follower pods along with their associated PVCs, allowing Kubernetes to recreate them with fresh storage. Finally, the Raft leader's PVC and pod are refreshed.
Expected Outcome
- Consul Server Pod Restarts: All Consul server pods in the specified namespace, including the Raft leader and followers, will be restarted.
- Refreshed PVC Data Directories: The Persistent Volume Claims associated with each Consul server pod will be deleted, and upon pod recreation, new PVCs will be provisioned, effectively providing the Consul servers with fresh data directories.
- Raft Cluster Rejoining: Each restarted Consul server pod will successfully rejoin the existing Raft consensus cluster. The consul operator raft list-peers command will show all of the original server pod names as healthy members of the cluster, with one designated as the leader.
- Potential Resolution of Data Issues: If the procedure was performed to address data corruption or inconsistencies within the Consul server's persistent storage, the refreshed data directories should resolve these issues, allowing the Consul servers to operate correctly.
- Minimal Downtime (Rolling Restart): By restarting follower nodes before the leader, this procedure aims to minimize disruption to the Consul service mesh. While a brief leader election may occur when the leader pod is restarted, the overall impact on service availability should be limited if the cluster was healthy initially.
It's important to note that while this process refreshes the persistent storage, it does not address underlying configuration issues or problems external to the Consul server's data directory. If problems persist after this procedure, further investigation into other aspects of the Consul deployment is necessary.
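For reference, a healthy three-server cluster reports peers similar to the following. The node names, IDs, and addresses below are illustrative only, and the exact columns can vary by Consul version:

Node             ID                                    Address         State     Voter  RaftProtocol
consul-server-0  11111111-2222-3333-4444-555555555555  10.0.0.10:8300  leader    true   3
consul-server-1  11111111-2222-3333-4444-666666666666  10.0.0.11:8300  follower  true   3
consul-server-2  11111111-2222-3333-4444-777777777777  10.0.0.12:8300  follower  true   3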
Prerequisites
- Access to the Kubernetes Cluster: You must have kubectl configured and authorized to interact with the Kubernetes cluster where the Consul service mesh is deployed. This includes the ability to execute commands within namespaces and manage pods and PVCs. Crucially, ensure you have cluster administrator privileges to perform actions such as deleting pods and PVCs.
- Identification of the Consul Namespace: You need to know the specific Kubernetes namespace where your Consul servers are running. This namespace is used in all kubectl commands.
- Stable Kubernetes Cluster: The underlying Kubernetes cluster should be in a stable state with healthy nodes. Issues at the Kubernetes infrastructure level can complicate or hinder the successful execution of this procedure.
- Verify Consul Cluster Health: Confirm that the existing Consul cluster is running and all server nodes are reporting as healthy before initiating this refresh process. An unhealthy cluster can lead to further complications.
- Familiarity with kubectl and the Consul Helm Chart: You should be comfortable using common kubectl commands for managing Kubernetes resources (pods, PVCs, namespaces). If your Consul deployment was managed via the Helm chart, a general understanding of the chart's setup and configuration is also recommended.
- Backup Existing Consul Data (If Necessary): Depending on the reason for this refresh and your organization's data retention policies, consider backing up the existing Consul persistent volume data before proceeding (a snapshot-based sketch follows this list). This provides a rollback option in case of unforeseen issues.
- Confirm Recommended Consul Version: Ensure that your Consul deployment is running a recommended and stable version. Performing this procedure on outdated or unsupported versions might introduce additional risks.
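If you decide to take a backup, a minimal sketch using a Consul snapshot is shown below. It assumes a server pod named consul-server-0, a Consul container named consul, and that any required ACL token is already available to the Consul CLI inside the pod; adjust the names and paths for your deployment.

Take a point-in-time snapshot inside the server pod:
kubectl exec -n <namespace> consul-server-0 -c consul -- consul snapshot save /tmp/consul-backup.snap

Copy the snapshot off the pod for safekeeping:
kubectl cp <namespace>/consul-server-0:/tmp/consul-backup.snap ./consul-backup.snap -c consul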
Use Case
Refreshing the Persistent Volume Claim (PVC) data directory for Consul servers can be a necessary operation in specific scenarios to maintain the health and integrity of the Consul cluster. Here are some key use cases:
- Addressing Persistent Data Corruption: If the underlying storage for the Consul server PVCs experiences corruption or file system errors that affect Consul's ability to operate correctly, refreshing the PVCs can provide a clean slate for Consul to initialize and rejoin the Raft cluster with consistent data.
- Recovering from Data Inconsistencies: In situations where the Raft consensus group experiences severe data inconsistencies that cannot be resolved through normal Raft operations, refreshing the data directories of follower nodes (and potentially the leader, with careful consideration) can help to establish a consistent state across the cluster.
- Mitigating Storage-Related Issues: If there are known issues or limitations with the current persistent storage solution that are impacting Consul's stability or performance, refreshing the PVCs onto potentially different or newly configured storage can help mitigate these problems.
- Testing Disaster Recovery Procedures: While not a routine operation, performing a PVC refresh in a controlled environment can be part of a larger disaster recovery testing strategy to validate the process of recovering Consul server state and ensuring the cluster can be rebuilt if underlying storage becomes unavailable.
- Resolving Specific Failure Scenarios: Certain failure modes, such as prolonged network partitions or node failures leading to data divergence, might necessitate a PVC refresh as part of the recovery process to bring the Consul servers back into a healthy and consistent state.
- Performing Maintenance or Upgrades Requiring Clean State: In rare cases, a significant Consul upgrade or a major change in the underlying infrastructure might require starting the Consul servers with a clean data directory to avoid potential conflicts or legacy data issues.
- Troubleshooting Persistent Leadership Issues: If the Consul cluster experiences persistent issues with leader election or stability that are suspected to be related to the leader's persistent storage, refreshing the leader's PVC (with careful consideration and potential data backup) can be a troubleshooting step.
Important Considerations:
- Refreshing PVCs will result in data loss for the affected servers. Ensure you understand the implications and have appropriate backups if necessary.
- This procedure should be performed with caution and only when necessary, as it disrupts the Consul server nodes.
- Always monitor the Consul cluster health closely after performing a PVC refresh to ensure all servers rejoin the Raft group successfully.
By understanding these use cases, operators can make informed decisions about when and why refreshing Consul server PVC data directories might be a necessary step in maintaining a healthy and resilient Consul service mesh.
Procedure
Consider the following critical points while executing these steps:
- Avoid deleting PVCs for more than one server at a time to prevent losing quorum.
- Ensure each pod is fully functional before proceeding to the next.
- Monitor cluster health between steps to ensure stability.
Note: While these steps provide a manual method, an automated POSIX-compliant script is available here as a template for this task. Operators should exercise caution and thoroughly understand the script before utilizing it in a production environment, as its use is at their own risk.
Step 1: Verify Cluster Health
- Check the health of the Consul servers:
kubectl exec -it -n <namespace> pod/consul-server-0 -c consul -- consul operator raft list-peers
- Confirm all server nodes are healthy and identify the Raft leader:
RAFT_LEADER=$(kubectl exec -n <namespace> pod/consul-server-0 -c consul -- consul operator raft list-peers | awk '/leader/ {print $1}' | tr -d '\n')
- Note the Raft leader for use in later steps.
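Before continuing, it can help to confirm that the variable was actually populated, since an empty value would cause the leader-handling commands in Step 4 to fail. A small sketch (the message text is illustrative):

if [ -n "$RAFT_LEADER" ]; then echo "Raft leader: $RAFT_LEADER"; else echo "Could not determine the Raft leader; re-check cluster health before continuing." >&2; fi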
Step 2: Retrieve Consul Server Pods
- List all Consul server pods:
PODS=$(kubectl get pods -l app=consul,component=server -n <namespace> -o jsonpath='{.items[*].metadata.name}')
Step 3: Restart Pods with PVC Deletion
For each follower (non-leader) pod in the list, repeat the following steps, with $POD set to the name of the pod being processed (a combined loop sketch is provided after this list):
- Identify the PVC associated with the pod before deleting it (the claim name is read from the pod spec, so it must be captured while the pod still exists):
PVC=$(kubectl get pod "$POD" -n <namespace> -o jsonpath='{.spec.volumes[?(@.persistentVolumeClaim)].persistentVolumeClaim.claimName}')
- Delete the pod to trigger termination:
kubectl delete pod "$POD" -n <namespace> &
- Immediately delete the associated PVC:
kubectl delete pvc "$PVC" -n <namespace>
- Wait for the pod to be recreated:
while ! kubectl get pod "$POD" -n <namespace> >/dev/null 2>&1; do sleep 5; done
- Wait for the pod to become ready:
kubectl wait -n <namespace> --for=condition=ready pod "$POD" --timeout=30s
- Verify the pod has rejoined the Raft cluster:
kubectl exec -it -n <namespace> pod/consul-server-0 -c consul -- consul operator raft list-peers | grep "$POD"
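Taken together, the steps above can be wrapped in a single POSIX shell loop along the following lines. This is only a sketch: it assumes $PODS and $RAFT_LEADER were set in the earlier steps, that <namespace> is replaced with your Consul namespace, and that the cluster is otherwise healthy. Review and test it outside production before relying on it.

for POD in $PODS; do
  # Skip the Raft leader; it is handled separately in Step 4
  [ "$POD" = "$RAFT_LEADER" ] && continue
  # Capture the claim name while the pod still exists
  PVC=$(kubectl get pod "$POD" -n <namespace> -o jsonpath='{.spec.volumes[?(@.persistentVolumeClaim)].persistentVolumeClaim.claimName}')
  # Delete the pod and its PVC so a fresh volume is provisioned on recreation
  kubectl delete pod "$POD" -n <namespace> &
  kubectl delete pvc "$PVC" -n <namespace>
  # Wait for the replacement pod to appear and become ready
  while ! kubectl get pod "$POD" -n <namespace> >/dev/null 2>&1; do sleep 5; done
  kubectl wait -n <namespace> --for=condition=ready pod "$POD" --timeout=30s
  # Confirm the server has rejoined the Raft cluster before moving on
  kubectl exec -n <namespace> pod/consul-server-0 -c consul -- consul operator raft list-peers | grep "$POD"
done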
Step 4: Handle the Raft Leader Pod
- Identify the PVC for the Raft leader pod before deleting it:
RAFT_LEADER_PVC=$(kubectl get pod "$RAFT_LEADER" -n <namespace> -o jsonpath='{.spec.volumes[?(@.persistentVolumeClaim)].persistentVolumeClaim.claimName}')
- Delete the Raft leader pod (this triggers a brief leader election among the remaining servers):
kubectl delete pod "$RAFT_LEADER" -n <namespace> &
- Delete the associated PVC:
kubectl delete pvc "$RAFT_LEADER_PVC" -n <namespace>
- Wait for the Raft leader pod to be recreated:
while ! kubectl get pod "$RAFT_LEADER" -n <namespace> >/dev/null 2>&1; do sleep 5; done
- Wait for the pod to become ready:
kubectl wait -n <namespace> --for=condition=ready pod "$RAFT_LEADER" --timeout=30s
- Verify the former leader has rejoined the cluster:
kubectl exec -it -n <namespace> pod/consul-server-0 -c consul -- consul operator raft list-peers | grep "$RAFT_LEADER"
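As a final check, it can be helpful to confirm that every server appears in the peer list and that exactly one leader is reported. A small sketch (the second command should print 1 in a healthy cluster):

kubectl exec -n <namespace> pod/consul-server-0 -c consul -- consul operator raft list-peers
kubectl exec -n <namespace> pod/consul-server-0 -c consul -- consul operator raft list-peers | grep -c leader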