Introduction
This article provides a resolution for Consul Enterprise server pods failing to start during a Helm chart upgrade due to an unsupported locality configuration key. This issue occurs when using Helm chart versions that inject locality configuration into Consul versions that don't support it.
Problem
When upgrading the Consul Helm chart to version 1.2.0 or later while running Consul Enterprise versions prior to 1.17, the Consul server pods fail to start and enter a CrashLoopBackOff state. The pod logs show the following error:
```
==> failed to parse /consul/extra-config/locality.json: 1 error occurred:
	* invalid config key locality
```
Environment
This issue occurs in the following environment:
- Kubernetes: Any Kubernetes distribution (GKE, EKS, AKS, etc.)
- Consul Version: Enterprise 1.15.x or 1.16.x (any version prior to 1.17)
- Helm Chart Version: 1.2.0 or later
- Key Requirement: Kubernetes nodes have topology labels (`topology.kubernetes.io/region` or `topology.kubernetes.io/zone`); a quick way to check is shown below
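To check for these labels, you can print them as node columns; blank values mean the labels are absent:
```
# Show each node's topology labels as columns (blank columns mean the labels are absent)
kubectl get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone
```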
Cause
The root cause is a feature mismatch between the Helm chart and the Consul binary version:
What happens:
- Helm chart version 1.2.0+ unconditionally injects an init container named `locality-init` into the Consul server pods
- The `locality-init` container runs the command `consul-k8s-control-plane fetch-server-region -node-name "$NODE_NAME" -output-file /consul/extra-config/locality.json`
- This command reads Kubernetes node topology labels and generates a locality configuration file (a sketch of the generated file follows this list)
- Consul Enterprise versions prior to 1.17 don't recognize the `locality` configuration key (the feature was added in Consul 1.17)
- The Consul server process fails on startup when it tries to parse the invalid configuration
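For illustration, the sketch below approximates what that step produces by reading the same node labels from outside the cluster. The JSON shape is an assumption inferred from the error message, not taken from the chart source:
```
# Hedged sketch: approximate the locality-init output for the node running consul-server-0.
# The JSON shape is assumed from the "invalid config key locality" error, not from the chart source.
NODE_NAME=$(kubectl get pod consul-server-0 -n consul -o jsonpath='{.spec.nodeName}')
REGION=$(kubectl get node "$NODE_NAME" -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/region}')
ZONE=$(kubectl get node "$NODE_NAME" -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}')

# On a labeled node this prints something like {"locality": {"region": "us-east-1", "zone": "us-east-1a"}};
# Consul servers older than 1.17 reject the unknown "locality" key at startup.
echo "{\"locality\": {\"region\": \"$REGION\", \"zone\": \"$ZONE\"}}"
```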
Why it appears in some environments but not others:
- Production environments (GKE, EKS, AKS): Cloud providers automatically add topology labels to nodes (region/zone information), causing `locality.json` to be populated with configuration data
- Development/test environments: Local clusters often lack topology labels, resulting in an empty or minimal `locality.json` file that doesn't trigger the error
Important Note: The customer in the original incident reported upgrading to Helm chart version 1.1.18, but source code analysis confirms that `locality-init` was not introduced until version 1.2.0. This suggests the actual chart version used was 1.2.x or later.
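If you are unsure which chart version a release was actually upgraded to, the release history records the chart used at each revision. A quick way to check (assuming the release name `consul` and namespace `consul` used elsewhere in this article):
```
# Show the chart version used by each revision of the release
helm history consul -n consul

# List the chart versions published in the HashiCorp repository
helm search repo hashicorp/consul --versions | head -20
```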
Solutions
There are two reliable approaches to resolve this issue. Choose the option that best fits your environment and requirements.
Option 1: Upgrade Consul to Version 1.17+ (Recommended)
The proper long-term solution is to upgrade to a Consul version that supports the locality feature.
NOTE: It is best to upgrade to 1.21+ent, as that release provides long-term support (LTS).
Steps:
1. Update your Helm values file to use Consul Enterprise 1.17.0 or later:
```
global:
  image: hashicorp/consul-enterprise:1.17.0-ent
```
2. Apply the upgrade following HashiCorp's recommended upgrade procedures for Consul on Kubernetes:
```
# Note: Replace '<consul-values>.yaml' with the actual path to your values file
helm upgrade consul hashicorp/consul \
  --version 1.2.0 \
  --namespace consul \
  --values <consul-values>.yaml
```
3. Verify the upgrade:
```
kubectl get pods -n consul
kubectl exec -n consul consul-server-0 -- consul version
```
Advantages:
- Enables use of locality-aware routing features
- Future-proof solution that aligns chart capabilities with Consul features
- No functionality loss
Option 2: Downgrade Helm Chart to 1.1.x
If upgrading Consul is not immediately feasible, downgrade to a Helm chart version without locality-init.
Prerequisites:
- Ensure the HashiCorp Helm repository is added:
```
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update
```
Steps:
1. Identify the current chart version:
```
# Note: If running inside a K3s VM, set KUBECONFIG first:
# export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
helm list -n consul
```
2. Downgrade to Helm chart version 1.1.18 or earlier:
```
# Note: Replace '<consul-values>.yaml' with the actual path to your values file
# For K3s reproduction environment, use: /tmp/k3s-consul-test/consul-values-reproduce.yaml
helm upgrade consul hashicorp/consul \
  --version 1.1.18 \
  --namespace consul \
  --values <consul-values>.yaml
```
3. Delete existing pods to force recreation with the downgraded chart:
```
# Delete server pods
kubectl delete pod -n consul -l component=server

# Delete client pods
kubectl delete pod -n consul -l component=client
```
Advantages:
- Immediate resolution without changing Consul version
- No configuration changes needed
- Compatible with existing Consul 1.15.x/1.16.x deployments
Disadvantages:
- Misses newer Helm chart features and bug fixes
- Not a long-term solution
Manual Fix Procedure
If you need to manually fix a failed deployment:
1. Verify the issue:
```
# Check pod status
kubectl get pods -n consul

# Check for locality-init container (should show "locality-init")
kubectl get pod consul-server-0 -n consul -o jsonpath='{.spec.initContainers[*].name}'; echo

# View error logs
kubectl logs consul-server-0 -n consul
```
2. Check your versions:
```
# Note: If running inside a K3s VM, set KUBECONFIG first:
# export KUBECONFIG=/etc/rancher/k3s/k3s.yaml

# Helm chart version
helm list -n consul

# Consul version in the pod spec
kubectl get pod consul-server-0 -n consul -o jsonpath='{.spec.containers[?(@.name=="consul")].image}'; echo
```
3. Verify node topology labels:
```
# Check if nodes have topology labels (may show nothing if labels are not present)
kubectl get nodes --show-labels | grep topology

# If this prints topology labels, locality-init populates locality.json with them,
# which is what triggers the parse error on Consul versions prior to 1.17.
```
4. Apply the appropriate solution from the options above
5. Force pod recreation:
```
# Delete the server pods (StatefulSet will recreate them)
kubectl delete pod -n consul -l component=server

# Delete the client pods (DaemonSet will recreate them)
kubectl delete pod -n consul -l component=client

# Or for GitOps tools like ArgoCD, force a sync:
argocd app sync consul --force
```
6. Verify the fix:
```
# Pods should be running
kubectl get pods -n consul

# Check if locality-init is gone (Option 2) or if Consul accepts the config (Option 1)
kubectl get pod consul-server-0 -n consul -o jsonpath='{.spec.initContainers[*].name}'; echo

# Verify Consul is healthy
kubectl exec -n consul consul-server-0 -- consul members
```
Outcome
After applying Option 1 or Option 2:
- The `locality-init` container is either removed (downgrade option) or Consul successfully processes the locality configuration (upgrade option)
- Consul server pods start successfully and reach Running state
- The cluster becomes healthy and operational
- The Consul upgrade can be completed
Additional Information
Version Compatibility Matrix
| Consul Version | Helm Chart Version | locality-init | Supports locality config |
|---|---|---|---|
| 1.15.x / 1.16.x | 1.1.x | ❌ Not present | ❌ No |
| 1.15.x / 1.16.x | 1.2.0+ | ✅ Injected | ❌ No (ERROR) |
| 1.17.0+ | 1.2.0+ | ✅ Injected | ✅ Yes |
Key Technical Details
- Chart Version Clarification: The `locality-init` init container was introduced in Helm chart version 1.2.0, not 1.1.18 as originally reported. Source code analysis of the consul-k8s repository confirms this timeline.
- Unconditional Injection: In Helm chart 1.2.0, the `locality-init` container is unconditionally injected into server pods. It is not controlled by the `global.federation.enabled` setting or any other configuration option (see the inspection sketch after this list).
- Topology Label Dependency: The severity of this issue depends on the presence of Kubernetes node topology labels:
  - With labels (`topology.kubernetes.io/region` or `topology.kubernetes.io/zone`): `locality.json` is populated with configuration data → Consul fails
  - Without labels: `locality.json` is empty or minimal → no error occurs
- Cloud Provider Behavior: Major cloud providers (GKE, EKS, AKS) automatically add topology labels to nodes, which is why this issue commonly appears in production environments but not in local development clusters.
- StatefulSet Reconciliation: GitOps tools like ArgoCD may not automatically apply changes to a StatefulSet's pod template. A forced sync or manual pod deletion is often required to ensure the new configuration takes effect.
- Federation vs Locality: Despite customer reports suggesting federation configuration was the trigger, source code analysis shows no direct relationship between `global.federation.enabled` and `locality-init` injection in chart 1.2.0. The customer's fix (disabling federation) may have coincided with other changes or a chart version difference.
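To confirm this for a given chart version without touching the cluster, you can render the chart locally and search the output for the init container; the grep below is just one way to spot it:
```
# Render the chart locally (no cluster changes) and look for the injected init container.
# Replace '<consul-values>.yaml' with your values file and adjust --version as needed.
helm template consul hashicorp/consul \
  --version 1.2.0 \
  --namespace consul \
  --values <consul-values>.yaml | grep -n -B 2 -A 6 "locality-init"
```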
Preventive Measures
- Pre-upgrade Testing: Always test Helm chart upgrades in a non-production environment that mirrors production topology (including node labels); a minimal pre-upgrade check is sketched after this list
- Version Alignment: Ensure Consul binary version capabilities align with Helm chart features before upgrading
- Monitoring: Implement alerts for pods in CrashLoopBackOff state to catch issues quickly
- Documentation: Maintain clear documentation of your Consul and Helm chart versions across all environments
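As a minimal pre-upgrade sketch (assuming the release name `consul` and namespace `consul` used throughout this article), the commands below surface the three factors from the compatibility matrix above — current chart version, Consul image version, and node topology labels — and include a simple post-upgrade check for pods stuck in CrashLoopBackOff:
```
# Current chart version of the release
helm list -n consul

# Consul image currently running in the server pods
kubectl get pod consul-server-0 -n consul \
  -o jsonpath='{.spec.containers[?(@.name=="consul")].image}'; echo

# Node topology labels that would populate locality.json after an upgrade to chart 1.2.0+
kubectl get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone

# Simple check for pods stuck in CrashLoopBackOff after the upgrade
kubectl get pods -n consul | grep CrashLoopBackOff || echo "No pods in CrashLoopBackOff"
```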