Introduction
When using Consul in a Kubernetes environment, the following message may appear in the logs:
failed to renew lease consul/consul-controller-lock: timed out waiting for the condition
Problem
The consul-connect-injector pod was in a running state but kept restarting. The consul-connect-injector logs show the following error:
2023-07-10T21:16:41.745Z INFO controller.endpoints Starting Controller {"reconciler group": "", "reconciler kind": "Endpoints"}
2023-07-10T21:16:51.746Z ERROR error retrieving resource lock consul/consul-controller-lock: Get "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/consul/leases/consul-controller-lock": context deadline exceeded
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1.1
/home/runner/go/pkg/mod/k8s.io/client-go@v0.22.2/tools/leaderelection/leaderelection.go:272
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:217
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:230
k8s.io/apimachinery/pkg/util/wait.poll
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:577
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:542
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntil
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:533
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1
/home/runner/go/pkg/mod/k8s.io/client-go@v0.22.2/tools/leaderelection/leaderelection.go:271
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.Until
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.22.2/pkg/util/wait/wait.go:90
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew
/home/runner/go/pkg/mod/k8s.io/client-go@v0.22.2/tools/leaderelection/leaderelection.go:268
k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
/home/runner/go/pkg/mod/k8s.io/client-go@v0.22.2/tools/leaderelection/leaderelection.go:212
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startLeaderElection.func3
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/manager/internal.go:681
2023-07-10T21:16:51.746Z INFO failed to renew lease consul/consul-controller-lock: timed out waiting for the condition
2023-07-10T21:16:51.746Z ERROR setup problem running manager {"error": "leader election lost"}
main.main
/home/runner/work/consul-k8s/consul-k8s/control-plane/main.go:17
runtime.main
/opt/hostedtoolcache/go/1.18.3/x64/src/runtime/proc.go:250
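The restart behavior and the log output above can be confirmed with kubectl. The commands below are a minimal sketch; they assume the Consul components run in the consul namespace (the namespace shown in the lease name above) and that the pod follows the consul-connect-injector naming used in this article:
# List pods and check the RESTARTS column for the connect-injector pod
kubectl get pods -n consul
# View the logs of the previously crashed container; substitute the actual pod name from the command above
kubectl logs -n consul <consul-connect-injector-pod-name> --previous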
Cause
- The following error message shows the consul-connect-injector pod communicating with the K8s API server to create a controller manager:
2023-07-10T21:16:51.746Z ERROR error retrieving resource lock consul/consul-controller-lock: Get "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/consul/leases/consul-controller-lock": context deadline exceeded
- Part of this process includes acquiring a leader lock so that there will be only one active controller manager.
- The error context deadline exceeded indicates that the request to the K8s API server timed out, which could be due to a network connectivity issue between the pod and the K8s API server.
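To confirm that leader election is the step that fails, the Lease object named in the error can be inspected directly. This is only a sketch; the lease name consul-controller-lock and the consul namespace are taken from the error message above:
# Show the lease, including its current holder and last renew time
kubectl get lease consul-controller-lock -n consul -o yaml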
Overview of possible solutions
- Check with your networking team to ensure that no firewalls or other security controls are blocking communication between the pod and the K8s API server (a basic connectivity check is sketched below)
- Check with your cloud platform (AWS, Azure, GCP) support team for further assistance
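As a first check before engaging either team, you can verify from inside the cluster whether the pod network reaches the K8s API server at all. The sketch below launches a temporary pod and calls the API server through the in-cluster kubernetes.default.svc address; the curlimages/curl image and the consul namespace are assumptions, not part of the original error output. Any HTTP status in the response (even 401 or 403) proves network reachability, while a timeout reproduces the context deadline exceeded symptom:
# Throwaway pod in the same namespace; prints only the HTTP status code returned by the API server
kubectl run api-check -n consul --rm -it --restart=Never --image=curlimages/curl -- \
  curl -sk --max-time 5 -o /dev/null -w "%{http_code}\n" https://kubernetes.default.svc:443/version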
Outcome
Once the issue is fixed, the consul-connect-injector pod should be up and running without any restarts.
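A quick way to verify this is to watch the pod and confirm that the RESTARTS count stops increasing (again assuming the consul namespace):
# Watch the connect-injector pod; the RESTARTS count should stay constant
kubectl get pods -n consul -w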