The information contained in this article has been verified as up-to-date on the date of the original publication of the article. HashiCorp endeavors to keep this information up-to-date and correct, but it makes no representations or warranties of any kind, express or implied, about the ongoing completeness, accuracy, reliability, or suitability of the information provided.
All information contained in this article is for general information purposes only. Any reliance you place on such information as it applies to your use of your HashiCorp product is therefore strictly at your own risk.
Introduction
This article will guide you through debugging Consul Connect on Kubernetes when you encounter the error "403 ACL not found Injector Issue." This issue often arises when Consul Connect is not properly configured or when there are mismatches between ACL permissions and the injected sidecars.
We will cover the following topics in this article:
- Understanding the "403 ACL not found Injector Issue"
- Identifying the cause of the issue
- Troubleshooting steps to resolve the issue
By following the troubleshooting steps in this article, you can identify and resolve the cause of the "403 ACL not found Injector Issue" and ensure that Consul Connect is working as expected.
Problem
This document describes a scenario where application pods in a Consul Kubernetes environment fail to start, encountering a CrashLoopBackOff error. The issue stems from the consul-connect-inject-init container's inability to find registered services. This failure is further traced to the consul-connect-injector pod, which logs "403 ACL not found" errors when attempting to register the service and its health checks with Consul.
Key Error Indicators:
- consul-connect-inject-init logs: Repeated "Unable to find registered services" messages indicate that the init container cannot locate the necessary services for the application pod to start.
2023-07-12T05:22:33.744Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:34.748Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:35.754Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:36.759Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:37.764Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:38.768Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:39.789Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:40.797Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:41.814Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:42.818Z [INFO] Unable to find registered services; retrying
2023-07-12T05:22:42.818Z [INFO] Check to ensure a Kubernetes service has been created for this application. If your pod is not starting also check the connect-inject deployment logs.
- consul-connect-injector logs: "Unexpected response code: 403 (ACL not found)" errors show that Consul is rejecting the injector's API requests because the ACL token the injector presents cannot be found. As a result, the injector cannot register services and health checks.
2023-07-12T06:16:05.815Z INFO controller.endpoints registering service with Consul {"name": "<service>", "id": "<service>-6ddbc5d9b-lt69p-<service>", "agentIP": "1.2.3.4"}
2023-07-12T06:16:05.819Z ERROR controller.endpoints failed to register service {"name": "<service>", "error": "Unexpected response code: 403 (ACL not found)"}
...
2023-07-12T06:16:05.819Z ERROR controller.endpoints failed to register services or health check {"name": "<service>", "ns": "default", "error": "Unexpected response code: 403 (ACL not found)"}
...
2023-07-12T06:16:05.908Z ERROR controller.endpoints failed to get service instances {"name": "<service>", "error": "Unexpected response code: 403 (ACL not found)"}
...
2023-07-12T06:16:05.908Z ERROR controller.endpoints failed to deregister endpoints on all agents {"name": "<service>", "ns": "default", "error": "Unexpected response code: 403 (ACL not found)"}
These errors collectively point towards an ACL configuration problem preventing the successful registration of services, ultimately leading to application pod startup failures.
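To confirm that both symptoms are present, the two error signatures can be counted in the relevant logs. The following is a minimal sketch, not an official diagnostic tool: the function names are hypothetical, and the pod and namespace names in the usage comments are placeholders to replace with your own values.

```shell
# Count "Unable to find registered services" lines in log text read from stdin
# (the consul-connect-inject-init symptom).
count_init_retries() {
  # grep -c exits non-zero when nothing matches; "|| true" keeps the function successful.
  grep -cF 'Unable to find registered services' || true
}

# Count "403 (ACL not found)" lines in log text read from stdin
# (the consul-connect-injector symptom).
count_acl_not_found() {
  grep -cF '403 (ACL not found)' || true
}

# Example usage (placeholders in angle brackets):
#   kubectl -n <namespace> logs <app-pod> -c consul-connect-inject-init | count_init_retries
#   kubectl -n <namespace> logs deployment/consul-connect-injector | count_acl_not_found
```

A non-zero count from both functions is consistent with the scenario described in this article. A non-zero count from the first function only may point to a different cause, such as a missing Kubernetes service for the application.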
Cause
The "403 ACL not found" error in the consul-connect-injector logs may stem from an issue where the injector attempts to register services using an invalid or non-existent ACL token. This can be caused by a bug affecting consul-k8s releases before v0.49.5 ([0.49.x] remove livenessProbe from pods with preStop lifecycle hooks that delete ACL tokens #1914).
Here's how this bug impacts the service registration process:
- Service Registration Dependency: The consul-connect-inject-init container in application pods waits for two services, application and sidecar-proxy, to be registered with Consul before proceeding.
- Injector Responsibility: The consul-connect-injector pod is responsible for registering these services.
- Liveness Probe Failure and preStop Hook: If the consul-connect-injector's liveness probe fails, the container is restarted, which triggers a preStop hook that executes consul logout. This command inadvertently deletes the ACL token associated with the injector pod, and the token is not recreated.
- Invalid Token: Consequently, subsequent attempts by the injector to reconcile and register the application and sidecar-proxy services fail with a "403 ACL not found" error, as it is now using an invalidated token.
This chain of events ultimately prevents application pods from starting successfully, as the required services cannot be registered with Consul due to the ACL token deletion.
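Because the bug is triggered by a container restart after a liveness probe failure, the restart count of the injector container is a useful first check. The sketch below rests on assumptions: the function names are hypothetical, and the label selector component=connect-injector is assumed to match the injector pods in your installation, so adjust it if your pods are labeled differently.

```shell
# Print "<pod-name><TAB><restart-count>" for each injector pod.
# Assumption: injector pods carry the label component=connect-injector.
injector_restart_counts() {
  local namespace="$1"
  kubectl -n "$namespace" get pods -l component=connect-injector \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\n"}{end}'
}

# Succeed if any "<pod-name><TAB><restart-count>" line on stdin has a count above zero.
has_restarts() {
  awk -F'\t' '$2 > 0 { found = 1 } END { exit found ? 0 : 1 }'
}

# Example usage (namespace is a placeholder):
#   injector_restart_counts <namespace> | has_restarts && echo "injector has restarted"
```

A restart count above zero does not prove that this bug is the cause, but together with the "403 (ACL not found)" errors it is consistent with the token having been deleted by the preStop hook.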
Solution
Upgrade consul-k8s to v0.49.5 or later. The changelog entry for the bug fix reads:
control-plane: fix issue where consul-connect-injector acl token was unintentionally being deleted and not recreated when a container was restarted due to a livenessProbe failure. [GH-1914]
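To decide whether an upgrade is needed, the running injector image tag can be compared against v0.49.5. This is a sketch under stated assumptions: the function names are hypothetical, the image is assumed to be referenced by a version tag rather than a digest, and sort -V (available in GNU coreutils) is assumed to be installed. The comparison only mirrors the threshold stated in this article.

```shell
# Extract the version tag from an image reference such as
# hashicorp/consul-k8s-control-plane:0.49.4 and strip a leading "v" if present.
image_tag() {
  local tag="${1##*:}"
  printf '%s\n' "${tag#v}"
}

# Succeed if version $1 is strictly older than version $2.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example usage (namespace is a placeholder):
#   image=$(kubectl -n <namespace> get deployment consul-connect-injector \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')
#   if version_lt "$(image_tag "$image")" "0.49.5"; then
#     echo "Running a release before v0.49.5; upgrade recommended"
#   fi
```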
- If you are unable to upgrade consul-k8s, apply the following immediate workaround by restarting the consul-connect-injector deployment:
kubectl -n <namespace> rollout restart deployment consul-connect-injector
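It can help to wait for the restart to finish before checking the logs. The sketch below wraps the restart command above together with kubectl rollout status. The function name, the default deployment name, and the 120-second timeout are illustrative assumptions. Pass the actual deployment name as the second argument if yours differs.

```shell
# Restart the injector deployment and wait until the new pods are ready.
# Arguments: namespace, then optionally the deployment name.
restart_injector() {
  local namespace="$1"
  local deployment="${2:-consul-connect-injector}"
  kubectl -n "$namespace" rollout restart deployment "$deployment" &&
    kubectl -n "$namespace" rollout status deployment "$deployment" --timeout=120s
}

# Example usage (namespace is a placeholder):
#   restart_injector <namespace>
```

Because this is only a workaround, the token can be deleted again after a later liveness probe failure on releases before v0.49.5.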
Verification
- After the restart, the errors should no longer appear in the consul-connect-injector pod's logs. Instead, the logs should show successful registrations, as in the example below:
2023-07-12T07:19:51.456Z INFO controller.endpoints registering service with Consul {"name": "<service>", "id": "<service-id>", "agentIP": "1.2.3.4"}
2023-07-12T07:19:51.459Z INFO controller.endpoints registering proxy service with Consul {"name": "<service>-sidecar-proxy"}
- The application pods should start successfully, and the consul-connect-inject-init logs should show the following lines:
2023-07-12T07:19:28.965Z [INFO] Consul login complete
2023-07-12T07:19:28.966Z [INFO] Checking that the ACL token exists when reading it in the stale consistency mode
2023-07-12T07:19:28.969Z [INFO] Successfully read ACL token from the server
2023-07-12T07:19:28.972Z [INFO] Registered service has been detected: service=<service>-sidecar-proxy
2023-07-12T07:19:28.972Z [INFO] Registered service has been detected: service=<service>
2023-07-12T07:19:28.973Z [INFO] Connect initialization completed
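Both verification checks can be scripted against log text. This is a minimal sketch with hypothetical function names; the pod and namespace names in the usage comments are placeholders.

```shell
# Succeed if the consul-connect-inject-init log on stdin shows that initialization finished.
init_completed() {
  grep -qF 'Connect initialization completed'
}

# Succeed if the consul-connect-injector log on stdin contains no "403 (ACL not found)" lines.
no_acl_errors() {
  ! grep -qF '403 (ACL not found)'
}

# Example usage (placeholders in angle brackets):
#   kubectl -n <namespace> logs <app-pod> -c consul-connect-inject-init | init_completed \
#     && echo "init container completed"
#   kubectl -n <namespace> logs deployment/consul-connect-injector | no_acl_errors \
#     && echo "no ACL errors in injector logs"
```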
Additional Information
- CHANGELOG 0.49.5 (March 9, 2023)
- [0.49.x] remove livenessProbe from pods with preStop lifecycle hooks that delete ACL tokens #1914