Introduction
The ACL system is a subsystem running on Consul servers that authenticates requests and authorizes access to Consul resources. It works by evaluating the ACL tokens provided by agents and services to determine whether a request has permission to interact with the requested resource.
This article walks you through the general troubleshooting steps for ACL-related issues.
Prerequisites
- Before troubleshooting, review the Consul release notes and the consul-k8s release notes to see whether any ACL bug fixes have been released for your version, and upgrade Consul or the Consul Helm chart to the latest patch release accordingly.
Table of Contents
- Identify the Error Message
- Identify the source of the error log
- Identify ACL token type
- Extract the token
- Troubleshoot ACL error message
- Next steps
- Additional Information
Instructions
Identify the Error Message
- Capture the error and confirm that it is an ACL-related issue. Possible ACL error message keywords include:
  - ACL not found
  - Cannot resolve root ACL
  - ACL support disabled
  - Permission denied
  - Invalid Parent
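One way to confirm that a captured log contains an ACL-related error is to search it for the keywords listed above. The following is a minimal sketch, not an official tool: the function name `find_acl_errors` is hypothetical, and the pattern covers only the keywords in this article.

```shell
#!/bin/sh
# Print the log lines (with line numbers) that contain one of the ACL error
# keywords listed in this article.
# $1: path to a captured log file
find_acl_errors() {
  # -i: ignore case, -n: prefix matches with line numbers, -E: extended regex
  grep -inE 'ACL not found|Cannot resolve root ACL|ACL support disabled|Permission denied|Invalid Parent' "$1"
}
```

For example, `find_acl_errors consul-client.log` (a hypothetical file name) prints every matching line. The function returns a non-zero status when nothing matches.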
Identify the source of the error log
- Is it in an external application?
- Is it in the Consul client log?
- Is it in the Consul server log?
The table below lists the request flows in which ACLs are evaluated:

| Source | Destination | Example | Log Location |
| --- | --- | --- | --- |
| Application | Consul Client | Vault gets KV pair from Consul over HTTP API | Vault server log |
| Application | Consul Server | Pods in Consul Kubernetes such as connect-injector | Pod log |
| Consul Client | Consul Server | Consul client performs service sync, retrieves KV data | Consul Client log |
| Consul Server | Consul Server | Consul Server Follower forwards call to Leader | Consul Server log |
Identify ACL token type
- Attempt to identify the type of ACL token used by looking at the error message's content
- Refer to the table below for more information.
| Token Type | Purpose | Set by |
| --- | --- | --- |
| Explicit token | The token that is provided explicitly to API requests | X-Consul-Token header or ?token= query parameter. For Kubernetes, the token is generated with Auth Method login |
| Default token | The token that the agent uses for both internal agent operations and operations initiated by the HTTP and DNS interfaces when no specific token is provided | Agent config |
| Agent token | Used for the agent's internal operations, e.g. Catalog API, anti-entropy sync, retrieving the leaf cert. Set to the default token if not configured | Agent config |
| Anonymous token | The built-in token that is used when no other token is set | Not required |
| Others | Other special-purpose tokens such as agent_recovery, initial_management, replication | Agent config |

- Refer to the following flowchart for the precedence of ACL token type:
- Below are some example ACL error logs:

| Error | Location | ACL token type |
| --- | --- | --- |
| [WARN] agent: Coordinate update blocked by ACLs: accessorID="anonymous token" | Consul Server | Anonymous token |
| [ERROR] agent.client: RPC failed to server: method=Coordinate.Update server=<ip>:8300 error="rpc error making call: rpc error making call: ACL not found" | Consul Client | Agent token |
| [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: ACL not found" | Consul Client | Agent token |
| [ERROR] Unable to read ACL token; retrying: err="Unexpected response code: 403 (ACL not found)" | Application (pod mesh-gateway) | Explicit token |
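The mapping in the example table can be sketched as a small shell function that guesses the token type from a single log line. This is only a heuristic built from the four examples above. The function name `guess_token_type` is hypothetical, and a real log line may involve a different token than the one guessed here (for instance, an `agent.client` request made with the default token).

```shell
#!/bin/sh
# Guess the ACL token type behind one log line, using only the patterns that
# appear in the example table. Anything else is reported as "unknown".
# $1: a single log line
guess_token_type() {
  case "$1" in
    *'accessorID="anonymous token"'*)         echo "Anonymous token" ;;
    *agent.client:*|*agent.anti_entropy:*)    echo "Agent token" ;;
    *'Unexpected response code: 403'*)        echo "Explicit token" ;;
    *)                                        echo "unknown" ;;
  esac
}
```

Treat the result as a starting point for the next step, not as a definitive answer.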
Extract the token
- Once we have confirmed the location of the error log and the type of ACL token, we can extract the token for the next troubleshooting step. The location of the ACL token depends on the token type and deployment method. The following locations can be checked to retrieve the ACL token:
  - Agent config file
  - Under <data_dir>/acl-tokens.json when token persistence is enabled. This file is used only when tokens are set with the set-agent-token command
  - Vault consul config
  - Nomad consul config
  - Environment variable CONSUL_HTTP_TOKEN
  - Environment variable CONSUL_HTTP_TOKEN_FILE
  - For Kubernetes, describe the pod with kubectl describe pod <pod> and look for the above information
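On a host where the Consul agent or CLI runs, a few of the locations above can be checked quickly from the shell. The sketch below only reports which sources are present and never prints a secret. The function name `list_token_sources` and its `data_dir` argument are hypothetical. It does not inspect the agent, Vault, or Nomad config files, which still need to be reviewed by hand.

```shell
#!/bin/sh
# Report which token locations exist for the current shell session.
# $1: the agent's data_dir (pass the value from your own agent config)
list_token_sources() {
  data_dir="$1"
  if [ -n "${CONSUL_HTTP_TOKEN:-}" ]; then
    echo "env:CONSUL_HTTP_TOKEN"
  fi
  if [ -n "${CONSUL_HTTP_TOKEN_FILE:-}" ] && [ -r "$CONSUL_HTTP_TOKEN_FILE" ]; then
    echo "file:$CONSUL_HTTP_TOKEN_FILE"
  fi
  # Present only when token persistence is enabled and set-agent-token was used.
  if [ -n "$data_dir" ] && [ -r "$data_dir/acl-tokens.json" ]; then
    echo "persisted:$data_dir/acl-tokens.json"
  fi
}
```

For example, `list_token_sources /opt/consul/data` (a hypothetical path) prints one line per source found, or nothing if none of them is set.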
Troubleshoot ACL error message
- Depending on the ACL error reported, troubleshoot as follows:
ACL not found
The Consul server is stating that the provided ACL token is not known. Possible reasons include:
- Incorrect token provided
- The token has not been replicated to the Consul server that the application is talking to
- The token was destroyed with /v1/acl/logout
To validate the correctness of the token, run the following commands from a Consul server with a privileged ACL token:
    export CONSUL_HTTP_TOKEN=<privileged_token>
    # If the error message includes an accessor ID:
    consul acl token read --accessor-id <accessor_id>
    # Otherwise, search the token list by secret ID:
    consul acl token list -format json | jq '.[] | select(.SecretID == "<token>")'
If the token is not found, generate the required token and update the configuration accordingly.
Depending on the Consul config, one possible way to update the token is with this command:
consul acl set-agent-token <default|agent|recovery|replication> <new_token>
For Consul deployed in Kubernetes, try deleting the pod for the token to be regenerated.
Replication issues
There are 2 possible ACL replication processes:
- Raft replication from the Consul server leader to the followers when a new token is created
- Replication of ACL tokens across multiple datacenters
For ACL replication across multiple datacenters, call the following HTTP API in the secondary datacenter to check the replication status:
    curl 'http://localhost:8500/v1/acl/replication?pretty'
    {
      "Enabled": true,
      "Running": true,
      "SourceDatacenter": "primary_dc",
      "ReplicationType": "tokens",
      "ReplicatedIndex": 19,
      "ReplicatedTokenIndex": 22,
      "LastSuccess": "2019-05-09T18:54:09Z",
      "LastError": "0001-01-01T00:00:00Z"
    }
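The replication status response can be summarized with a short shell function. The sketch below treats replication as unhealthy when it is not enabled or not running, or when LastError is more recent than LastSuccess. The function names `json_field` and `check_acl_replication` are hypothetical. The parsing assumes pretty-printed output with one field per line, as in the sample response, and timestamps of equal precision. For compact JSON, jq is the better tool.

```shell
#!/bin/sh
# Extract one top-level scalar field from pretty-printed JSON.
# $1: field name, $2: JSON text with one field per line
json_field() {
  printf '%s\n' "$2" | sed -n 's/^ *"'"$1"'": *"\{0,1\}\([^",]*\).*$/\1/p' | head -n 1
}

# Read the /v1/acl/replication response on stdin and print a one-line verdict.
# Returns 0 when replication looks healthy and 1 otherwise.
check_acl_replication() {
  body=$(cat)
  enabled=$(json_field Enabled "$body")
  running=$(json_field Running "$body")
  last_success=$(json_field LastSuccess "$body")
  last_error=$(json_field LastError "$body")

  if [ "$enabled" != "true" ] || [ "$running" != "true" ]; then
    echo "not running (Enabled=$enabled Running=$running)"
    return 1
  fi
  # UTC timestamps in this format sort lexicographically, so the later of the
  # two is the last line of a byte-wise sort.
  latest=$(printf '%s\n%s\n' "$last_success" "$last_error" | LC_ALL=C sort | tail -n 1)
  if [ "$last_error" != "$last_success" ] && [ "$latest" = "$last_error" ]; then
    echo "failing (LastError=$last_error is newer than LastSuccess=$last_success)"
    return 1
  fi
  echo "healthy (LastSuccess=$last_success)"
}
```

A possible invocation is `curl -s 'http://localhost:8500/v1/acl/replication?pretty' | check_acl_replication`, run in the secondary datacenter.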
Permission denied
The Consul server is stating that the provided ACL token is known to the server, but does not have the required permission to access the resource.
Proceed by validating that all policies associated with the token are correct.
    consul acl token read --accessor-id <accessor_id> --format json | jq '.Policies[].Name'
    consul acl policy read -name <policy_name> -format json | jq -r '.Rules'
If the policy provided is incorrect, update the policy with the correct permissions.
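When reproducing a failing request by hand against the HTTP API, the status code and response body indicate which of the branches above applies. The following sketch maps them to a next step. The function name `triage_acl_response` is hypothetical, and it assumes that ACL rejections arrive as HTTP 403 with the error keyword in the body, as in the `403 (ACL not found)` example earlier in this article.

```shell
#!/bin/sh
# Map an HTTP status code and response body from a Consul API call to the
# troubleshooting branch described in this article.
# $1: HTTP status code, $2: response body text
triage_acl_response() {
  status="$1"
  body="$2"
  if [ "$status" != "403" ]; then
    echo "not an ACL rejection (HTTP $status)"
    return 0
  fi
  case "$body" in
    *"ACL not found"*)     echo "token unknown to server: verify the token and replication" ;;
    *"Permission denied"*) echo "token known but lacks permission: review its policies" ;;
    *)                     echo "other ACL error: $body" ;;
  esac
}
```

The inputs can be captured with curl, for example by sending the token in the X-Consul-Token header and using `-w '%{http_code}'` to print the status code.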
Next steps
If the issue persists after following the above steps, please contact HashiCorp support and provide us with all of the troubleshooting data you have collected, as well as the following additional information:
- Consul Client & Consul Server version
- Consul Helm Chart version (if any)
- ACL configuration block