This article describes troubleshooting techniques for investigating situations where Vault unexpectedly begins returning "permission denied" in response to requests using Nomad server tokens.
Permission denied for sys/leases/renew
If your error message indicates permission denied to the sys/leases/renew path, you are likely running into the issue described in the Nomad Vault Integration documentation's Permission Denied errors section; please review that documentation for the solution it provides.
Other Permission Denied errors
In this scenario, your Nomad server and jobs have been running normally with the Vault integration, and then suddenly become unhealthy due to permission denied errors on various Vault paths.
This is often due to the Nomad server token expiring. You can confirm whether the Nomad server token has expired by running vault token lookup NOMAD_SERVER_TOKEN_HERE. If it returns "bad token", the token is no longer valid, which usually means it has expired.
The quickest way to recover is to generate a new Nomad server token and restart your Vault and Nomad servers. However, it's also important to understand the root cause of why your former Nomad server token expired.
The Nomad documentation on the Vault integration specifies that the Nomad server token should always be:
- An orphan
- A periodic token with no explicit max TTL
This ensures that the Nomad server token never expires, provided it keeps being renewed within each period.
The most common reason a Nomad server token suddenly expires is that it was not created as both an orphan and a periodic token.
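A quick way to check a newly issued token against these two requirements is to inspect its lookup data. The following is a minimal sketch, assuming you have saved the output of vault token lookup -format=json for the token and parsed it into a dictionary; the function name check_nomad_server_token is hypothetical, and the field names (orphan, period, explicit_max_ttl) are assumed to match what your Vault version returns under "data".

```python
def check_nomad_server_token(lookup):
    """Return a list of problems found in a parsed token lookup response.

    `lookup` is assumed to be the dict produced by json.loads() on the
    output of `vault token lookup -format=json <token>`.
    """
    data = lookup.get("data") or {}
    problems = []

    # Requirement 1: the token must be an orphan (no parent to expire under it).
    if not data.get("orphan", False):
        problems.append("token is not an orphan")

    # Requirement 2a: the token must be periodic (a non-zero period is set).
    if data.get("period") in (None, 0, "", "0", "0s"):
        problems.append("token is not periodic")

    # Requirement 2b: no explicit max TTL may cap the token's lifetime.
    # Zero-like values are treated as "not set" (an assumption about formats).
    if data.get("explicit_max_ttl") not in (None, 0, "", "0", "0s"):
        problems.append("token has an explicit max TTL")

    return problems
```

An empty list means the token satisfies both requirements; any entries describe which requirement is not met.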
Using Audit Logs for Investigation
If you know the token ID of the Nomad server token that was returning "permission denied" errors, you can calculate its hash using Vault's sys/audit-hash API.
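As a rough illustration, the request to Vault's sys/audit-hash endpoint could be built as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the audit device is assumed to be mounted at a path such as "file", and the response is assumed to carry the hash either at the top level or under "data".

```python
import json
import urllib.request


def build_audit_hash_request(vault_addr, vault_token, audit_device, token_to_hash):
    """Build a POST request for /v1/sys/audit-hash/<audit_device>.

    vault_token authenticates the call; token_to_hash is the Nomad server
    token whose audit-log hash you want.
    """
    url = f"{vault_addr.rstrip('/')}/v1/sys/audit-hash/{audit_device.strip('/')}"
    body = json.dumps({"input": token_to_hash}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"X-Vault-Token": vault_token, "Content-Type": "application/json"},
    )


def extract_hash(response_body):
    """Pull the hash string out of the JSON response body."""
    payload = json.loads(response_body)
    # Accept both a top-level "hash" and one nested under "data".
    return payload.get("hash") or (payload.get("data") or {}).get("hash")


# Example usage (requires a reachable Vault server; values are placeholders):
# req = build_audit_hash_request("http://127.0.0.1:8200", ADMIN_TOKEN, "file",
#                                NOMAD_SERVER_TOKEN)
# with urllib.request.urlopen(req) as resp:
#     print(extract_hash(resp.read()))
```

The hash is specific to the audit device it was computed for, so use the mount path of the device whose log you intend to search.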
Once you have the hash, you can search your audit logs for it to determine what happened to the token.
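One way to run such a search is sketched below. It assumes a file audit device writing one JSON object per line, with the hashed token appearing in the auth.client_token or request.client_token field; the function names are hypothetical.

```python
import json


def entries_for_token(lines, token_hash):
    """Yield parsed audit entries that were made with the hashed token."""
    for line in lines:
        # Cheap substring filter before paying for JSON parsing.
        if token_hash not in line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        auth = entry.get("auth") or {}
        request = entry.get("request") or {}
        if token_hash in (auth.get("client_token"), request.get("client_token")):
            yield entry


def summarize(entries):
    """Reduce each entry to (time, type, path, policies, error)."""
    return [
        (
            e.get("time"),
            e.get("type"),
            (e.get("request") or {}).get("path"),
            (e.get("auth") or {}).get("policies"),
            e.get("error"),
        )
        for e in entries
    ]


# Example usage (the log path is a placeholder):
# with open("/var/log/vault_audit.log") as f:
#     for row in summarize(entries_for_token(f, "hmac-sha256:...")):
#         print(row)
```

Each row shows when the token was used, which path it requested, and which policies Vault associated with it at that time.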
The audit log will show you the policies associated with the hashed token. If these policies do not match the policies you expect, you may have accidentally configured the wrong token as the Nomad server token. The next step would be to find the time when that token was created, to see whether it was created without being configured as an orphan or as a periodic token.
If another user specifically revoked the Nomad server token, the audit log will include that action. However, if the Nomad server token was naturally revoked via TTL or its parent token expiring, the audit log will not include the revocation (since it came from Vault rather than via a user interaction). You will, however, still see when the "permission denied" errors began in the audit log.
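Both checks described above can be sketched in a few lines. The example assumes JSON-lines audit entries written in chronological order, an "error" field containing "permission denied" on failed requests, and that an explicit revocation appears as a request to a path beginning with auth/token/revoke whose hashed "token" value sits under request.data. The function names are hypothetical, and a revocation made by accessor would not be matched by this sketch.

```python
import json


def parse_entries(lines):
    """Yield each line that parses as a JSON object; skip anything else."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict):
            yield entry


def first_permission_denied(lines, token_hash):
    """Return the time of the first 'permission denied' entry for the token."""
    for e in parse_entries(lines):
        request = e.get("request") or {}
        auth = e.get("auth") or {}
        uses_token = token_hash in (request.get("client_token"), auth.get("client_token"))
        if uses_token and "permission denied" in (e.get("error") or ""):
            return e.get("time")
    return None


def explicit_revocations(lines, token_hash):
    """Return (time, path) for revoke requests that target the token."""
    found = []
    for e in parse_entries(lines):
        if e.get("type") != "request":
            continue  # avoid counting the paired response entry
        request = e.get("request") or {}
        path = request.get("path") or ""
        if not path.startswith("auth/token/revoke"):
            continue
        data = request.get("data") or {}
        targets_token = data.get("token") == token_hash
        self_revoke = (
            path == "auth/token/revoke-self"
            and request.get("client_token") == token_hash
        )
        if targets_token or self_revoke:
            found.append((e.get("time"), path))
    return found
```

If explicit_revocations returns nothing while first_permission_denied returns a timestamp, that is consistent with the token having expired through its TTL or its parent token rather than through a user action.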