Vault operational logs provide visibility into lease revocations, errors, and other system activity. When these logs are stored in JSON format, they can be queried with jq, combined with standard Unix utilities such as grep, cut, sort, uniq, and awk.
The examples below demonstrate practical ways to extract trends in lease revocations and error patterns, which can help when troubleshooting performance issues, identifying peak usage times, or detecting anomalies in Vault operations.
The examples assume Vault operational logs with the following structure, one JSON object per line, saved in the file vault-output.log:
{"@level":"error","@message":"lease renewal failed","@module":"secrets.system.system_abcdef12","@timestamp":"2025-08-01T00:00:00.000000Z","error":"lease expired","lease_id":"database/example/db2/generic-services/creds/generic-user/XXXXXXXXXXXXXXXXXXXXXXXX"}
{"@level":"error","@message":"failed to create token","@module":"core","@timestamp":"2025-08-01T00:00:00.000000Z","error":"failed to persist accessor index entry: context canceled"}
{"@level":"info","@message":"revoked lease","@module":"expiration","@timestamp":"2025-08-01T00:00:00.000000Z","lease_id":"aws/example/project/generic-project-env/sts/generic-deployer/XXXXXXXXXXXXXXXXXXXXXXXX"}
To understand Vault's lease management over time, you can count revoked leases per day. This is useful for spotting trends in workload, such as increased activity on certain days or after deployments:
// DAILY BREAKDOWN OF REVOKED LEASES
➜ grep "revoked lease" vault-output.log | jq -r '.["@timestamp"]' | cut -d'T' -f1 | sort | uniq -c | awk '{print $2, $1}'
For finer granularity, zoom into a single day to see hourly patterns. This helps pinpoint intra-day spikes:
// HOURLY BREAKDOWN OF REVOKED LEASES FOR A SPECIFIC DAY
➜ grep "revoked lease" vault-output.log | jq -r '.["@timestamp"]' | grep '^2025-08-01' | cut -d'T' -f2 | cut -d':' -f1 | sort | uniq -c | awk '{print "2025-08-01T" $2 ":00:00", $1}'
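The daily and hourly breakdowns differ only in how much of the timestamp is kept, so the counting step can be factored into one helper. This is a sketch under the assumption that timestamps arrive on stdin in the ISO 8601 form shown in the sample log; the name count_by_prefix is illustrative:

```shell
# Count ISO 8601 timestamps read from stdin, grouped by their first N characters.
# N=10 groups by day (YYYY-MM-DD); N=13 groups by hour (YYYY-MM-DDTHH).
# The function name is hypothetical, not part of any tool.
count_by_prefix() {
  cut -c "1-$1" | sort | uniq -c | awk '{print $2, $1}'
}
```

It slots in after the jq step of the pipelines above, for example: grep "revoked lease" vault-output.log | jq -r '.["@timestamp"]' | count_by_prefix 13. Unlike the single-day variant, this reports every hour of every day present in the log.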
The same pattern gives a daily breakdown of a specific error, here 'context canceled':
// DAILY BREAKDOWN OF 'CONTEXT CANCELED' ERRORS
➜ grep "context canceled" vault-output.log | jq -r '.["@timestamp"]' | cut -d'T' -f1 | sort | uniq -c | awk '{print $2, $1}'
Drill down to the hourly error count on a specific day to isolate when issues peak:
// HOURLY BREAKDOWN OF 'CONTEXT CANCELED' ERRORS FOR A SPECIFIC DAY
➜ grep 'context canceled' vault-output.log | jq -r '.["@timestamp"]' | grep '^2025-08-01' | cut -d'T' -f2 | cut -d':' -f1 | sort | uniq -c | awk '{print "2025-08-01T" $2 ":00:00", $1}'
For additional insights, segment a high-error hour into 15-minute buckets. This is particularly helpful for tracing short-lived spikes, such as those from concurrent requests overwhelming Vault:
// BREAKING DOWN AN HOUR INTO 15-MINUTE INTERVALS FOR 'CONTEXT CANCELED' ERRORS
➜ grep "context canceled" vault-output.log | jq -r '.["@timestamp"]' | grep '^2025-08-01T10' | cut -d'T' -f2 | cut -d':' -f2 | awk '{bucket = int($1 / 15) * 15; print bucket}' | sort | uniq -c | awk '{print "2025-08-01T10:" sprintf("%02d", $2) ":00", $1}'
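The bucketing arithmetic rounds each minute down to the nearest multiple of the bucket width, so minute 37 with a width of 15 falls into the bucket starting at 30. A minimal sketch of that step as a reusable helper, assuming timestamps on stdin in the format of the sample log; the name bucket_by_minutes and its width argument are illustrative:

```shell
# Group timestamps from stdin into fixed-width minute buckets within each hour.
# The optional argument is the bucket width in minutes (15 if omitted).
# The function name is hypothetical, not part of any tool.
bucket_by_minutes() {
  # Splitting on "T" and ":" yields: $1 = date, $2 = hour, $3 = minute.
  awk -F'[T:]' -v width="${1:-15}" '
    {
      bucket = int($3 / width) * width   # round the minute down to the bucket start
      key = sprintf("%sT%s:%02d:00", $1, $2, bucket)
      count[key]++
    }
    END { for (k in count) print k, count[k] }
  ' | sort
}
```

Used after the jq step, for example grep "context canceled" vault-output.log | jq -r '.["@timestamp"]' | grep '^2025-08-01T10' | bucket_by_minutes 15, it yields the same kind of breakdown, and a different width such as 5 gives finer buckets without rewriting the pipeline.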
To reveal the most common failure modes associated with a particular error during a suspect period, group the matching entries by message:
// TOP ERRORS ASSOCIATED WITH 'CONTEXT CANCELED' FROM A SPECIFIC TIME PERIOD
➜ grep "context canceled" vault-output.log | jq -r 'select(.["@timestamp"] | startswith("2025-08-01T10")) | .["@message"]' | sort | uniq -c | sort -nr
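The grep step matches the raw line text, so it would also pick up entries that mention the phrase in any other field. One possible variant filters on the parsed fields inside jq instead. This sketch assumes the text of interest lives in the error field, as in the sample entries; the function name and argument order are illustrative:

```shell
# Rank the messages of entries whose "error" field contains a given substring
# and whose timestamp starts with a given prefix.
# Arguments: LOGFILE SUBSTRING TIMESTAMP_PREFIX. The function name is hypothetical.
top_messages_for_error() {
  jq -r --arg needle "$2" --arg prefix "$3" '
    select((.error // "") | contains($needle))        # entries without "error" are skipped
    | select(.["@timestamp"] | startswith($prefix))
    | .["@message"]
  ' "$1" | sort | uniq -c | sort -nr
}
```

For the period above, the call would be top_messages_for_error vault-output.log "context canceled" "2025-08-01T10".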
These techniques help operators monitor Vault environments proactively and narrow down when problems occur. For more on Vault logging, refer to the official HashiCorp documentation:
https://developer.hashicorp.com/vault/tutorials/monitoring/troubleshooting-vault#vault-server-logs