Introduction
Problem
Vault rate limiting does not take effect until requests reach approximately 3x the specified rate.
Example Configuration:
{
"path": "<application namespace>",
"rate": 500,
"interval": "3600s",
"block_interval": "1800s"
}
When we apply a rate limit to a namespace (for instance a rate of 500, an interval of 60 minutes, and a block interval of 30 minutes) and then make API calls to that namespace, e.g. via the CLI to read PKI or KV secrets, or simply by logging in via the UI, the limit does not kick in at the rate we set.
In the example, we are able to make around 1500 API calls before any blocking happens; only then does the 429 rate limit exceeded message appear.
The same happens at low rates. For example, with a rate of 8 API calls over an interval of 10 minutes, the limit does not kick in until we have made about 27-30 API calls.
Cause
- Vault rate limiting applies to each node in the cluster individually (assuming you are using performance standby nodes). The effective cluster-wide limit is therefore rate * number of nodes if requests are distributed round robin, e.g. by a load balancer. With 3 nodes and perfect round-robin distribution, you are likely to start seeing limits at 3x the defined rate. Note: If you are using DNS round robin with your cluster, you will not see an even distribution. You may see a lopsided distribution such as 80/20, 60/40, or even 90/10 in some cases.
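The per-node behavior described above can be illustrated with a small simulation. This is a minimal sketch, not Vault's implementation: it assumes perfect round-robin distribution and one independent counter per node, and the function name `first_blocked_request` is hypothetical.

```python
from itertools import cycle


def first_blocked_request(rate: int, nodes: int) -> int:
    """Return the 1-based number of the first request that is rejected.

    Assumption: requests are spread perfectly round robin, and each node
    keeps its own counter that allows `rate` requests per interval.
    """
    counters = [0] * nodes
    for request_number, node in enumerate(cycle(range(nodes)), start=1):
        if counters[node] >= rate:
            # This node has already served its full quota for the interval.
            return request_number
        counters[node] += 1


print(first_blocked_request(500, 3))  # 1501
print(first_blocked_request(8, 3))    # 25
```

Under these idealized assumptions the first rejection arrives only after rate * nodes requests have succeeded. An uneven distribution across nodes shifts the point at which the first 429 appears.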
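Solution
- Adjust the configured rate to take the number of nodes into consideration, because the rate applies per node. For example, to enforce roughly 500 requests per interval across a 3-node cluster with even round-robin distribution, set the rate to about 500 / 3, i.e. roughly 167.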
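If the goal is a cluster-wide limit, one way to account for the per-node behavior is to divide the intended total by the number of nodes before writing the quota. The sketch below only builds the configuration payload and does not call Vault. The helper name `per_node_rate` is hypothetical, rounding down is an assumption chosen to stay at or under the target, and the calculation assumes traffic is distributed evenly across nodes.

```python
import json
import math


def per_node_rate(cluster_wide_target: float, nodes: int) -> int:
    """Split a cluster-wide target evenly across nodes, rounding down.

    Assumption: traffic is balanced evenly. The result is never below 1.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return max(1, math.floor(cluster_wide_target / nodes))


# Same fields as the example configuration, with the rate adjusted
# for a 3-node cluster and a cluster-wide target of 500.
payload = {
    "path": "<application namespace>",
    "rate": per_node_rate(500, 3),
    "interval": "3600s",
    "block_interval": "1800s",
}
print(json.dumps(payload, indent=2))
```

With 3 nodes and a target of 500 this yields a per-node rate of 166. If the load balancer distributes traffic unevenly, the effective cluster-wide limit will be lower than the target, because the busiest node reaches its quota first.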
Outcome
The 429 rate limit exceeded message begins to appear at the expected request count, e.g.:
URL: GET http://localhost:8200/v1/sys/seal-status
Code: 429. Errors:
* request path "sys/seal-status": rate limit quota exceeded