- Cluster architecture, including node configuration and sizing, how the nodes are connected, where they are located, load balancers, etc.
- Workload: expected peak transactions per second, broken down by auth method and secrets engine
- Todo: Examples of this
- Audit log configuration (i.e., output of $ vault audit list --detailed)
  $ vault audit list --detailed
  Path     Type    Description    Replication    Options
  ----     ----    -----------    -----------    -------
  file/    file    n/a            replicated     -local=true file_path=/var/log/vault/audit.log
- Disk layout on the nodes (i.e., output of $ df -hlT)
  $ df -hlT
  Filesystem                        Type      Size  Used Avail Use% Mounted on
  udev                              devtmpfs  5.9G     0  5.9G   0% /dev
  tmpfs                             tmpfs     1.2G  3.3M  1.2G   1% /run
  /dev/mapper/ubuntu--vg-ubuntu--lv ext4      915G  761G  108G  88% /
  /dev/sda3                         ext4      976M  303M  607M  34% /boot
  /dev/sdb2                         ext4      1.2T  989M  213G  88% /vault-data
  /dev/sdb1                         ext4      1.2T  100M  1.1T  11% /vault-audit-log
- IOPS from each node
- Latency between nodes
- For the period of slowdown/unresponsiveness
  - Disk metrics (i.e., iostat or dstat)
  - CPU
  - Memory
  - Vault metrics (non-exhaustive list):
    - vault_raft_commitTime
    - vault_raft_storage_put/get
    - vault_barrier_put
    - vault_barrier_get
    - vault_token_create
    - vault_token_creation
    - vault_token_createAccessor
    - vault_runtime_gc_pause_ns
    - vault_core_handle_request
    - vault_token_count_by_auth
    - vault_expire_num_leases
    - vault_expire_renew
    - vault_expire_revoke
    - vault_token_count
    - vault_token_count_by_ttl
    - vault_secret_kv_count
Cluster Architecture


Node Configuration
| | node-1 (voter) | node-2 (voter) | node-3 (voter) | node-4 (non-voter) | node-5 (non-voter) |
| --- | --- | --- | --- | --- | --- |
| CPU | 4x vCPU | 4x vCPU | 4x vCPU | 4x vCPU | 4x vCPU |
| Memory | 16GB | 16GB | 16GB | 16GB | 16GB |
| Primary Disk | 80GB GP3, 9,000 IOPS | 80GB GP3, 9,000 IOPS | 80GB GP3, 9,000 IOPS | 40GB GP3, 3,000 IOPS | 40GB GP3, 3,000 IOPS |
| Secondary Disk | 200GB GP3, 3,000 IOPS | 200GB GP3, 3,000 IOPS | 200GB GP3, 3,000 IOPS | None | None |
| Network | 10GB | 10GB | 10GB | 10GB | 10GB |
| etc... | | | | | |
Node to Node Latency
An ongoing ping test can show the latency and consistency of a network path. It is not a comprehensive monitor, but it will usually show spikes and drops if they are present. Letting ping run during periods of heavy activity can help find potential network issues.
$ ping 192.168.1.215
PING 192.168.1.215 (192.168.1.215) 56(84) bytes of data.
64 bytes from 192.168.1.215: icmp_seq=1 ttl=64 time=0.258 ms
64 bytes from 192.168.1.215: icmp_seq=2 ttl=64 time=0.176 ms
64 bytes from 192.168.1.215: icmp_seq=3 ttl=64 time=0.145 ms
64 bytes from 192.168.1.215: icmp_seq=4 ttl=64 time=0.203 ms
...
The simplest test beyond a basic ping is traceroute, which shows whether the network path contains extra or unknown hops. It is recommended that the nodes be close together, with latency not exceeding 8ms.
$ traceroute 192.168.1.215
traceroute to 192.168.1.215 (192.168.1.215), 64 hops max
1 192.168.1.215 0.216ms 0.121ms 0.175ms
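A long ping capture is easier to judge once it is reduced to a few numbers. The following is a minimal sketch of one way to do that, not an official tool: the function name `summarize_ping` is hypothetical, the default threshold is taken from the 8ms recommendation above, and the regular expression assumes Linux-style `time=... ms` lines like those in the ping output shown earlier.

```python
import re
import statistics

# Matches the round-trip time in lines such as "... icmp_seq=1 ttl=64 time=0.258 ms".
RTT_PATTERN = re.compile(r"time=([\d.]+) ms")


def summarize_ping(output: str, threshold_ms: float = 8.0) -> dict:
    """Summarize round-trip times found in captured ping output.

    threshold_ms defaults to the 8ms node-to-node recommendation.
    """
    rtts = [float(value) for value in RTT_PATTERN.findall(output)]
    if not rtts:
        raise ValueError("no round-trip times found in ping output")
    return {
        "samples": len(rtts),
        "min_ms": min(rtts),
        "avg_ms": statistics.mean(rtts),
        "max_ms": max(rtts),
        # Number of replies slower than the recommended ceiling.
        "over_threshold": sum(1 for rtt in rtts if rtt > threshold_ms),
    }


# The ping lines from the example above.
SAMPLE_PING = """\
64 bytes from 192.168.1.215: icmp_seq=1 ttl=64 time=0.258 ms
64 bytes from 192.168.1.215: icmp_seq=2 ttl=64 time=0.176 ms
64 bytes from 192.168.1.215: icmp_seq=3 ttl=64 time=0.145 ms
64 bytes from 192.168.1.215: icmp_seq=4 ttl=64 time=0.203 ms
"""

summary = summarize_ping(SAMPLE_PING)
print(summary)
```

A non-zero `over_threshold` count, or a `max_ms` far above `avg_ms`, points at the kind of spikes worth correlating with the period of slowdown.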
This document does not cover testing bandwidth between nodes. A popular utility for this is iperf (https://iperf.fr/).
A cluster will generally misbehave or fail to operate if required ports are not open, but checking connectivity from each node to all of its peers, in each direction, can still reveal one-off configuration issues. This can be done with netcat, which is usually an allowed utility on nodes where telnet might be prohibited.
$ nc -vz 192.168.1.215 8200
Connection to 192.168.1.215 8200 port [tcp/*] succeeded!
$ nc -vz 192.168.1.215 8201
Connection to 192.168.1.215 8201 port [tcp/*] succeeded!
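When there are several peers, the same check can be scripted. The sketch below is an illustrative equivalent of the `nc -vz` calls above using only the Python standard library; the helper names `port_is_open` and `check_peers` are hypothetical, and the default ports are simply the two used in the netcat example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_peers(peers, ports=(8200, 8201), timeout: float = 3.0) -> dict:
    """Check every (peer, port) pair and return a {(host, port): bool} map."""
    return {
        (host, port): port_is_open(host, port, timeout)
        for host in peers
        for port in ports
    }
```

Running `check_peers(["192.168.1.215"])` on each node in turn, with that node's peers in the list, covers both directions of every path. A `False` entry only says the TCP connection failed; it does not say whether a firewall, routing, or a stopped listener is the cause.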
Baseline IOPS
A note on AWS Storage: Many deployments and IaC templates configure clusters with GP2 disks with provisioned IOPS. Since the release and updated pricing of GP3 storage within AWS in early 2021, we recommend using GP3 disks. GP3 is cheaper and outperforms GP2 in nearly all use cases. It also provides a baseline of 3,000 IOPS at any size, removing the need to over-provision storage size to get IOPS on GP2 disks. In-place (hot) migrations from GP2 to GP3 can be performed. For more information, see this AWS document: https://aws.amazon.com/blogs/storage/migrate-your-amazon-ebs-volumes-from-gp2-to-gp3-and-save-up-to-20-on-costs/
$ sudo apt install fio
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test \
--filename=test --bs=4k --iodepth=64 --size=500MB --readwrite=randrw --rwmixread=10
read: IOPS=181, BW=725KiB/s (742kB/s)(50.3MiB/71049msec)
write: IOPS=1620, BW=6481KiB/s (6637kB/s)(450MiB/71049msec); 0 zone resets
iops : min= 408, max= 9304, avg=1620.62, stdev=1324.40, samples=142
cpu : usr=0.78%, sys=3.64%, ctx=32126, majf=0, minf=108
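To compare baselines across nodes, it can help to pull the headline IOPS figures out of the fio summary rather than reading them by hand. This is a rough sketch under stated assumptions: the helper names are hypothetical, the pattern only targets the `read:` and `write:` summary lines in the format shown above, and the only unit suffix it handles is `k` (for example `IOPS=12.3k`).

```python
import re

# Matches summary lines such as "read: IOPS=181, BW=725KiB/s ...".
IOPS_PATTERN = re.compile(r"^\s*(read|write)\s*:\s*IOPS=([\d.]+)(k?)", re.MULTILINE)


def parse_fio_iops(output: str) -> dict:
    """Return {"read": iops, "write": iops} from fio's summary lines."""
    result = {}
    for operation, value, suffix in IOPS_PATTERN.findall(output):
        result[operation] = float(value) * (1000 if suffix == "k" else 1)
    return result


def total_iops(iops: dict) -> float:
    """Combined read and write IOPS measured during the mixed workload."""
    return sum(iops.values())


# The two summary lines from the fio run above.
SAMPLE_FIO = """\
read: IOPS=181, BW=725KiB/s (742kB/s)(50.3MiB/71049msec)
write: IOPS=1620, BW=6481KiB/s (6637kB/s)(450MiB/71049msec); 0 zone resets
"""

measured = parse_fio_iops(SAMPLE_FIO)
print(measured, total_iops(measured))
```

The combined figure can then be set against what the disk is provisioned for, such as the 3,000 IOPS GP3 baseline mentioned in the note above.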
Disk Performance Considerations
Vault running with integrated storage is disk intensive. This is a shift in operation from Vault using Consul as backend storage, where Consul was more memory dependent.
While Vault has a Least Recently Used (LRU) cache for certain reads, random or unknown workloads can still depend heavily on disk performance for reads. High write workloads add to this disk load. This reiterates the need to understand the consumers and use cases of Vault in order to plan for how much disk activity a node will need. Absent a known workload, monitoring IOPS and read/write patterns once a cluster is deployed is highly recommended.
Audit Logging
If you follow best practices and use Vault's audit log to capture all requests and responses, the audit log's impact on disk I/O must be considered. If a Vault node cannot write to at least one of the configured audit devices, it will return an error and not service the request.
A typical audit log entry can be 1-3 KB, meaning a node servicing 10,000 requests an hour can write 10-30 MB of data in that hour. Log rotation and transfer to offline storage should be automated. Rotation is important so that the logs do not fill up the disk, and transferring the logs to external storage is required in case the node loses a disk or is compromised.
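The sizing arithmetic above is simple enough to parameterize for your own request rate. The sketch below is illustrative only: the function name `audit_log_volume_mb` is hypothetical, the 1-3 KB entry size is the estimate quoted above rather than a measured value, and decimal units are used (1 MB = 1,000 KB).

```python
def audit_log_volume_mb(requests_per_hour: float,
                        entry_kb_low: float = 1.0,
                        entry_kb_high: float = 3.0) -> tuple:
    """Estimate the low and high audit log volume in MB per hour.

    Assumes one audit entry of entry_kb_low..entry_kb_high KB per request.
    """
    low_mb = requests_per_hour * entry_kb_low / 1000
    high_mb = requests_per_hour * entry_kb_high / 1000
    return low_mb, high_mb


# 10,000 requests an hour, the figure used in the text.
hourly = audit_log_volume_mb(10_000)
print(hourly)  # (10.0, 30.0)

# Scaling the same estimate to a full day helps size the rotation schedule.
daily = tuple(mb * 24 for mb in hourly)
print(daily)  # (240.0, 720.0)
```

Comparing the daily figure with the free space on the audit log volume gives a first estimate of how often rotation has to run.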
Audit logging directly impacts the throughput of the node. Audit logging is currently (as of 1.8.x) a single-threaded operation, and very large workloads (> 10,000 requests per second) can overwhelm the underlying operating system's ability to keep up. Horizontal scaling with performance standby nodes can add capacity to your Vault cluster in this scenario.
In addition to the low-level limits on Vault's ability to write to the audit device, the storage device that holds the audit log must also have enough IOPS to support these writes. If you have a high-volume workload, i.e., over 1,000 requests per second, it is recommended to place the Vault integrated storage data path and the audit log (or your syslog location) on separate disks to reduce I/O contention.
Monitoring of the node's await and iowait is recommended:
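Tools such as iostat report both values directly. Where only a scripting runtime is available, iowait can be approximated from two samples of the aggregate `cpu` line in `/proc/stat`. The sketch below makes several assumptions: it is Linux-only, the helper names are hypothetical, it relies on the standard field order (user, nice, system, idle, iowait, irq, softirq, steal), and it covers CPU iowait only, not per-device await.

```python
import time


def parse_cpu_times(stat_text: str) -> list:
    """Return the counters from the aggregate "cpu" line of /proc/stat."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            return [int(field) for field in line.split()[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before: list, after: list) -> float:
    """Share of CPU time spent in iowait between two samples, in percent."""
    deltas = [b - a for a, b in zip(before, after)]
    # First eight fields: user, nice, system, idle, iowait, irq, softirq, steal.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval_s: float = 5.0) -> float:
    """Read /proc/stat twice, interval_s apart, and return iowait percent."""
    with open("/proc/stat") as stat_file:
        before = parse_cpu_times(stat_file.read())
    time.sleep(interval_s)
    with open("/proc/stat") as stat_file:
        after = parse_cpu_times(stat_file.read())
    return iowait_percent(before, after)
```

Calling `sample_iowait()` in a loop during a slowdown and logging the result alongside request rates makes it easier to see whether stalls line up with disk pressure.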
As well as Vault's telemetry for integrated storage performance:
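One possible way to collect these values is to read Vault's metrics endpoint and keep only the integrated storage series. The sketch below is hedged accordingly: it assumes Prometheus-format metrics are enabled in the telemetry configuration and that the token is allowed to read `/v1/sys/metrics`; the helper names and the `vault_raft` prefix filter (chosen to match metrics such as vault_raft_commitTime listed earlier) are illustrative; and the parser is deliberately simplified and assumes sample lines carry no trailing timestamp.

```python
import urllib.request


def fetch_prometheus_metrics(vault_addr: str, token: str) -> str:
    """Fetch metrics in Prometheus text format from a Vault node."""
    request = urllib.request.Request(
        f"{vault_addr}/v1/sys/metrics?format=prometheus",
        headers={"X-Vault-Token": token},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def filter_metrics(text: str, prefix: str = "vault_raft") -> list:
    """Return (series, value) pairs whose metric name starts with prefix."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{")[0]
        if name.startswith(prefix):
            samples.append((series, float(value)))
    return samples


# Made-up values in Prometheus text format, for illustration only.
EXAMPLE_TEXT = """\
# TYPE vault_raft_commitTime summary
vault_raft_commitTime{quantile="0.5"} 2.5
vault_raft_commitTime_count 10
vault_barrier_put_count 4
"""

raft_samples = filter_metrics(EXAMPLE_TEXT)
print(raft_samples)
```

Against a live node, `filter_metrics(fetch_prometheus_metrics(addr, token))` would return the same structure, where `addr` and `token` are placeholders for your own Vault address and token.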