Introduction
Vault telemetry supports a metrics prefix that is applied to all telemetry data Vault writes. When it is set, metric names start with the configured prefix instead of the default vault prefix. Alongside metrics_prefix, the prefix_filter parameter controls which metrics Vault publishes.
Prerequisites
This filtering feature was added in Vault 1.4. Within the telemetry stanza, the parameters discussed in this article are metrics_prefix, prefix_filter, and filter_default. When using prefix_filter, it is important to also set filter_default accordingly. In addition, when metrics_prefix is set, the values in prefix_filter must use that same prefix.
prefix_filter: A list of filter rules that allow or block metrics by prefix. Each rule is a metric prefix with a leading "+" or "-": a leading "+" enables any metrics with the given prefix, and a leading "-" blocks them. If two rules overlap, the more specific rule takes precedence. If the same prefix is listed multiple times, blocking takes priority.
metrics_prefix: The prefix used when writing all telemetry data. By default, this is set to "vault".
filter_default: Controls whether to allow metrics that are not matched by any filter rule. Defaults to true, which allows all metrics when no filters are provided. When set to false with no filters, no metrics are sent.
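The precedence rule for overlapping prefix_filter entries can be illustrated with a small stanza. This is a sketch only: it assumes the default vault prefix, and the metric names are chosen for illustration. With filter_default left at true, everything is published except the blocked prefixes, while the longer "+" rule re-enables a single metric underneath a blocked prefix.

```hcl
telemetry {
  # Metrics that match no rule are still published.
  filter_default = true

  prefix_filter = [
    # Block every metric under vault.expire and vault.audit ...
    "-vault.expire",
    "-vault.audit",
    # ... but this more specific rule wins for the one metric it names.
    "+vault.expire.num_irrevocable_leases"
  ]
}
```

Setting filter_default to false instead turns the list into an allowlist, so only metrics matched by a "+" rule would be published.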
Procedure
Vault provides rich operational telemetry metrics that monitoring and alerting solutions can consume to track key operational conditions, along with audit devices that log each Vault request and response. The telemetry stanza in the Vault configuration file controls which data is published to other systems.
Note that when prefix_filter is used as an allowlist, as in the example below, filter_default needs to be set to false; otherwise metrics that match no rule are still published. Also, if metrics_prefix is set, that prefix must be included in the prefix_filter entries as well. Below is an example of a telemetry stanza:
telemetry {
  prometheus_retention_time = "30s"
  disable_hostname = false
  filter_default = false
  metrics_prefix = "PREFIX-TEST"
  prefix_filter = [
    "+PREFIX-TEST.audit.log_request",
    "+PREFIX-TEST.audit.log_request_failure",
    "+PREFIX-TEST.audit.log_response",
    "+PREFIX-TEST.audit.log_response_failure"
  ]
}
Vault now publishes metrics with the PREFIX-TEST prefix, and only the metrics associated with auditing.
Example of a telemetry dump showing these changes:
[2022-08-05 15:50:20 -0400 EDT][C] 'PREFIX-TEST.audit.log_request_failure': Count: 1 Sum: 0.000 LastUpdated: 2022-08-05 15:50:25.774037 -0400 EDT m=+25.661173095
[2022-08-05 15:50:20 -0400 EDT][C] 'PREFIX-TEST.audit.log_response_failure': Count: 1 Sum: 0.000 LastUpdated: 2022-08-05 15:50:25.777195 -0400 EDT m=+25.664331736
[2022-08-05 15:50:20 -0400 EDT][S] 'PREFIX-TEST.audit.log_request': Count: 1 Sum: 7.650 LastUpdated: 2022-08-05 15:50:25.776444 -0400 EDT m=+25.663580938
[2022-08-05 15:50:20 -0400 EDT][S] 'PREFIX-TEST.audit.log_response': Count: 1 Sum: 0.726 LastUpdated: 2022-08-05 15:50:25.777232 -0400 EDT m=+25.664368804
[2022-08-05 15:50:30 -0400 EDT][C] 'PREFIX-TEST.audit.log_request_failure': Count: 2 Sum: 0.000 LastUpdated: 2022-08-05 15:50:35.765455 -0400 EDT m=+35.652379132
[2022-08-05 15:50:30 -0400 EDT][C] 'PREFIX-TEST.audit.log_response_failure': Count: 2 Sum: 0.000 LastUpdated: 2022-08-05 15:50:35.765566 -0400 EDT m=+35.652490573
[2022-08-05 15:50:30 -0400 EDT][S] 'PREFIX-TEST.audit.log_request': Count: 2 Min: 0.124 Mean: 0.288 Max: 0.453 Stddev: 0.233 Sum: 0.577 LastUpdated: 2022-08-05 15:50:35.765475 -0400 EDT m=+35.652399437
[2022-08-05 15:50:30 -0400 EDT][S] 'PREFIX-TEST.audit.log_response': Count: 2 Min: 0.074 Mean: 0.079 Max: 0.084 Stddev: 0.007 Sum: 0.158 LastUpdated: 2022-08-05 15:50:35.765574 -0400 EDT m=+35.652499079
Example of a telemetry dump using only prefix_filter with the default vault prefix:
[2022-08-05 16:00:50 -0400 EDT][C] 'vault.audit.log_request_failure': Count: 1 Sum: 0.000 LastUpdated: 2022-08-05 16:00:55.779619 -0400 EDT m=+10.713046541
[2022-08-05 16:00:50 -0400 EDT][C] 'vault.audit.log_response_failure': Count: 1 Sum: 0.000 LastUpdated: 2022-08-05 16:00:55.779794 -0400 EDT m=+10.713221629
[2022-08-05 16:00:50 -0400 EDT][S] 'vault.audit.log_request': Count: 1 Sum: 1.345 LastUpdated: 2022-08-05 16:00:55.779658 -0400 EDT m=+10.713085604
[2022-08-05 16:00:50 -0400 EDT][S] 'vault.audit.log_response': Count: 1 Sum: 0.123 LastUpdated: 2022-08-05 16:00:55.779813 -0400 EDT m=+10.713240497
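The configuration behind this second dump is not shown in the article. A stanza along the following lines would be expected to produce it; this is a sketch that assumes the same four audit rules as the earlier example, with metrics_prefix omitted so that the default vault prefix applies.

```hcl
telemetry {
  # Publish only the metrics explicitly allowed below.
  filter_default = false

  # No metrics_prefix is set, so the rules use the default "vault" prefix.
  prefix_filter = [
    "+vault.audit.log_request",
    "+vault.audit.log_request_failure",
    "+vault.audit.log_response",
    "+vault.audit.log_response_failure"
  ]
}
```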
Additional Information:
- Telemetry Stanza Configuration: https://developer.hashicorp.com/vault/docs/configuration/telemetry
- Telemetry Metrics Learn Guide: https://learn.hashicorp.com/tutorials/vault/telemetry-metrics-reference