Introduction
When Vault runs inside a GKE cluster, the HashiCorp guides on using Telegraf to export telemetry data to Splunk do not work as written. Limitations in GKE prevent Telegraf from accessing the metrics on the Vault node. Node-level metrics such as CPU and memory usage are exported, but the Vault telemetry metrics are not exported to Splunk.
Prerequisites
Telemetry enabled on your Vault nodes.
Verification that Vault is exporting metrics once telemetry has been enabled. To do this, send a signal to the Vault process, which dumps the current telemetry to the process's stderr. On Unix-style operating systems, run kill -USR1 $(pgrep vault); on Windows, the signal is BREAK.
Cause
Security restrictions within GKE prevent an external Telegraf instance, set up as outlined in the linked document above, from collecting the Vault telemetry metrics. Such an instance can still collect instance-level metrics such as CPU and memory usage. The restriction applies only to an external instance: running Telegraf as a sidecar allows you to export the metrics as expected.
Solution
Configure Vault's Stackdriver telemetry settings so that metrics are exported to Google Cloud Monitoring. From there, additional configuration can forward the metrics to Splunk or other applications.
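
As a rough illustration, a Stackdriver telemetry stanza in the Vault server configuration might look like the sketch below. The project ID, location, and namespace values are placeholders assumed for this example and are not taken from any real environment; replace them with the values for your own GCP project and cluster. The sketch also assumes that the service account used by the Vault pods is permitted to write metrics to Cloud Monitoring.

```hcl
# Minimal sketch of a Vault telemetry stanza for Google Cloud Monitoring (Stackdriver).
# All values are placeholders (assumptions), not settings from a real deployment.
telemetry {
  # GCP project that receives the metrics (placeholder).
  stackdriver_project_id = "my-gcp-project"

  # Region or zone of the monitored resource (placeholder).
  stackdriver_location = "us-east1-a"

  # Identifier that distinguishes this Vault cluster's metrics (placeholder).
  stackdriver_namespace = "vault-cluster-a"

  # Avoid prefixing every metric name with the hostname;
  # attach the hostname as a label instead.
  disable_hostname      = true
  enable_hostname_label = true
}
```

After the configuration change takes effect, the Vault metrics should appear in Google Cloud Monitoring under the configured project. Forwarding them on to Splunk is a separate step configured on the Google Cloud side.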