This article introduces ways to troubleshoot Vault auto snapshot issues.
- Vault Enterprise
- Vault Automated Snapshots
Overview of possible solutions
- First, list the existing automated snapshot configurations to see what is already set up:
vault list sys/storage/raft/snapshot-auto/config
If the list returns any entries, automated snapshot configurations already exist. They may not be working, or may be in a bad state. To check the status of a configuration, run the following command, replacing <key> with a name returned by the list above:
vault read sys/storage/raft/snapshot-auto/status/<key>
The output looks similar to the following:
vault read sys/storage/raft/snapshot-auto/status/gcs
Key                    Value
---                    -----
consecutive_errors     0
last_snapshot_end      2022-05-13T00:21:59Z
last_snapshot_error    n/a
last_snapshot_start    2022-05-13T00:21:58Z
last_snapshot_url      https://storage.googleapis.com/test-vault/testvault-snapshot-1652401318922871403.snap
next_snapshot_start    2022-05-13T00:41:59Z
snapshot_start         2022-05-13T00:21:58Z
snapshot_url           https://storage.googleapis.com/test-vault/testvault-snapshot-1652401318922871403.snap
The key fields to check are when the last snapshot ended (last_snapshot_end), where it is stored (last_snapshot_url), whether there was an error (last_snapshot_error and consecutive_errors), and when the next snapshot starts (next_snapshot_start).
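These checks can also be scripted. The following is a minimal sketch, assuming the status was fetched with `vault read -format=json sys/storage/raft/snapshot-auto/status/<key>` and that the fields sit under the `data` key of that JSON. The function name `find_status_problems` is hypothetical, and treating an empty value or `n/a` as "no error" is an assumption based on the sample output above.

```python
import json
from typing import List


def find_status_problems(status_json: str) -> List[str]:
    """Return human-readable problems found in an auto snapshot status document."""
    # Assumption: the status fields live under the top-level "data" key.
    data = json.loads(status_json).get("data") or {}
    problems = []

    # A non-zero count means recent snapshot attempts failed back to back.
    errors = int(data.get("consecutive_errors") or 0)
    if errors > 0:
        problems.append(f"{errors} consecutive snapshot error(s)")

    # Assumption: "n/a" or an empty value means no error was recorded.
    last_error = str(data.get("last_snapshot_error") or "").strip()
    if last_error and last_error.lower() != "n/a":
        problems.append(f"last snapshot error: {last_error}")

    # A configuration that never completed a snapshot has no end time or URL.
    if not data.get("last_snapshot_end"):
        problems.append("no completed snapshot recorded")
    if not data.get("last_snapshot_url"):
        problems.append("no snapshot URL recorded")

    return problems
```

An empty list means none of these checks fired. It does not prove that the snapshot file itself is usable.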
- If an existing snapshot configuration is failing and you want to delete and reconfigure it, run the following command to delete it:
vault delete sys/storage/raft/snapshot-auto/config/<key>
- The Vault operational logs at trace level also show snapshot activity, for example:
2022-05-13T00:19:58.752Z [TRACE] core.snapshotmgr.gcs: starting snapshot runner: name=gcs interval=20m0s storage_type=google-gcs last_snapshot_start="" next_scheduled_snapshot=2022-05-13T00:21:58Z
2022-05-13T00:21:58.752Z [INFO] core.snapshotmgr.gcs: taking auto snapshot
2022-05-13T00:21:58.922Z [INFO] storage.raft: starting snapshot up to: index=1357
2022-05-13T00:21:58.924Z [INFO] storage.raft: snapshot complete up to: index=1357
2022-05-13T00:21:59.143Z [DEBUG] core.snapshotmgr.gcs: snapshot complete: name=gcs elapsed=390.721446ms size=0
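To see how long each snapshot takes, the `elapsed` value can be pulled out of the `snapshot complete` lines. The sketch below assumes the log format matches the excerpt above and that durations are printed in Go's duration notation (for example `390.721446ms` or `20m0s`). The helper names `go_duration_seconds` and `snapshot_durations` are hypothetical.

```python
import re
from typing import List, Tuple

# Seconds per unit for Go-style duration strings such as "1m2.5s".
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")
# Matches only the snapshot manager's completion line, not storage.raft's.
_COMPLETE = re.compile(r"snapshot complete: name=(\S+) elapsed=(\S+)")


def go_duration_seconds(text: str) -> float:
    """Convert a Go-style duration string to seconds."""
    parts = list(_PART.finditer(text))
    # Reject input that contains anything other than number+unit pairs.
    if not parts or "".join(p.group(0) for p in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(p.group(1)) * _UNIT_SECONDS[p.group(2)] for p in parts)


def snapshot_durations(log_text: str) -> List[Tuple[str, float]]:
    """Return (configuration name, elapsed seconds) for each completed snapshot."""
    results = []
    for line in log_text.splitlines():
        match = _COMPLETE.search(line)
        if match:
            results.append((match.group(1), go_duration_seconds(match.group(2))))
    return results
```

Comparing these durations over time shows whether snapshots are getting slower as the data set grows.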
- Note that if Vault holds a large amount of data, pay close attention to the interval and retain values. Common misconfigurations include an interval that is too short, which causes the next snapshot to be scheduled before the previous one has had a chance to complete, and a retain value that is too large, which overfills the cloud storage.
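Both risks come down to simple arithmetic, sketched below. The function names and the safety factor of 2 are illustrative assumptions, not Vault settings. The snapshot duration and size must come from your own measurements.

```python
def interval_too_short(interval_seconds: float, elapsed_seconds: float,
                       safety_factor: float = 2.0) -> bool:
    """True if the interval leaves too little headroom for a snapshot to finish.

    safety_factor is an assumed margin: the interval should be at least this
    many times longer than the observed snapshot duration.
    """
    return interval_seconds < elapsed_seconds * safety_factor


def retained_bytes(retain: int, snapshot_bytes: int) -> int:
    """Approximate storage used once `retain` snapshots of this size are kept."""
    return retain * snapshot_bytes
```

For example, a 20 minute interval (1200 seconds) with a snapshot that takes under a second is comfortable, while a 60 second interval with a 45 second snapshot is not. Keeping 24 snapshots of 2 GiB each needs about 48 GiB of storage.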