The information contained in this article has been verified as up-to-date on the date of the original publication of the article. HashiCorp endeavors to keep this information up-to-date and correct, but it makes no representations or warranties of any kind, express or implied, about the ongoing completeness, accuracy, reliability, or suitability of the information provided.
All information contained in this article is for general information purposes only. Any reliance you place on such information as it applies to your use of your HashiCorp product is therefore strictly at your own risk.
Introduction
Managing storage effectively is crucial when running Consul, especially in virtualized environments where resources are often constrained. One common challenge arises when creating Consul snapshots. Consul uses the machine's temporary directory (typically /tmp on Linux) to construct these snapshots. However, to reduce VM costs, administrators may limit the size of the /tmp directory. This becomes a problem if the allocated space (for example, 2 GB) is too small to hold a large snapshot, such as one generated from a 4 GB Raft database.
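To make the sizing concern concrete, the following sketch compares the free space on the filesystem backing the temporary directory with the size of the Raft database file. The helper names and the /opt/consul data directory are assumptions for illustration, and the on-disk size of raft.db is only a rough proxy for the size of a snapshot, not an exact figure.

```shell
# Hypothetical helpers; the paths in the example are assumptions.

# Free space (KiB) on the filesystem that holds the given directory.
free_kib() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Size (KiB) of a file, as reported by du.
file_kib() {
  du -k "$1" | awk '{print $1}'
}

# Succeeds (exit 0) only when the available space is at least the needed space.
has_room() {
  local needed_kib=$1 available_kib=$2
  [ "$available_kib" -ge "$needed_kib" ]
}

# Example (not run automatically):
#   has_room "$(file_kib /opt/consul/raft/raft.db)" "$(free_kib "${TMPDIR:-/tmp}")" \
#     || echo "temp directory may be too small for a snapshot"
```

With the figures from the introduction, `has_room 4194304 2097152` (4 GB needed, 2 GB available, both in KiB) fails, which is the situation this article addresses.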
Background
How Consul uses Go to specify its Raft snapshot temp directory
- The default location can vary depending on the operating system, but it is typically /tmp. The Go documentation for os.TempDir describes the default locations in more detail.
- To change this location, set the TMPDIR environment variable for the Consul server processes. Setting the variable for the CLI client that runs the snapshot save has no effect; it must be set in the context of the server process. If you use systemd to manage your Consul server processes, adding Environment=TMPDIR=/path/to/dir to your Consul unit file will work.
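Because the variable must be visible to the server process rather than the CLI, it can help to inspect the environment of the running process. The sketch below applies the Unix rule documented for os.TempDir ($TMPDIR if set and non-empty, otherwise /tmp) to a NUL-separated environment block such as /proc/<pid>/environ on Linux. The function name and the use of pidof are illustrative assumptions.

```shell
# Reads a NUL-separated environment block on stdin and prints the temp
# directory a Go program on Unix would use: $TMPDIR if non-empty, else /tmp.
effective_tmpdir() {
  local value
  value=$(tr '\0' '\n' | sed -n 's/^TMPDIR=//p' | head -n 1)
  echo "${value:-/tmp}"
}

# Example (not run automatically; assumes Linux and a single consul process):
#   sudo cat "/proc/$(pidof consul)/environ" | effective_tmpdir
```

If this prints /tmp after you have set TMPDIR, the variable most likely reached only your shell or the CLI, not the server process.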
Applicability
- The temp directory is full, and routine Consul Raft snapshot backups are failing with:
Error saving snapshot: Unexpected response code: 500 (failed to create snapshot file: open /tmp/snapshot3662158544: disk full
Expected Outcome
By ensuring that Consul has sufficient storage capacity for the snapshot, users can expect the following outcomes:
- Successful snapshot creation: Snapshots will complete without encountering disk space errors, allowing for consistent backups and disaster recovery.
- Reliable service operation: Preventing snapshot failures helps maintain the stability and availability of Consul services.
- Improved operational efficiency: Eliminating snapshot-related issues reduces troubleshooting time and administrative overhead.
Prerequisites
System-level admin access is required to perform a rolling restart of the Consul server agents.
WARNING: This operation requires a rolling restart of the Consul server agents, which triggers a leadership change when the existing leader is restarted. Ensure that you have engaged your organization's change management process, if one applies.
Procedure
Consul uses the TMPDIR environment variable to determine the directory for temporary files, including snapshot backups. To change the snapshot storage location, update this variable within the Consul process environment.
- Determine the operating system's default TMPDIR setting:
  dirname $(mktemp -u -t tmp.XXXXXXXXXX)
- Determine Consul's systemd unit environment settings:
  sudo systemctl show consul | grep ^Environment
Managing Consul Snapshots with systemd
Consul's systemd service uses an environment file (/etc/consul.d/consul.env) to set runtime variables. We can leverage this to change the snapshot directory.
- Add the TMPDIR variable:
  - Append the following line, replacing /path/to/your/directory with your desired snapshot location (ensure the directory exists and that the Consul service user can write to it):
    echo 'TMPDIR=/path/to/your/directory' | sudo tee -a /etc/consul.d/consul.env
  - If you edit the file manually instead, save it.
- Reload the systemd daemon:
  sudo systemctl daemon-reload
  - This makes systemd re-read its unit configuration. It is required if you changed the unit file itself; the environment file is read when the service starts.
- Restart the Consul service (one server at a time; see Restart Consul Services):
  sudo systemctl restart consul
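Appending with echo or tee adds a second TMPDIR line each time the step is repeated. As a minimal sketch, assuming a hypothetical helper named set_env_var, the update can be made repeatable by dropping any existing assignment before appending the new one. The consul user and group and the /opt/consul/snapshot_temp path in the example are assumptions.

```shell
# Sets KEY=VALUE in an env file: removes any existing KEY= lines, then
# appends the new assignment. Writes back in place so the file keeps its
# owner and mode. Hypothetical helper; needs write access to the file.
set_env_var() {
  local file=$1 key=$2 value=$3 tmp
  tmp=$(mktemp) || return 1
  {
    grep -v "^${key}=" "$file" 2>/dev/null || true
    echo "${key}=${value}"
  } > "$tmp"
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (not run automatically; run as root):
#   install -d -o consul -g consul -m 0750 /opt/consul/snapshot_temp
#   set_env_var /etc/consul.d/consul.env TMPDIR /opt/consul/snapshot_temp
```

Running the helper twice with the same arguments leaves a single TMPDIR line in the file.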
Restart Consul Services
To apply the snapshot directory change, perform a rolling restart of the Consul server agents, one at a time, so that quorum (N/2+1 servers, with N/2 rounded down) is maintained throughout. Begin with the followers and restart the leader last. Use your service manager (e.g., systemd, init.d) to restart each agent.
- Example using systemd:
  sudo systemctl restart consul.service
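The ordering rule (followers first, leader last) can be sketched from the output of consul operator raft list-peers. The helper names are hypothetical, and the parsing assumes the first column is the node name and the fourth column is the Raft state, so check this against the output of your Consul version before relying on it.

```shell
# Minimum number of servers that must stay available in a cluster of N servers.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Reads `consul operator raft list-peers` output on stdin and prints node
# names in restart order: followers first, leader last. Assumes column 1 is
# the node name and column 4 is the state; the header line is skipped.
restart_order() {
  awk 'NR > 1 && $4 == "follower" {print $1}
       NR > 1 && $4 == "leader"   {leader = $1}
       END {if (leader != "") print leader}'
}

# Example (not run automatically):
#   consul operator raft list-peers | restart_order
```

After each restart, it is reasonable to rerun consul operator raft list-peers and confirm the server has rejoined before moving on to the next one.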
Verifying the Updated TMPDIR
To confirm that Consul is using the new temporary directory for snapshots, follow these steps:
- Trigger a manual snapshot:
  consul snapshot save consul_snapshot.snap
- Check Consul logs:
  - Examine the Consul server logs. You should see a debug message indicating the temporary file's location, which should now reflect your updated TMPDIR path. For example:
    2025-01-14T20:29:01.699Z [DEBUG] agent.server: creating temporary file of snapshot: path=/opt/consul/snapshot_temp/snapshot3662148544
- (Optional) Monitor the temporary directory:
  - For a more dynamic view, use the watch command to observe the creation and deletion of the temporary snapshot file:
    watch -n 0.5 'ls -lah /opt/consul/snapshot_temp'
  - This command lists the contents of your TMPDIR every 0.5 seconds, allowing you to see the snapshot file appear and disappear as it's created and then used by Consul.
  - Example watch output:
    Every 0.5s: ls -lah /opt/consul/snapshot_temp    ip-10-0-0-199: Tue Jan 14 17:03:06 2025
    total 37M
    drwxrwxr-x 2 ubuntu ubuntu 4.0K Jan 14 17:03 .
    drwxrwxrwx 5 consul consul 4.0K Jan 14 17:02 ..
    -rw------- 1 consul consul  37M Jan 14 17:03 snapshot3886401733
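The log check can be scripted with a small filter that pulls the path= value out of the debug message shown above. This is a sketch: the function name is hypothetical, and the example invocation assumes that Consul logs go to the systemd journal and that the server log level includes debug messages.

```shell
# Reads Consul log lines on stdin and prints the path= value from each
# "creating temporary file of snapshot" message.
snapshot_tmp_paths() {
  sed -n 's/.*creating temporary file of snapshot: path=\([^ ]*\).*/\1/p'
}

# Example (not run automatically):
#   sudo journalctl -u consul --since "10 minutes ago" | snapshot_tmp_paths
```

The printed paths should start with the directory you configured in TMPDIR. If they still start with /tmp, the server process has not picked up the new value.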
Additional Information
- Consul & Raft Consensus Protocol
- Consul CLI Command: consul-snapshot-save