How to Resize Terraform Enterprise Volumes When Disk Space Is Full – HashiCorp Help Center

Problem

Terraform Enterprise is running in mounted disk operational mode, and a volume containing application data has reached capacity. This causes performance issues and errors similar to the following:

[root@ip-10-0-17-195 ~]# journalctl | grep "no space left on device"
Feb 27 21:48:37 ip-10-0-22-107 docker[4541]: ERRO 2023-02-27T21:48:37+00:00 taskqueue/task.go:319 write /var/lib/replicated/db/goleveldb_data/000001.log: no space left on device
...

Cause

HashiCorp provides a recommended base size for the backing storage when using mounted disk operational mode. When an instance fails to meet those size requirements or an organization's usage has simply outgrown the configured storage capacity, Terraform Enterprise will experience performance issues which can lead to downtime.

Solution

This issue can be temporarily resolved by increasing the size of the affected disk and then resizing the relevant partition and file system. Because Terraform Enterprise can be deployed in multiple cloud or on-premises environments, the implementation of these steps may differ slightly. A general procedure is outlined below; where appropriate, links to external cloud provider documentation are provided and should be consulted for details on their respective storage solutions.

Procedure

Identify which filesystem is out of space using the df utility:

[root@ip-10-0-17-195 ~]# df -hT
Filesystem   Type     Size  Used Avail Use% Mounted on
devtmpfs     tmpfs    2.0G     0  2.0G   0% /dev
tmpfs        tmpfs    2.0G     0  2.0G   0% /dev/shm
tmpfs        tmpfs    2.0G  2.7M  2.0G   1% /run
tmpfs        tmpfs    2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/xvda1   xfs       50G   50G  651M  99% /
tmpfs        tmpfs    392M     0  392M   0% /run/user/1000
overlay      overlay   50G   50G  651M  99% /var/lib/docker/overlay2/3e0c95ad8522c3b22991abc051527e44f7a592279df39c7010c5336a955caa94/merged
...

In the example above, /dev/xvda1 has run out of its 50G of disk space. In mounted disk operational mode, Terraform Enterprise stores application data in the directory specified under the disk_path setting.

[root@ip-10-0-17-195 ~]# replicatedctl app-config export --template '{{ .disk_path.Value }}'
/opt/terraform-enterprise

Inspecting the disk usage of that directory shows that Terraform Enterprise data accounts for most of the provisioned 50G disk:

[root@ip-10-0-17-195 ~]# du -hs /opt/terraform-enterprise/
41G /opt/terraform-enterprise/
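
To confirm which filesystem backs the data directory, df can be pointed at the path itself. The following is a minimal sketch rather than part of the official procedure; the helper name fs_for_path is hypothetical, and /opt/terraform-enterprise is simply the disk_path value from the example above.

```shell
# Print the device, use percentage, and mount point of the filesystem
# that contains the given path. fs_for_path is a hypothetical helper name.
fs_for_path() {
  # -P selects the POSIX output format, which keeps each entry on one line.
  df -P "$1" | awk 'NR == 2 { print $1, $5, $6 }'
}

# Example call, using the disk_path value from this article:
# fs_for_path /opt/terraform-enterprise
```

In the example environment, this would be expected to report /dev/xvda1 mounted on /, since no separate mount for the data directory appears in the df output shown earlier.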

After identifying the disk which is out of space, it is essential to take a snapshot in case the upcoming changes need to be rolled back. Refer to the relevant cloud provider documentation below to take a snapshot of the disk:

For example, with AWS this can be done by navigating to the EC2 console, selecting Snapshots, and clicking Create snapshot:
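
The same snapshot can also be taken with the AWS CLI. The sketch below assumes the AWS CLI is installed and configured with credentials; the function name, the DRY_RUN switch, and the description text are illustrative choices, and the volume ID in the example call is a placeholder.

```shell
# Create a snapshot of an EBS volume before resizing it.
# Set DRY_RUN=1 to print the command instead of calling AWS.
# create_pre_resize_snapshot is a hypothetical helper name.
create_pre_resize_snapshot() {
  volume_id="$1"
  ${DRY_RUN:+echo} aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "Pre-resize snapshot of Terraform Enterprise volume"
}

# Example call with a placeholder volume ID:
# create_pre_resize_snapshot vol-0123456789abcdef0
```

Snapshot creation is asynchronous, so it is worth confirming that the snapshot has completed (for example with aws ec2 wait snapshot-completed) before modifying the volume.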


Increasing the size of the mounted disk will require adjusting the disk's settings via the cloud provider's console, CLI, or API. Refer to the relevant cloud provider documentation below to increase the size of the disk:

In AWS, this is done by navigating to EC2 -> Volumes, selecting the volume attached to the instance, selecting Modify, and increasing the size.
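
As a CLI alternative to the console steps, the volume can be resized with aws ec2 modify-volume. This is a sketch under the same assumptions as any AWS CLI usage (installed and configured); the function name, the DRY_RUN switch, and the input check are illustrative, and the volume ID and size in the example call are placeholders.

```shell
# Request a new size (in GiB) for an EBS volume.
# Set DRY_RUN=1 to print the command instead of calling AWS.
# resize_volume is a hypothetical helper name.
resize_volume() {
  volume_id="$1"
  new_size_gib="$2"
  # Reject anything that is not a whole number, such as "100G".
  case "$new_size_gib" in
    ''|*[!0-9]*)
      echo "size must be a whole number of GiB" >&2
      return 1
      ;;
  esac
  ${DRY_RUN:+echo} aws ec2 modify-volume \
    --volume-id "$volume_id" \
    --size "$new_size_gib"
}

# Example call with placeholder values:
# resize_volume vol-0123456789abcdef0 100
```

The progress of the modification can be checked with aws ec2 describe-volumes-modifications before moving on to the partition and filesystem steps.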


After the size of the disk has been increased, access the Terraform Enterprise instance and confirm the new size:

[root@ip-10-0-17-195 ~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  100G  0 disk
└─xvda1 202:1    0   50G  0 part /

Extend the partition using the growpart or parted utility. Note that growpart takes the disk device and the partition number as two separate arguments:

[root@ip-10-0-17-195 ~]# growpart /dev/xvda 1
CHANGED: partition=1 start=4096 old: size=104853471 end=104857567 new: size=209711071 end=209715167
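
The sizes in the growpart output are sector counts rather than bytes. As a rough check, and assuming 512-byte sectors (an assumption; verify the sector size for your disk), they can be converted to GiB as sketched below. The helper name to_gib is hypothetical.

```shell
# Convert a count of fixed-size units (for example, sectors) to GiB.
# to_gib is a hypothetical helper name.
to_gib() {
  awk -v count="$1" -v unit="$2" \
    'BEGIN { printf "%.1f\n", count * unit / 1073741824 }'
}

to_gib 104853471 512   # old partition size, prints 50.0
to_gib 209711071 512   # new partition size, prints 100.0
```

The results line up with the 50G and 100G partition sizes that lsblk reports before and after the change.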

Confirm the partition was extended:

[root@ip-10-0-17-195 ~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  100G  0 disk 
└─xvda1 202:1    0  100G  0 part /

Expand the size of the mounted filesystem using one of the following utilities, depending on the filesystem type (confirm with df -T):

xfs: xfs_growfs -d <MOUNTPOINT>
ext4: resize2fs <DEVICE>

[root@ip-10-0-17-195 ~]# xfs_growfs -d /
meta-data=/dev/xvda1             isize=512    agcount=26, agsize=524159 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=13106683, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 13106683 to 26213883
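
The choice between the two utilities depends only on the filesystem type, so it can be captured in a small helper. The following is a minimal sketch that prints the appropriate command instead of running it; the function name is hypothetical, and the device and mount point in the example call are the values from this article.

```shell
# Print the command that grows a filesystem of the given type.
# Arguments: filesystem type, block device, mount point.
# grow_filesystem_cmd is a hypothetical helper name.
grow_filesystem_cmd() {
  fstype="$1"
  device="$2"
  mountpoint="$3"
  case "$fstype" in
    xfs)
      # xfs_growfs operates on the mount point of a mounted filesystem.
      echo "xfs_growfs -d $mountpoint"
      ;;
    ext2|ext3|ext4)
      # resize2fs operates on the block device.
      echo "resize2fs $device"
      ;;
    *)
      echo "unsupported filesystem type: $fstype" >&2
      return 1
      ;;
  esac
}

# Example call with the values from this article:
# grow_filesystem_cmd xfs /dev/xvda1 /
```

Printing the command first makes it easy to review before running it as root.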

View the file system disk space usage to confirm the change:

[root@ip-10-0-17-195 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        2.0G     0  2.0G   0% /dev
tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs           2.0G  2.7M  2.0G   1% /run
tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/xvda1      100G   50G   51G  50% /
tmpfs           392M     0  392M   0% /run/user/1000
overlay         100G   50G   51G  50% /var/lib/docker/overlay2/3e0c95ad8522c3b22991abc051527e44f7a592279df39c7010c5336a955caa94/merged
...

Closing

While the steps above can be used as a temporary solution when facing performance issues or downtime, it is important that an organization evaluate its usage patterns and plan long-term remediation. For example, migrating to the external services operational mode might make sense for scalability, depending on the needs of the organization. Additionally, installing system-level monitoring or configuring alerts in Amazon CloudWatch, Azure Monitor, or Google Cloud Monitoring can help anticipate low disk space, giving organizations time to perform necessary system maintenance and avoid downtime.
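
As one possible form of system-level monitoring, a small script run on a schedule can flag filesystems that cross a usage threshold. This is only a sketch and not a replacement for a full monitoring solution; the function name flag_full and the 90 percent threshold in the example call are hypothetical choices.

```shell
# Read `df -P` output on standard input and print every filesystem
# whose usage is at or above the given percentage.
# flag_full is a hypothetical helper name.
flag_full() {
  threshold="$1"
  awk -v limit="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)          # strip the trailing percent sign
      if (use + 0 >= limit) print $6 " is at " $5
    }'
}

# Example call, suitable for a cron job:
# df -P | flag_full 90
```

The output could be mailed or forwarded to an alerting system so that a resize can be scheduled before the disk fills.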
