Problem
After migrating from a standalone Terraform Enterprise (TFE) installation to an Active/Active installation on version v202302-1, runs in most workspaces fail with the following error when using the "Remote" execution mode.
Operation failed: failed running terraform plan (exit -1)
Workspaces configured to use private or custom agents are not affected by this issue.
Cause
This issue is caused by a known problem in TFE version v202302-1 related to the agent run pipeline mode.
Solutions
There are two potential solutions to resolve this issue.
Solution 1: Revert to Legacy Worker Mode
You can switch the run pipeline mode to legacy to bypass the issue with the agent mode.
For a standalone TFE installation, run the following commands to apply the configuration.
$ replicatedctl app-config set runpipelinemode --value 'legacy' $ replicatedctl app apply-config
For an Active/Active TFE installation, run the following command.
$ tfe-admin app-config -k runpipelinemode -v 'legacy'
After reverting, if some runs fail with a killed status, check the support bundle for OOM (Out of Memory) killer logs to confirm that the operating system terminated the Terraform process.
terraform invoked oom-killer
If you find these errors, proceed to Solution 2 to adjust worker capacity.
Solution 2: Adjust Worker Capacity Settings
If the OOM killer is terminating runs, you must adjust the worker capacity settings to meet your system's needs. For more details on these settings, refer to the Capacity and Performance Guide.
For an Active/Active installation, run the following commands to set the memory, concurrency, and CPU capacity. Replace <value> with the appropriate setting for your environment.
$ tfe-admin app-config -k capacity_memory -v <value> $ tfe-admin app-config -k capacity_concurrency -v <value> $ tfe-admin app-config -k capacity_cpus -v <value>
After adjusting the capacity, restart TFE for the changes to take effect.
Outcome
After applying one of the solutions, workspace runs should complete successfully.
Additional Information
If you need to switch back to the agent run pipeline mode after resolving the underlying issue, you can do so by setting the runpipelinemode value to an empty string.
For a standalone TFE installation, run these commands.
$ replicatedctl app-config set runpipelinemode --value '' $ replicatedctl app apply-config
For an Active/Active TFE installation, run this command.
$ tfe-admin app-config -k runpipelinemode -v ''
Please refer to the following official documentation for more details: