Error: timeout while waiting for plugin to start – HashiCorp Help Center

Introduction

This article describes how to verify whether the Terraform error "timeout while waiting for plugin to start" is caused by the worker not having enough memory to complete the run.

Problem

A run in Terraform Cloud or Terraform Enterprise might fail with the following error message:

|
| Error: timeout while waiting for plugin to start
|

Cause

This error can occur when the worker that executes the run in Terraform Cloud or Terraform Enterprise does not have enough memory to complete it.

Follow the steps below to verify this in the run log:

Go to the workspace that is having the issues
Under Variables -> add the following variable
- Environment variable
- Key -> TF_LOG
- Value -> TRACE
Under Settings -> General -> User Interface
- Select Console UI
- Save settings
Start another run that will fail with the same error
Download the run details from the workspace by clicking on the "View raw log" button
Open the file and look for a message containing the word "killed", as in the example below:

2023-05-31T11:29:51.574Z [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/hashicorp/aws/4.65.0/linux_amd64/terraform-provider-aws_v4.65.0_x5 pid=254 error="signal: killed"
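On a trace-level log, searching by eye can be slow. The following is a minimal sketch of how the downloaded raw log could be filtered for plugin processes that were killed. The function name `find_killed_plugins` and the file name `raw.log` are illustrative assumptions, not part of Terraform. The search strings are taken from the example message above.

```shell
# List plugin processes that exited because they were killed by a signal.
# Usage: find_killed_plugins <path-to-raw-log>
# Exit status is 0 when at least one matching line is found, non-zero otherwise.
find_killed_plugins() {
    # -F matches the text literally; the double quotes are part of the log message.
    grep -F 'plugin process exited' "$1" | grep -F 'error="signal: killed"'
}

# Example (raw.log is a hypothetical name for the downloaded file):
#   find_killed_plugins raw.log
```

If the function prints nothing, the log contains no killed plugin process, and the timeout probably has a different cause.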

Solution 1:

Use fewer providers and resources in your code by splitting them across multiple workspaces. Each run will then use less memory.

Solution 2:

For Terraform Enterprise, you can increase the amount of memory a worker is allowed to use. This is done through capacity management, which is described in the Terraform Enterprise documentation.

Check how much memory a worker is currently allowed to use:

replicatedctl app-config export --template "{{.capacity_memory.Value}}"

Increase the value, for example to 2048MB:

replicatedctl app-config set capacity_memory --value "2048"

Restart the Terraform Enterprise application for the setting to take effect:

replicatedctl app stop
replicatedctl app start
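The check, change and restart steps above can be combined into one guarded helper. This is only a sketch: the function name `raise_capacity_memory` is hypothetical, and it uses nothing beyond the `replicatedctl` commands already shown. It assumes that the export command prints a plain number of megabytes and that `app stop` has finished before `app start` runs.

```shell
# Raise the worker memory limit only when it is below the target, then restart.
# Usage: raise_capacity_memory <target-MB>
raise_capacity_memory() {
    target_mb=$1
    # Assumption: the template prints a bare number such as 512.
    current_mb=$(replicatedctl app-config export --template "{{.capacity_memory.Value}}") || return 1

    # Refuse to continue if either value is not a plain whole number.
    for value in "$target_mb" "$current_mb"; do
        case "$value" in
            ''|*[!0-9]*) echo "not a whole number: '$value'" >&2; return 1 ;;
        esac
    done

    if [ "$current_mb" -ge "$target_mb" ]; then
        echo "capacity_memory is already ${current_mb}MB; nothing to change"
        return 0
    fi

    replicatedctl app-config set capacity_memory --value "$target_mb" || return 1

    # Restart so the new limit takes effect. Assumption: stop has completed
    # before start runs; you may need to wait in between on a busy host.
    replicatedctl app stop || return 1
    replicatedctl app start
}

# Example:
#   raise_capacity_memory 2048
```

Skipping the restart when the limit is already high enough avoids unnecessary downtime for the application.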

Start another run to verify that the issue is resolved.

Please be aware of the following

The default memory limit for the Terraform worker container is 512MB. Multiplied by the default concurrency limit of 10, that means you need 5GB of memory just for Terraform runs. You will need another 4GB for Terraform Enterprise itself, plus some left over for the OS.

If you increase the memory limit from 512MB to 2048MB, keep in mind that you still need 4GB for Terraform Enterprise and a few more GB for the OS. You may need to increase the amount of RAM on your server. Otherwise, you will run into the same errors again, along with performance issues.
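The arithmetic in the two paragraphs above can be sketched as a small calculation. The function name `baseline_ram_mb` is an illustrative assumption. It covers only the memory for Terraform runs plus the 4GB for Terraform Enterprise, so the extra headroom for the OS still has to be added on top.

```shell
# Baseline RAM in MB: memory for concurrent Terraform runs plus Terraform Enterprise.
# OS headroom is NOT included.
# Usage: baseline_ram_mb <memory-limit-MB> <concurrency>
baseline_ram_mb() {
    tfe_overhead_mb=4096   # the 4GB for Terraform Enterprise mentioned above
    echo $(( $1 * $2 + tfe_overhead_mb ))
}

baseline_ram_mb 512 10    # prints 9216  (5GB for runs + 4GB)
baseline_ram_mb 2048 10   # prints 24576 (20GB for runs + 4GB)
```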

To give you an idea of the sizing:

16GB RAM for 10 concurrency and 512MB memory limit
24GB RAM for 10 concurrency and 1024MB memory limit
32GB RAM for 10 concurrency and 2048MB memory limit

Outcome

After making the suggested changes, the run should complete without issues. If you are still seeing issues, please create a ticket with HashiCorp Support and attach the log file of the failed run.

Additional Information

Capacity management is described in our Terraform Enterprise documentation.