Overview
Terraform Cloud Agents may fail during runs if they exhaust available memory on the host system. This typically presents as sudden agent crashes, stalled runs, or Terraform operations terminating unexpectedly during plan or apply.
This article explains common symptoms, root causes, how to diagnose memory-related failures, and steps to remediate and prevent them.
Symptoms
One or more of the following may be observed:
• Terraform runs stuck in “Planning” or “Applying” before failing
• Runs failing without clear Terraform errors
• Agent logs showing:
• OOMKilled
• signal: killed
• exit code 137
• Agent container or process restarting unexpectedly
• Host-level alerts indicating high memory usage
• Terraform Cloud UI showing the agent as offline intermittently
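For Docker-based agents, one quick way to confirm the OOMKilled / exit code 137 symptom is to inspect the container state (the container name terraform-agent is a placeholder):
# Exit code 137 (128 + SIGKILL) together with OOMKilled=true means the kernel killed the process
docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' terraform-agent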
Common Causes
1. Large Terraform Plans
Terraform loads the entire dependency graph into memory during planning, so memory pressure increases with any of the following (see the sketch below):
• Large state files
• Extensive use of for_each or count
• Many modules or deeply nested modules
• Large provider schemas (e.g., AWS, Azure, Google)
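As an illustrative (hypothetical) example, a single for_each over a large collection multiplies the number of resource instances, and therefore graph nodes, Terraform must hold in memory:
variable "teams" {
  type    = set(string)
  default = ["alpha", "beta", "gamma"] # imagine hundreds of entries
}

# Every element of var.teams becomes its own resource instance in the plan graph
resource "aws_iam_user" "team_user" {
  for_each = var.teams
  name     = "svc-${each.value}"
}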
2. Insufficient Host Memory
• Agent running on a VM or container with limited RAM
• Multiple agents or workloads competing for memory on the same host
• Container memory limits set too low (Docker/Kubernetes)
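To check whether a container limit, rather than total host RAM, is the constraint, the configured limit can be read directly (container and pod names are placeholders):
# Docker: configured memory limit in bytes (0 means unlimited)
docker inspect --format '{{.HostConfig.Memory}}' terraform-agent
# Kubernetes: requests and limits set on the agent pod
kubectl get pod <agent-pod-name> -o jsonpath='{.spec.containers[0].resources}'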
3. Provider Behavior
Some providers are memory-intensive, especially when:
• Refreshing many resources
• Using data sources that enumerate large APIs
• Managing many resources in a single workspace
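For example, a broad data source such as the following (illustrative only) can pull back thousands of objects and hold them all in memory during refresh; the actual cost depends on the provider and the API:
# Enumerates every Amazon-owned AMI ID in the region, a very large result set
data "aws_ami_ids" "all_amazon" {
  owners = ["amazon"]
}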
4. Parallelism Settings
High parallelism increases memory usage:
• Default Terraform parallelism is 10
• Providers may internally parallelize additional operations
How to Diagnose
Step 1: Check Terraform Cloud Run Logs
Look for abrupt termination or missing error output near the end of a plan/apply.
Step 2: Review Agent Logs
For Docker-based agents:
docker logs terraform-agent
For Kubernetes:
kubectl logs <agent-pod-name>
Look for:
• OOMKilled
• Killed
• Memory allocation failures
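A quick filter over the agent logs often surfaces these markers (container and pod names are placeholders):
# Docker
docker logs terraform-agent 2>&1 | grep -Ei 'oom|killed|cannot allocate memory'
# Kubernetes: include the previous, crashed container instance
kubectl logs <agent-pod-name> --previous | grep -Ei 'oom|killed|cannot allocate memory'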
Step 3: Check Host Memory Usage
On the agent host:
free -h
top
For containers:
docker stats
kubectl describe pod <agent-pod-name>
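If the numbers look suspicious, kernel logs and container status usually confirm whether the OOM killer was involved (assumes a systemd host for journalctl; the pod name is a placeholder):
# Kernel-level OOM events on the agent host
journalctl -k | grep -i "out of memory"
# Kubernetes: reason the previous container instance terminated (e.g., OOMKilled)
kubectl get pod <agent-pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'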
Resolution
Option 1: Increase Available Memory (Recommended)
VM / Bare Metal
• Increase RAM on the agent host
Docker
docker run --memory=8g --memory-swap=8g ...
Kubernetes
resources:
  requests:
    memory: "4Gi"
  limits:
    memory: "8Gi"
Terraform agents commonly require 4–8 GB RAM, and large environments may need more.
Option 2: Reduce Terraform Parallelism
Set lower parallelism in the workspace or run configuration:
terraform plan -parallelism=5
Or via environment variable:
TF_CLI_ARGS_plan="-parallelism=5"
TF_CLI_ARGS_apply="-parallelism=5"
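If workspace variables are managed with the tfe provider, the same environment variables can be set in code; the workspace reference below is a placeholder:
resource "tfe_variable" "plan_parallelism" {
  key          = "TF_CLI_ARGS_plan"
  value        = "-parallelism=5"
  category     = "env"
  workspace_id = tfe_workspace.example.id # placeholder workspace
}

resource "tfe_variable" "apply_parallelism" {
  key          = "TF_CLI_ARGS_apply"
  value        = "-parallelism=5"
  category     = "env"
  workspace_id = tfe_workspace.example.id
}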
Option 3: Reduce Plan Size
• Split large configurations into multiple workspaces
• Break monolithic states into smaller components
• Avoid unnecessary data sources
• Limit refresh-heavy patterns (e.g., frequent terraform refresh or full-refresh runs against large states)
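After splitting, a downstream workspace can read outputs from an upstream workspace instead of sharing one large state; a minimal sketch (organization and workspace names are placeholders):
data "terraform_remote_state" "network" {
  backend = "remote"
  config = {
    organization = "example-org"
    workspaces = {
      name = "network-prod"
    }
  }
}

# Reference upstream values, e.g. data.terraform_remote_state.network.outputs.vpc_id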
Option 4: Reduce Concurrent Agent Workloads
• Run fewer agents per host
• Ensure agents are not colocated with other memory-heavy services
• Use autoscaling (for Kubernetes-based agents)
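On Kubernetes, one simple way to keep agent pods off nodes shared with memory-heavy workloads is a nodeSelector targeting a dedicated node pool; this is a fragment of the Deployment's pod template, and the label is a placeholder:
spec:
  template:
    spec:
      nodeSelector:
        workload: tfc-agents # placeholder node label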
Prevention & Best Practices
• Allocate at least 4 GB RAM per agent, more for large environments
• Avoid running multiple agents on small hosts
• Monitor memory usage with alerts
• Periodically review workspace size and complexity
• Use multiple smaller workspaces instead of a single large one
• Pin provider versions to avoid unexpected memory regressions
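As one example of alert-based monitoring (a sketch assuming Prometheus with cAdvisor container metrics and an agent container named tfc-agent), a rule like the following flags sustained high memory usage before the OOM killer steps in:
groups:
  - name: tfc-agent-memory
    rules:
      - alert: TerraformAgentMemoryHigh
        expr: container_memory_working_set_bytes{container="tfc-agent"} > 6 * 1024 * 1024 * 1024
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Terraform agent memory working set above 6 GiB"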