Problem
Terraform Enterprise (TFE) runs intermittently fail with a timeout during plan filesystem uploads for certain workspaces. A typical error message appears as:
Operation failed: failed uploading filesystem: request failed: PUT https://<hostname>/_archivist/v1/object/...: http2: timeout awaiting response headers
Prerequisites (if applicable)
Terraform Enterprise is installed and operational.
Affected workspaces have completed previous runs successfully.
- Network path includes firewalls and load balancers between agents and the TFE application.
Admin access to TFE logs and the ability to run commands inside the TFE container.
Cause
The root cause is a transient network issue at the TCP/IP level, likely involving an external load balancer or firewall in the path between the agent and the TFE application. The upload to object storage is successful, but the agent closes the connection before receiving a response due to the network interruption.
Solutions:
1. Retrieve Plan Filesystem URLs via Rails Console
Inside the TFE container
Run.find_by(external_id: "<run_external_id>").plan.executable_plan.download_url.to_s
Run.find_by(external_id: "<run_external_id>").plan.executable_plan.upload_url.to_s
2. Simulate Upload/Download Using cURL
To reproduce the issue manually from the TFE container:
# Download the plan archive using the download URL
curl -v -o /tmp/test "https://<hostname>/_archivist<DOWNLOAD_URI>"
# Attempt to upload the archive using the upload URL
curl -v \
-X PUT \
-H "Content-Type: application/octet-stream" \
--data-binary '@/tmp/test' \
"https://<hostname>/_archivist<UPLOAD_URI>"
3. Confirm File Size Is Reasonable
Check that the file is not excessively large, which may cause upload issues:
ls -lh /tmp/test
Long-Term Resolution: Enable Hairpin Addressing
-
Long-Term Solution: Enable Hairpin Addressing
- Route internal requests to the TFE hostname directly to the node’s private IP to keep traffic internal and bypass external network devices.
- Update pod spec with hostAliases and set the TFE_RUN_PIPELINE_DOCKER_EXTRA_HOSTS environment variable.
- Ensure the TFE certificate or its root CA is trusted by the system and custom agent images.
Outcome
Once hairpin addressing was configured and network stability was ensured:
Affected workspaces resumed successful runs.
New plan and apply actions completed without issue.
The error did not reoccur, confirming the root cause was due to transient external network interruptions.
Reference