Problem
Runs complete without issues, but after every run the Kubernetes pod hosting the Terraform Cloud Agent restarts.
In the events you see the following:
$ kubectl events -n NAMESPACE
Warning BackOff 46s kubelet Back-off restarting failed container tfc-agent in pod tfc-agent-deployment-57d779f795-sdg2p_terraform-enterprise(da3dc50c-e9ba-4d54-8c77-143ac450333b)
The get pods command shows the following:
$ kubectl -n NAMESPACE get pods
NAME                                    READY   STATUS             RESTARTS      AGE
tfc-agent-deployment-854d77dc4c-68b8g   0/1     CrashLoopBackOff   8 (29s ago)   8m27s
Prerequisites
- Terraform Enterprise Cloud Agents running on Kubernetes
- Cloud Agents configured to run in single mode (TFC_AGENT_SINGLE set to "true")
Cause
In single mode, a Terraform Cloud Agent handles at most one run and then exits, so the tfc-agent container terminates after every run.
Example configuration used with single mode:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tfc-agent-deployment
  namespace: terraform-enterprise-agents
  labels:
    app: tfc-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tfc-agent
  template:
    metadata:
      labels:
        app: tfc-agent
    spec:
      containers:
        - name: tfc-agent
          image: hashicorp/tfc-agent:1.16.0
          env:
            - name: TFC_AGENT_TOKEN
              value: "<secret-token>"
            - name: TFC_ADDRESS
              value: "https://<fqdn>"
            - name: TFC_AGENT_NAME
              value: "agent"
            - name: TFC_AGENT_SINGLE
              value: "true"
            - name: TFC_AGENT_AUTO_UPDATE
              value: "disabled"
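Kubernetes sees the container exit and restarts it, as intended: pods managed by a Deployment always use the restart policy Always. With every restart the back-off delay doubles (10s, 20s, 40s, and so on), up to a maximum of 5 minutes, which is why the pod is reported as CrashLoopBackOff even though the runs themselves succeed.
This is default Kubernetes restart behaviour and outside of HashiCorp's control.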
Solutions
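Remove the single mode setting (the TFC_AGENT_SINGLE environment variable) from the Deployment. The agent then keeps running after a Terraform run and waits for the next one, so Kubernetes has no exited container to restart.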
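As a minimal sketch, the containers section of the example Deployment from the Cause section could look like this once the single mode setting is removed. It assumes every other value stays unchanged; the placeholders <secret-token> and <fqdn> are taken from the example and still need real values.

```yaml
# Fragment of spec.template.spec from the example Deployment.
# The TFC_AGENT_SINGLE entry is removed, so the agent is no longer
# started in single mode and does not exit after each run.
containers:
  - name: tfc-agent
    image: hashicorp/tfc-agent:1.16.0
    env:
      - name: TFC_AGENT_TOKEN
        value: "<secret-token>"
      - name: TFC_ADDRESS
        value: "https://<fqdn>"
      - name: TFC_AGENT_NAME
        value: "agent"
      - name: TFC_AGENT_AUTO_UPDATE
        value: "disabled"
```

Applying the updated manifest, for example with kubectl apply -f on the file that holds the Deployment, rolls out a new pod with the changed environment.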
Outcome
Terraform Cloud Agents will keep running and Kubernetes won't restart them after every run.