Prerequisites
This problem can occur with other Providers and across a variety of configurations. The details below reflect one example.
Problem
While deploying a Terraform configuration that creates a PostgreSQL Database, PostgreSQL Server, and AzureRM Storage Account, the following error is encountered partway through the apply.
plugin.terraform-provider-azurerm_v2.66.0_x5: AzureRM Response for https://management.azure.com/subscriptions/xxxx-xx-xxxx-xxxx/resourceGroups/test-rg/providers/Microsoft.DBForPostgreSQL/servers/test-server?api-version=2017-12-01:
HTTP/1.1 409 Conflict
Content-Length: 135
Cache-Control: no-cache, proxy-revalidate
Connection: Keep-Alive
Content-Type: application/json; charset=utf-8
Date: Thu, 08 Jul 2021 14:53:29 GMT
Expires: -1
Pragma: no-cache
Proxy-Connection: Keep-Alive
Server: Microsoft-HTTPAPI/2.0
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
X-Ms-Correlation-Request-Id: xxxxxxxx
X-Ms-Ratelimit-Remaining-Subscription-Writes: 1197
X-Ms-Request-Id: xxxxxxxx
X-Ms-Routing-Request-Id: CANADACENTRAL:xxxxxxxx
{"error":{"code":"ServiceBusy","message":"Service is temporarily busy and the operation cannot be performed. Please try again later."}}:
[DEBUG] module.postgres-sql.azurerm_postgresql_server.main: apply errored, but we're indicating that via the Error pointer rather than returning it: creating PostgreSQL Server "test-server" (Resource Group "test-rg"): postgresql.ServersClient#Create: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="ServiceBusy" Message="Service is temporarily busy and the operation cannot be performed. Please try again later."
[TRACE] EvalMaybeTainted: module.postgres-sql.azurerm_postgresql_server.main encountered an error during creation, so it is now marked as tainted
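When scanning debug logs for this failure, it can help to pull the error code out of the JSON body that follows the 409 Conflict headers. The following is a minimal Python sketch, not part of Terraform or the AzureRM provider; the function name azure_error_code is hypothetical, and it assumes the body has the {"error": {"code": ...}} shape shown in the log above.

```python
import json


def azure_error_code(body: str):
    """Return the "code" field from an Azure-style JSON error body, or None.

    Assumes the {"error": {"code": ..., "message": ...}} shape seen in the
    log excerpt; any other shape yields None.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return error.get("code")


# Body copied from the 409 Conflict response in the log above.
body = (
    '{"error":{"code":"ServiceBusy","message":"Service is temporarily busy '
    'and the operation cannot be performed. Please try again later."}}'
)
print(azure_error_code(body))
```

A "ServiceBusy" code on its own does not confirm the firewall cause described below; it only identifies the symptom.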
Cause
The Firewall's HTTP RFC Retry-After response-header field included the type 202 (Accepted). As a result, the PUT calls were flagged by the Firewall and marked as a threat, which blocked further retries.
PUT /subscriptions/xxxx-xx-xxx-xxxxx/resourceGroups/test-rg/providers/Microsoft.DBForPostgreSQL/servers/test-server/databases/test-db?api-version=2017-12-01 HTTP/1.1
HTTP/2.0 202 Accepted
Adding a time_sleep resource does not work because the root cause is not related to waiting for the Service to become available again.
Solution
Check with a Server Administrator to ensure the Firewall HTTP RFC Retry-After response-header field is set with all appropriate types. The administrator of this server should validate why this field includes type 202 (Accepted) responses and, with their approval, proceed with removing it. Microsoft confirmed that "the Retry-After in the response header of the PUT request is by design."
Outcome
After exempting 202 (Accepted) on the Firewall's HTTP RFC Retry-After response-header field, the deployment worked as expected, and the 409 Conflict response and "ServiceBusy" error from Azure did not recur.
Additional Information
Retry-After General Reference
The Retry-After response-header field can be used with a 503 (Service Unavailable) response to indicate how long the service is expected to be unavailable to the requesting client. This field MAY also be used with any 3xx (Redirection) response to indicate the minimum time the user-agent is asked to wait before issuing the redirected request. The value of this field can be either an HTTP-date or an integer number of seconds (in decimal) after the time of the response.
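To illustrate the two value formats described above, here is a minimal Python sketch that converts a Retry-After value into a number of seconds to wait. It uses only the standard library; the function name retry_after_seconds and the choice to pass the current time in as a parameter are assumptions made for this example, not part of Terraform, the AzureRM provider, or any firewall product.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value: str, now: datetime):
    """Convert a Retry-After header value to a delay in whole seconds.

    Accepts either a decimal integer number of seconds or an HTTP-date.
    Returns None when the value matches neither format.
    """
    value = value.strip()
    # Form 1: integer number of seconds after the time of the response.
    if value.isascii() and value.isdigit():
        return int(value)
    # Form 2: HTTP-date, for example "Thu, 08 Jul 2021 14:54:29 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means no further wait is required.
    return max(0, int((when - now).total_seconds()))


# "now" is fixed to the Date header from the log above for a repeatable demo.
now = datetime(2021, 7, 8, 14, 53, 29, tzinfo=timezone.utc)
print(retry_after_seconds("120", now))
print(retry_after_seconds("Thu, 08 Jul 2021 14:54:29 GMT", now))
```

Passing the reference time explicitly keeps the function deterministic; a real client would typically use the time at which the response was received.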