Introduction
Problem
After making a change in an environment, a previously working ACL process is now giving an error like error creating bootstrap configuration for Connect proxy sidecar: exit status 1
This error message is perhaps one of the symptoms of this issue.
It was noticed that the connect_inject_init container was not starting even with full write permissions in the ACL policy for the sidecar proxy service.
Prerequisites (if applicable)
- Using Consul Connect
Cause
Consul version in use was 1.10.4+ent
.
This version supports legacy ACL modes and the client nodes, upon restart, started up the ACL in legacy modes while the rest of the cluster was using ACL in non-legacy mode after restart thereby leaving them in a broken state.
The trace logs did not show any sign of the ACL check loop running to transition out of ACL legacy mode.
Solutions:
Remediation was to consul leave
on SERVER-1, removing it from the gossip ring. Gossip propagated the leave correctly and quickly, allowing clients to come back up with the correct ACL setting.
Outcome
The best way to confirm this discrepancy in ACL modes is by verifying the output of the command consul members -detailed
.
For example, acls = 2
signifies Legacy Mode.
broken-client% consul members --detailed
Node Address Status Tags
...
SERVER-1 [127.0.0.1:8301](http://127.0.0.1:8301/) alive acls=2,ap=default,build=1.9.14:5b3be833, <DETAILS REMOVED>
This GH card explains the issue.
Additional Information
This won't be an issue for Consul 1.11.x
and up as legacy mode was completely removed and the server will start directly in the new mode.