Introduction
This guide provides a clear path for upgrading your multi-datacenter Consul deployment that uses Legacy Access Control Lists (ACLs) on versions before 1.4.0.
Due to updates in the ACL system, ensure your Consul version is at least 1.2.4 before upgrading to the latest 1.6.x release. This is crucial because the 1.6.x series is the final version with support for legacy ACL tokens, which are necessary for the migration process to newer Consul versions.
Expected Outcome
By following this guide, you can upgrade every datacenter to the latest 1.6.x release without interrupting ACL replication, while maintaining the security and integrity of your Consul deployment.
Prerequisites
- A multi-datacenter Consul setup (minimum two datacenters) with ACL replication enabled
- All servers running Consul 1.2.4
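The version requirement can be checked on each server before you start. The following is a minimal sketch, not an official tool: parse_consul_version and version_ge are hypothetical helper names, and the parsing assumes that the first line of consul version output looks like "Consul v1.2.4".

```shell
#!/bin/sh
# Hypothetical helpers; neither function ships with Consul.

# parse_consul_version: read `consul version` output on stdin and print the
# bare version number from the first line ("Consul v1.2.4" -> "1.2.4").
parse_consul_version() {
  sed -n '1s/^Consul v\([0-9][0-9.]*\).*/\1/p'
}

# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Compares up to three numeric components (major, minor, patch).
version_ge() {
  vg_a=$1
  vg_b=$2
  for vg_i in 1 2 3; do
    vg_x=${vg_a%%.*}
    vg_y=${vg_b%%.*}
    [ "$vg_x" -gt "$vg_y" ] && return 0
    [ "$vg_x" -lt "$vg_y" ] && return 1
    # Drop the component just compared; pad with 0 when none remain.
    case $vg_a in *.*) vg_a=${vg_a#*.} ;; *) vg_a=0 ;; esac
    case $vg_b in *.*) vg_b=${vg_b#*.} ;; *) vg_b=0 ;; esac
  done
  return 0
}

# Example (requires the consul binary on PATH):
#   current=$(consul version | parse_consul_version)
#   version_ge "$current" 1.2.4 || echo "upgrade to at least 1.2.4 first" >&2
```

The comparison is done in plain shell rather than with sort -V so that it does not depend on GNU coreutils.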
While this document focuses on ACL changes, be aware of two other important changes in the 1.6.x releases:
- JSON validation: Ensure your service definitions and configurations use valid JSON. Stricter decoding in 1.6.2 may now produce errors for entries such as Connect: null in service definitions. (See GH#6680)
- Connection management: Version 1.6.3 introduces a default limit of 200 connections per client. If your applications require more connections, adjust the http_max_conns_per_client setting accordingly. (See GH#7159)
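Both points can be checked ahead of the upgrade. The sketch below is illustrative only: find_null_connect and print_conn_limit are hypothetical helper names, the grep pattern catches just the one null-valued connect key mentioned above (not every JSON problem), and the value 400 in the usage comment is an arbitrary example. It assumes the limit lives in the limits block of the agent configuration.

```shell
#!/bin/sh
# Hypothetical helpers for pre-upgrade checks; not part of Consul.

# find_null_connect FILE...: print (with line numbers) lines in the given
# service definition files that set a connect key to null.
# Succeeds when at least one match is found.
find_null_connect() {
  grep -n -i -E '"connect"[[:space:]]*:[[:space:]]*null' "$@"
}

# print_conn_limit N: emit a configuration fragment that sets the
# per-client HTTP connection limit to N.
print_conn_limit() {
  printf 'limits {\n  http_max_conns_per_client = %s\n}\n' "$1"
}

# Example:
#   find_null_connect /etc/consul.d/*.json
#   print_conn_limit 400 > /etc/consul.d/limits.hcl
```

The paths in the example comment are placeholders; substitute the directory that holds your own agent configuration.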
Use Case
This article is essential for Consul users who:
- Manage multi-datacenter Consul deployments
- Are using older versions of Consul with legacy ACLs (pre-1.4.0)
- Want to upgrade to newer versions of Consul to benefit from the latest features and security enhancements
Procedure
We will illustrate the migration process with a three-datacenter setup, consisting of DC1 (our primary datacenter), DC2, and DC3.
Important: Upgrade your primary datacenter (defined by acl_datacenter) last. Upgrading it first disrupts ACL replication. When secondary datacenters are upgraded first, they continue to recognize legacy ACLs replicated from the primary datacenter.
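If the upgrade is scripted, the ordering rule can be encoded once so that the primary is never touched first. This is a small sketch under that assumption; upgrade_order is a hypothetical helper name, and the datacenter names match the example setup in this guide.

```shell
#!/bin/sh
# upgrade_order PRIMARY DC...: print the given datacenters one per line in
# upgrade order, with every secondary first and the primary last.
# Hypothetical helper; not part of Consul.
upgrade_order() {
  uo_primary=$1
  shift
  for uo_dc in "$@"; do
    # Skip the primary here so that it can be emitted at the very end.
    [ "$uo_dc" = "$uo_primary" ] || printf '%s\n' "$uo_dc"
  done
  printf '%s\n' "$uo_primary"
}

# Example:
#   upgrade_order dc1 dc1 dc2 dc3
```

For the three-datacenter example, the call in the comment prints dc2, dc3, and then dc1.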
- Verify ACL replication in your primary datacenter (DC1) by running the following command from a Consul server within that datacenter. Use double quotes around the header so that the shell expands $MASTER_TOKEN:
  curl -s -H "X-Consul-Token: $MASTER_TOKEN" localhost:8500/v1/acl/replication | jq
  You should see output similar to this (replication is reported as disabled because the primary datacenter is the source of ACL data):
  { "Enabled": false, "Running": false, "SourceDatacenter": "", "ReplicatedIndex": 0, "LastSuccess": "0001-01-01T00:00:00Z", "LastError": "0001-01-01T00:00:00Z" }
- Verify ACL replication in your secondary datacenter (DC2) by running the following command from a Consul server within that datacenter:
  curl -s -H "X-Consul-Token: $MASTER_TOKEN" localhost:8500/v1/acl/replication | jq
  You should see output similar to this:
  { "Enabled": true, "Running": true, "SourceDatacenter": "dc1", "ReplicatedIndex": 24, "LastSuccess": "2020-09-08T15:09:05Z", "LastError": "0001-01-01T00:00:00Z" }
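The replication check on a secondary datacenter can also be automated instead of read by eye. The following is a rough sketch: replication_healthy is a hypothetical helper name, and it matches the Enabled and Running fields with grep rather than parsing the JSON properly, which is an assumption that holds for responses shaped like the ones shown in this guide.

```shell
#!/bin/sh
# replication_healthy: read the JSON returned by /v1/acl/replication on
# stdin and succeed only when both "Enabled" and "Running" are true.
# Hypothetical helper; uses pattern matching, not a real JSON parser.
replication_healthy() {
  rh_json=$(cat)
  printf '%s\n' "$rh_json" | grep -Eq '"Enabled"[[:space:]]*:[[:space:]]*true' &&
    printf '%s\n' "$rh_json" | grep -Eq '"Running"[[:space:]]*:[[:space:]]*true'
}

# Example (run on a server in a secondary datacenter):
#   curl -s -H "X-Consul-Token: $MASTER_TOKEN" \
#     localhost:8500/v1/acl/replication | replication_healthy \
#     || echo "ACL replication is not running" >&2
```

Run against the primary datacenter, this check is expected to fail, because the primary does not replicate from anywhere.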
- Upgrade the Consul agents in DC2 and DC3 to the latest 1.6.x release. Keep the DC1 agents at version 1.2.4 for now. After upgrading DC2 and DC3, you should see log messages similar to the following:
2020/09/08 15:51:29 [DEBUG] acl: Cannot upgrade to new ACLs, servers in acl datacenter have not upgraded - found servers: true, mode: 3
- Verify ACL replication is still functioning correctly in DC3:
  - From a Consul server in DC3, record the current replication status and token list:
    curl -s -H "X-Consul-Token: $MASTER_TOKEN" localhost:8500/v1/acl/replication | jq
    curl -H "X-Consul-Token: $MASTER_TOKEN" localhost:8500/v1/acl/list | jq
  - From a Consul server in DC1, create a new token:
    curl -X PUT -H "X-Consul-Token: $MASTER_TOKEN" -d @/policies/ui-policy.json localhost:8500/v1/acl/create
  - From a Consul server in DC3, run the same two commands again:
    curl -s -H "X-Consul-Token: $MASTER_TOKEN" localhost:8500/v1/acl/replication | jq
    curl -H "X-Consul-Token: $MASTER_TOKEN" localhost:8500/v1/acl/list | jq
  - After creating the new token in DC1, the ReplicatedIndex value in DC3 should increase. If you attempt to use ACL commands from the Consul CLI, you'll encounter an error like this:
    Failed to retrieve the token list: Unexpected response code: 500 (The ACL system is currently in legacy mode.)
    This error occurs because DC3 is currently in legacy ACL mode, where the Consul CLI commands are not fully supported. For now, continue using the HTTP API endpoints for ACL management, as demonstrated in the curl examples above.
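The before-and-after comparison of ReplicatedIndex in DC3 can be scripted. This is a sketch under assumptions: replicated_index and index_advanced are hypothetical helper names, and the index is extracted with sed rather than a JSON parser, which assumes a response shaped like the examples in this guide.

```shell
#!/bin/sh
# replicated_index: read the JSON returned by /v1/acl/replication on stdin
# and print the numeric ReplicatedIndex value.
# Hypothetical helper; joins lines first so pretty-printed JSON also works.
replicated_index() {
  tr -d '\n' |
    sed -n 's/.*"ReplicatedIndex"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# index_advanced BEFORE AFTER: succeed when AFTER is strictly greater.
index_advanced() {
  [ "$2" -gt "$1" ]
}

# Example (run on a server in DC3):
#   before=$(curl -s -H "X-Consul-Token: $MASTER_TOKEN" \
#     localhost:8500/v1/acl/replication | replicated_index)
#   ... create the token in DC1, then wait for replication ...
#   after=$(curl -s -H "X-Consul-Token: $MASTER_TOKEN" \
#     localhost:8500/v1/acl/replication | replicated_index)
#   index_advanced "$before" "$after" || echo "index did not advance" >&2
```

Replication is not instantaneous, so allow some time between creating the token and taking the second reading.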
- Upgrade the Consul agents in DC1 to the latest 1.6.x release.
- Before proceeding, validate the health and stability of your Consul cluster:
  - Checking cluster membership: Use the consul members command to ensure all servers are correctly joined and in a healthy state.
  - Examining Raft peer status: Run consul operator raft list-peers to confirm that the Raft consensus protocol is functioning correctly and all servers are participating in the quorum.
  - Monitoring logs: Keep an eye on the Consul server logs for any unusual errors or warnings that might indicate issues.
- Migrate your legacy ACL tokens to the new system by following the instructions in our Migrate Legacy ACL Tokens tutorial.
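The membership and Raft checks from the validation step can be condensed into two filters. The sketch below is an illustration, not an official health check: unhealthy_members and raft_leader_count are hypothetical helper names, and the column positions (Status as the third column of consul members, State as the fourth column of consul operator raft list-peers) are assumptions based on the usual tabular output of those commands.

```shell
#!/bin/sh
# unhealthy_members: read `consul members` output on stdin and print the
# name of every node whose Status column is not "alive".
# Exits non-zero when at least one such node is found.
unhealthy_members() {
  awk 'NR > 1 && $3 != "alive" { print $1; bad = 1 } END { exit bad + 0 }'
}

# raft_leader_count: read `consul operator raft list-peers` output on stdin
# and print how many peers report the "leader" state.
raft_leader_count() {
  awk 'NR > 1 && $4 == "leader" { n++ } END { print n + 0 }'
}

# Example (requires the consul binary and, with ACLs enabled, a token):
#   consul members | unhealthy_members || echo "some nodes are not alive" >&2
#   [ "$(consul operator raft list-peers | raft_leader_count)" = "1" ] \
#     || echo "expected exactly one Raft leader" >&2
```

These filters complement the log review and do not replace it; a cluster can list every member as alive and still log replication or ACL errors.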
Post-Upgrade Configuration Changes
As part of the ACL system enhancements, several configuration options have been renamed. To avoid any issues, make the following adjustments to your Consul server configurations before upgrading beyond the 1.6.x series:
- acl_datacenter is now primary_datacenter (see primary_datacenter for more info)
- acl_*_token options are now specified like this (see tokens for more info):
  tokens {
    master = "..."
    agent = "..."
    agent_master = "..."
    replication = "..."
    default = "..."
  }
  Note: master was renamed in Consul 1.11 to acl.tokens.initial_management, and agent_master was renamed in Consul 1.11 to acl.tokens.agent_recovery.
- acl_default_policy, acl_down_policy, acl_ttl, and enable_acl_replication options are now specified like this (see acl for more info):
  acl {
    enabled = true/false
    default_policy = "..."
    down_policy = "..."
    policy_ttl = "..."
    role_ttl = "..."
    enable_token_replication = true/false
    enable_token_persistence = true/false
  }
Ensure your configuration changes are correct by running the following command:
consul validate $CONFIG_FILE_PATH
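Before running the validation, it can help to confirm that no renamed option is left behind in a configuration file. The following is a minimal sketch: find_legacy_acl_keys is a hypothetical helper name, and the pattern covers only the option names listed in this section.

```shell
#!/bin/sh
# find_legacy_acl_keys FILE: print (with line numbers) every line of FILE
# that still uses one of the legacy ACL option names from this guide.
# Succeeds when at least one legacy option is found.
# Hypothetical helper; not part of Consul.
find_legacy_acl_keys() {
  grep -n -E '(^|[^A-Za-z0-9_.])(acl_datacenter|acl_default_policy|acl_down_policy|acl_ttl|enable_acl_replication|acl_[a-z_]*token)([^A-Za-z0-9_]|$)' "$1"
}

# Example:
#   if find_legacy_acl_keys "$CONFIG_FILE_PATH"; then
#     echo "legacy ACL options still present" >&2
#   else
#     consul validate "$CONFIG_FILE_PATH"
#   fi
```

The pattern works on both HCL and JSON files because it matches the option name itself and ignores the surrounding quoting.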
Additional Information
- Migrate Legacy ACL Tokens - This tutorial guides you through migrating legacy ACL tokens in Consul, ensuring your access control policies remain effective after upgrading to versions 1.4.0 and later. It outlines different migration methods and provides helpful scripts to automate the process.