Introduction
Consul uses TLS to secure communication between its server and client agents. This requires each agent to hold a certificate and private key, with the certificate signed by a common Certificate Authority (CA). These certificates have a limited lifespan and must be rotated periodically to maintain security and cluster functionality. This document outlines the procedures for rotating Consul TLS certificates, covering both proactive and reactive scenarios.
Expected Outcome
Successful rotation of Consul TLS certificates without significant service interruption. This includes:
- All Consul servers and clients communicating securely using TLS with the new certificates.
- Minimal to zero downtime during the rotation process.
- A clear understanding of the steps involved in both proactive and reactive certificate rotations.
Prerequisites
Before starting a certificate rotation, ensure you have the following:
- CA Access: Access to the Certificate Authority (CA) used to generate certificates.
- Consul Installation: Consul installed and configured on all agents.
- Backup: A recent backup of your Consul data and configuration is strongly recommended.
- Consul Command Knowledge: Familiarity with Consul commands (e.g., consul reload, consul tls cert create).
- Certificate Distribution Method: A method for distributing new certificates to all agents (e.g., configuration management, scripts, or manual copy).
- Understanding of Auto-Encryption (if used): If using Consul's auto-encryption, understand how it affects client certificate rotation.
Use Cases
This procedure applies to two primary use cases:
- Proactive Rotation (Valid Certificates): Certificates are still valid but nearing expiration. This rotation aims to prevent future outages.
- Reactive Rotation (Expired Certificates): Certificates have already expired, causing cluster issues. This rotation aims to restore service.
Procedure
Use Case 1: Proactive Rotation (Valid Certificates)
This procedure minimizes downtime by rotating certificates before they expire.
The "Update Consul agents to securely communicate with TLS" tutorial provides the steps to update an existing datacenter to production-ready TLS with zero downtime.
Use Case 2: Reactive Rotation (Expired Certificates)
In this scenario, the TLS certificates have already expired and the Consul cluster is in a degraded state.
Consul Server Agents
To rotate certificates for Consul server agents, complete the following steps:
- Generate New Certificates: Generate new certificates for all server agents.
- Distribute New Certificates: Distribute the new certificates and private keys to each server node.
- Rolling Restart of Consul Servers: Restart one server at a time, and wait for it to rejoin the cluster before restarting the next, so that a quorum of servers remains online.
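The quorum condition behind the rolling restart can be sketched as a small check. This is a minimal illustration, assuming the usual majority rule (more than half of the servers); the function names quorum_size and safe_to_restart_one are hypothetical helpers, not part of Consul.

```python
def quorum_size(total_servers: int) -> int:
    """Return the number of servers that form a strict majority."""
    if total_servers < 1:
        raise ValueError("total_servers must be at least 1")
    return total_servers // 2 + 1


def safe_to_restart_one(total_servers: int, healthy_servers: int) -> bool:
    """Return True if stopping one healthy server still leaves a quorum online.

    total_servers   -- number of servers in the cluster (hypothetical input)
    healthy_servers -- number of servers currently up and participating
    """
    if not 0 <= healthy_servers <= total_servers:
        raise ValueError("healthy_servers must be between 0 and total_servers")
    return healthy_servers - 1 >= quorum_size(total_servers)
```

For example, in a three-server cluster with all servers healthy, one server can be restarted; if only two are healthy, restarting another would drop the cluster below quorum, so wait for the third to rejoin first.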
Consul Client Agents
To rotate certificates for Consul client agents using the operator method, complete the following steps:
- Generate New Certificates: Generate new client certificates.
- Distribute New Certificates: Distribute the new certificates and private keys to each client node.
- Restart Consul Clients: Restart all client agents.
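The certificate generation step can be scripted. The sketch below only builds the consul tls cert create command line for a server or client certificate; it assumes the certificates are issued with Consul's built-in CA tooling and that the CA certificate and key are in the working directory. If you use an external CA, this step will differ. The helper name cert_create_command, the datacenter name "dc1", and the 365-day validity are illustrative choices.

```python
from typing import List


def cert_create_command(role: str, datacenter: str, days: int = 365) -> List[str]:
    """Build the argument list for `consul tls cert create`.

    role       -- "server" or "client" (maps to the -server / -client flag)
    datacenter -- datacenter name passed to -dc
    days       -- certificate validity passed to -days (illustrative default)
    """
    if role not in ("server", "client"):
        raise ValueError("role must be 'server' or 'client'")
    if days < 1:
        raise ValueError("days must be at least 1")
    return [
        "consul", "tls", "cert", "create",
        f"-{role}",
        "-dc", datacenter,
        "-days", str(days),
    ]
```

The returned list can be passed to subprocess.run(..., check=True) on a host where the consul binary and the CA files are available. Building the command separately from running it makes it easy to review or log the exact invocation before certificates are issued.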
When using the auto-encryption method, Consul automatically rotates the client certificates without operator intervention.
Recommendation
Implement certificate expiration monitoring and alerting to avoid reactive rotations. Proactive rotation is always preferred to minimize downtime and potential disruptions.
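As a starting point for expiration monitoring, the following sketch classifies a certificate by its expiry date. It assumes the date is supplied in the format printed by openssl x509 -enddate -noout (for example "notAfter=Jan  5 09:34:43 2018 GMT"). The 30-day warning window and the names days_until_expiry and rotation_status are assumptions for illustration; adjust them to your own alerting policy.

```python
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return days remaining before expiry (negative if already expired).

    not_after -- expiry date such as "Jan  5 09:34:43 2018 GMT"; an optional
                 "notAfter=" prefix from openssl output is stripped.
    now       -- current time as seconds since the epoch (defaults to time.time()).
    """
    cert_time = not_after.split("=", 1)[-1].strip()
    expires_at = ssl.cert_time_to_seconds(cert_time)
    current = time.time() if now is None else now
    return (expires_at - current) / SECONDS_PER_DAY


def rotation_status(not_after: str, warn_days: int = 30,
                    now: Optional[float] = None) -> str:
    """Map a certificate expiry date to "expired", "rotate-soon", or "ok"."""
    remaining = days_until_expiry(not_after, now)
    if remaining <= 0:
        return "expired"       # reactive rotation is required
    if remaining <= warn_days:
        return "rotate-soon"   # schedule a proactive rotation
    return "ok"
```

A scheduled job could run this check against each agent's certificate and raise an alert on "rotate-soon", leaving enough time to complete a proactive rotation before the certificate expires.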