Introduction:
Organizations often run workloads in multiple regions. When a Vault cluster operates in a single region, workloads in remote regions can see increased latency, which causes delays and performance issues.
Vault Enterprise Performance Replication replicates data to Vault clusters in other regions, allowing workloads in those regions to authenticate and consume Vault with lower latency than accessing a Vault cluster over long distances.
Moreover, a performance secondary cluster can serve 'read' requests from its local storage. For 'writes', however, it depends on its performance primary cluster, so it is important to keep latency low between the replicated clusters as well.
Vault performance replication starts with a primary cluster, and replicates to one or more secondary clusters.
In this guide, we configure Vault performance replication between two clusters. Each cluster uses the Consul cluster running alongside it for service discovery, which provides DNS resolution for communication between the clusters.
Reproduction Steps:
Prerequisites
Setup for VMs on AWS EC2:-
- Create two EC2 instances in different VPCs.
- Create a VPC peering connection between the VPCs.
- Ensure the proper route table rules and security group (SG) rules are in place.
- Verify that each node can connect to the node in the other VPC.
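The last prerequisite can be checked with a small script. The following is a minimal sketch, assuming netcat (nc) is installed and that ports 8200 and 8201 (the Vault API and cluster ports used in the listener configuration in this guide) are the ones that must be reachable. The helper names check_port and check_peer are hypothetical.

```shell
#!/bin/sh
# check_port HOST PORT: succeed if a TCP connection opens within 3 seconds.
# Hypothetical helper; assumes netcat (nc) is available on the node.
check_port() {
  nc -z -w 3 "$1" "$2"
}

# check_peer HOST: test the Vault API (8200) and cluster (8201) ports on the peer.
# Returns non-zero if any port could not be reached.
check_peer() {
  rc=0
  for port in 8200 8201; do
    if check_port "$1" "$port"; then
      echo "$1:$port reachable"
    else
      echo "$1:$port NOT reachable"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, run check_peer 10.0.4.157 on the first node and check_peer 172.31.27.167 on the second, using the node addresses from this guide. The ports only answer once Vault is running on the peer, so repeat the check after both servers have started.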
Steps to follow on the first VPC node:-
On the first node (let's call it the primary), run the Consul agent.
consul agent -server -ui -node=server-1 -bootstrap-expect=1 -bind=172.31.27.167 -data-dir=/consul/data/ &
Create a payload.json file to register the Vault service with Consul.
{
  "ID": "vault-pri",
  "Name": "vault-pri",
  "Address": "172.31.27.167",
  "Port": 8200,
  "Weights": {
    "Passing": 10,
    "Warning": 1
  }
}
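Because the same payload is needed on both nodes with a different name and address, it can be generated from variables instead of edited by hand. This is a minimal sketch; write_vault_payload is a hypothetical helper, and the field values mirror the payload above.

```shell
#!/bin/sh
# write_vault_payload NAME ADDRESS [PORT] [FILE]
# Writes a Consul service definition like the one above.
# PORT defaults to 8200 and FILE defaults to payload.json.
write_vault_payload() {
  name=$1
  address=$2
  port=${3:-8200}
  file=${4:-payload.json}
  cat > "$file" <<EOF
{
  "ID": "$name",
  "Name": "$name",
  "Address": "$address",
  "Port": $port,
  "Weights": {
    "Passing": 10,
    "Warning": 1
  }
}
EOF
}
```

For example, write_vault_payload vault-pri 172.31.27.167 produces the file used on the first node.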
Run the following command, passing payload.json, to register the service in the Consul catalog. The URL is quoted so that the shell does not interpret the '?' character.
curl --request PUT --data @payload.json "http://127.0.0.1:8500/v1/agent/service/register?replace-existing-checks=true"
root@cluster-pri:/home/ubuntu/learn-vault-lab# consul catalog services
consul
vault-pri
Follow this guide to set up DNS forwarding using 'systemd-resolved' to redirect any request with the '.consul' domain to the consul DNS server.
Once configured, verify DNS resolution for the 'vault-pri' service.
root@cluster-pri:/home/ubuntu/learn-vault-lab# dig @127.0.0.1 -p 8600 vault-pri.service.consul
; <<>> DiG 9.18.28-0ubuntu0.24.04.1-Ubuntu <<>> @127.0.0.1 -p 8600 vault-pri.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9778
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;vault-pri.service.consul. IN A
;; ANSWER SECTION:
vault-pri.service.consul. 0 IN A 172.31.27.167
;; Query time: 0 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1) (UDP)
;; WHEN: Thu Sep 12 12:17:22 UTC 2024
;; MSG SIZE rcvd: 69
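The same check can be scripted so that it fails loudly when the answer is not the expected address. The sketch below assumes dig is installed; check_consul_dns and check_forwarded_dns are hypothetical helper names. The first queries the Consul DNS port directly, as in the output above, and the second goes through the system resolver, which exercises the 'systemd-resolved' forwarding.

```shell
#!/bin/sh
# check_consul_dns NAME EXPECTED_IP: query the local Consul DNS server (port 8600).
check_consul_dns() {
  answer=$(dig +short @127.0.0.1 -p 8600 "$1")
  if [ "$answer" = "$2" ]; then
    echo "OK: $1 -> $answer"
  else
    echo "MISMATCH: $1 -> '$answer' (expected $2)"
    return 1
  fi
}

# check_forwarded_dns NAME EXPECTED_IP: query through the system resolver,
# which only works if '.consul' forwarding is configured.
check_forwarded_dns() {
  answer=$(dig +short "$1")
  if [ "$answer" = "$2" ]; then
    echo "OK (forwarded): $1 -> $answer"
  else
    echo "MISMATCH (forwarded): $1 -> '$answer' (expected $2)"
    return 1
  fi
}
```

On the first node, for example: check_consul_dns vault-pri.service.consul 172.31.27.167 followed by check_forwarded_dns vault-pri.service.consul 172.31.27.167.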
Create the following server configuration file, and name it 'vault-server.hcl':
ui = true
listener "tcp" {
  tls_disable     = 1
  address         = "[::]:8200"
  cluster_address = "[::]:8201"
}
storage "raft" {
  path = "/vault/file"
}
Run the Vault server as a background process.
vault server -config=cluster-pri/config/vault-server.hcl &
Note:- Set all required environment variables (such as VAULT_ADDR, VAULT_CLUSTER_ADDR, VAULT_API_ADDR, VAULT_RAFT_NODE_ID, VAULT_LICENSE, and VAULT_TOKEN) as suggested in the link.
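As an illustration only, the server-related variables could be exported as in the sketch below. The values are assumptions and are not taken from the linked guide: they advertise the Consul DNS name of the service, which fits the purpose of this setup. set_vault_env is a hypothetical helper name. Vault ignores the scheme of the cluster address, so "https" there does not conflict with tls_disable.

```shell
#!/bin/sh
# set_vault_env SERVICE NODE_ID: export the server-related variables for one node.
# Hypothetical helper; all values are illustrative assumptions.
set_vault_env() {
  service=$1   # Consul service name, for example vault-pri or vault-sec
  node_id=$2   # any unique Raft node ID
  # Address the local CLI talks to.
  export VAULT_ADDR="http://127.0.0.1:8200"
  # API address advertised to clients and other nodes.
  export VAULT_API_ADDR="http://${service}.service.consul:8200"
  # Cluster address advertised to other nodes.
  export VAULT_CLUSTER_ADDR="https://${service}.service.consul:8201"
  export VAULT_RAFT_NODE_ID="$node_id"
}
# VAULT_LICENSE and VAULT_TOKEN are secrets. Export them separately from your
# own license string and token instead of storing them in a script.
```

For example, set_vault_env vault-pri node-1 on the first node. The server reads these variables at startup, so they need to be exported in the shell that runs the 'vault server' command before it starts.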
Follow the steps in the link for setting up the Vault primary cluster to initialize and unseal the Vault primary node.
Then, follow the instructions in the link to enable performance replication on the primary node and fetch the bootstrap token for the secondary.
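For orientation, these two steps map to the following replication API paths. This is a sketch rather than the exact commands from the linked guide: enable_primary and secondary_token are hypothetical helper names, passing primary_cluster_addr is optional, and using the Consul DNS name assumes the secondary node can resolve it. VAULT_ADDR and VAULT_TOKEN must point at the primary.

```shell
#!/bin/sh
# enable_primary CLUSTER_ADDR: enable performance replication on the primary
# and advertise CLUSTER_ADDR to secondaries via primary_cluster_addr.
enable_primary() {
  vault write sys/replication/performance/primary/enable \
    primary_cluster_addr="$1"
}

# secondary_token ID: print the wrapped bootstrap token for the secondary
# identified by ID (an arbitrary label chosen by the operator).
secondary_token() {
  vault write -field=wrapping_token \
    sys/replication/performance/primary/secondary-token id="$1"
}
```

For example, run enable_primary https://vault-pri.service.consul:8201 and then capture the token with SEC_BOOTSTRAP=$(secondary_token vault-sec), so it can be passed to the secondary in the step described later.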
Steps to follow on the second VPC node:-
Repeat the steps performed on the first VPC node, but register the Vault service under a different name, for example 'vault-sec'.
Once service registration and DNS configuration are done, verify that DNS resolution works, as shown below.
root@cluster-sec:/home/ubuntu/learn-vault-lab# dig @127.0.0.1 -p 8600 vault-sec.service.consul
; <<>> DiG 9.18.28-0ubuntu0.24.04.1-Ubuntu <<>> @127.0.0.1 -p 8600 vault-sec.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43257
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;vault-sec.service.consul. IN A
;; ANSWER SECTION:
vault-sec.service.consul. 0 IN A 10.0.4.157
;; Query time: 1 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1) (UDP)
;; WHEN: Fri Sep 13 11:29:26 UTC 2024
;; MSG SIZE rcvd: 69
Create the following server configuration file, and name it 'vault-server.hcl':
ui = true
listener "tcp" {
  tls_disable     = 1
  address         = "[::]:8200"
  cluster_address = "[::]:8201"
}
storage "raft" {
  path = "/vault/file"
}
Run the Vault server as a background process.
vault server -config=cluster-sec/config/vault-server.hcl &
Note:- Set all required environment variables (such as VAULT_ADDR, VAULT_CLUSTER_ADDR, VAULT_API_ADDR, VAULT_RAFT_NODE_ID, VAULT_LICENSE, and VAULT_TOKEN) as suggested in the link.
Follow the steps in the link for setting up the Vault secondary cluster to initialize and unseal the Vault secondary node.
Then, follow the instructions in the link to enable performance replication on the secondary node, setting the 'SEC_BOOTSTRAP' environment variable to the bootstrap token fetched from the primary cluster.
Once done, you will see the output below.
root@cluster-sec:/home/ubuntu/learn-vault-lab# vault write -address=$SEC_ADDR sys/replication/performance/secondary/enable token=$SEC_BOOTSTRAP
WARNING! The following warnings were returned from Vault:
* Vault has successfully found secondary information; it may take a while to
perform setup tasks. Vault will be unavailable until these tasks and initial
sync complete.
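Because the secondary is unavailable until the initial sync finishes, it can help to poll the replication status instead of retrying by hand. The following is a minimal sketch using the sys/replication/performance/status endpoint; replication_field and wait_for_secondary are hypothetical helper names, and the two-second interval and default of 30 attempts are arbitrary choices. VAULT_ADDR must point at the secondary.

```shell
#!/bin/sh
# replication_field FIELD: print one field of the performance replication status.
replication_field() {
  vault read -field="$1" sys/replication/performance/status
}

# wait_for_secondary [TRIES]: poll until the node reports mode "secondary".
# Prints the mode and state on success; returns non-zero after TRIES attempts.
wait_for_secondary() {
  tries=${1:-30}
  while [ "$tries" -gt 0 ]; do
    # The read can fail while Vault is still unavailable; treat that as "not yet".
    mode=$(replication_field mode 2>/dev/null) || mode=""
    if [ "$mode" = "secondary" ]; then
      echo "mode=secondary state=$(replication_field state)"
      return 0
    fi
    tries=$((tries - 1))
    sleep 2
  done
  echo "node did not report secondary mode" >&2
  return 1
}
```

For example, run wait_for_secondary on the second node after the enable command returns.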
References:
Configuration of the primary cluster