Introduction
This article walks you through the general troubleshooting steps for cross-partition networking with Consul running on Kubernetes.
Expected results
Services in different Consul Admin Partitions within the same datacenter can discover and communicate with one another.
Troubleshooting
The example below uses the following:
- 2 Kubernetes clusters, foo and bar, each with a Mesh Gateway installed
- Upstream admin partition: bar
- Upstream app: backend
- Downstream admin partition: foo
- Downstream app: frontend
- ACL-enabled cluster
Verify the Exported Services configuration entry
The exported-services configuration entry enables Consul to export service instances from one Admin Partition to others.
- In the upstream partition, verify that both the upstream and Mesh Gateway services are exported to the downstream partition.
consul config read -kind exported-services -name bar -partition bar
{
  "Kind": "exported-services",
  "Name": "bar",
  "Partition": "bar",
  "Services": [
    {
      "Name": "mesh-gateway",
      "Namespace": "default",
      "Consumers": [
        {
          "Partition": "foo"
        }
      ]
    },
    {
      "Name": "backend",
      "Namespace": "default",
      "Consumers": [
        {
          "Partition": "foo"
        }
      ]
    }
  ],
  "Meta": {
    "consul.hashicorp.com/source-datacenter": "dc1",
    "external-source": "kubernetes"
  },
  "CreateIndex": 142,
  "ModifyIndex": 142
}
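If this entry is missing or incomplete, it is typically managed on the upstream cluster through an ExportedServices custom resource. The following is a minimal sketch that would produce the entry above, assuming it is applied to the upstream (bar) cluster; adjust names and namespaces to your environment.

apiVersion: consul.hashicorp.com/v1alpha1
kind: ExportedServices
metadata:
  name: bar                  # must match the partition the exported services live in
spec:
  services:
    - name: mesh-gateway     # export the Mesh Gateway so the downstream can dial through it
      namespace: default
      consumers:
        - partition: foo     # downstream partition allowed to consume
    - name: backend          # the upstream application
      namespace: default
      consumers:
        - partition: foo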
- Verify that the Envoy clusters of the Mesh Gateway pod in the downstream partition contain an entry for the upstream Mesh Gateway.
<ip>:19000/clusters
bar.dc1.internal-v1.bb6caad4-f146-7aaa-7668-199c94206c87.consul::<load_balancer_ip>::cx_active::0
bar.dc1.internal-v1.bb6caad4-f146-7aaa-7668-199c94206c87.consul::<load_balancer_ip>::cx_connect_fail::0
bar.dc1.internal-v1.bb6caad4-f146-7aaa-7668-199c94206c87.consul::<load_balancer_ip>::cx_total::19
Without this configuration, the downstream Mesh Gateway clusters will not have an entry for the upstream Mesh Gateway.
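The :19000/clusters output shown above comes from the Envoy admin interface of the downstream Mesh Gateway pod. One way to query it is to port-forward the admin port and filter for the upstream partition; this is a sketch, and the deployment name and namespace below are assumptions based on a default Helm installation.

# forward the Envoy admin port of the downstream mesh gateway
kubectl port-forward deploy/consul-mesh-gateway -n consul 19000:19000
# look for clusters pointing at the upstream (bar) partition
curl -s localhost:19000/clusters | grep bar.dc1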
Verify the Service Intentions configuration
- Verify that there is an intention allowing the downstream service to talk to the upstream service.
consul config read -kind service-intentions -name backend -partition bar
{
  "Kind": "service-intentions",
  "Name": "backend",
  "Partition": "bar",
  "Namespace": "default",
  "Sources": [
    {
      "Name": "frontend",
      "Partition": "foo",
      "Namespace": "default",
      "Action": "allow",
      "Precedence": 9,
      "Type": "consul"
    }
  ],
  "Meta": {
    "consul.hashicorp.com/source-datacenter": "dc1",
    "external-source": "kubernetes"
  },
  "CreateIndex": 144,
  "ModifyIndex": 144
}
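If the intention is missing, it can be defined on the upstream cluster as a ServiceIntentions custom resource. The following is a minimal sketch matching the entry above; the metadata name is arbitrary, and namespaces should be adjusted as needed.

apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: backend
spec:
  destination:
    name: backend            # the upstream service being protected
  sources:
    - name: frontend         # the downstream service
      namespace: default
      partition: foo         # downstream partition
      action: allow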
- Verify that the Envoy clusters of the downstream pod contain an entry for the upstream service.
<ip>:19000/clusters
backend.default.bar.dc1.internal-v1.bb6caad4-f146-7aaa-7668-199c94206c87.consul::10.42.0.16:20000::cx_active::0
Without this configuration, the downstream Envoy clusters will have no entry for the upstream service.
Verify the Proxy Defaults configuration
- Verify that the proxy-defaults configuration of the downstream pod includes the MeshGateway parameter with the mode set to local.
consul config read -kind proxy-defaults -name global -partition foo
{
  "Kind": "proxy-defaults",
  "Name": "global",
  "Partition": "foo",
  "Namespace": "default",
  "TransparentProxy": {},
  "MeshGateway": {
    "Mode": "local"
  },
  "Expose": {},
  "AccessLogs": {},
  "Meta": {
    "consul.hashicorp.com/source-datacenter": "dc1",
    "external-source": "kubernetes"
  },
  "CreateIndex": 271,
  "ModifyIndex": 271
}
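If the mode is not set, it can be configured on the downstream cluster through a ProxyDefaults custom resource. The following is a minimal sketch; the resource name must be global, and other proxy settings are omitted.

apiVersion: consul.hashicorp.com/v1alpha1
kind: ProxyDefaults
metadata:
  name: global
spec:
  meshGateway:
    mode: local              # route upstream traffic through the local mesh gateway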
- Verify that the Envoy cluster for the upstream service is updated with the address of the local consul-mesh-gateway pod.
<ip>:19000/clusters
c225dc1c~backend.default.bar.dc1.internal-v1.bb6caad4-f146-7aaa-7668-199c94206c87.consul::10.42.0.17:8443::hostname::
Without this configuration, the endpoint for the upstream service will contain the local Pod IP of the upstream app and will be unreachable from the downstream, as they are two separate Kubernetes clusters.
(Screenshots: Envoy clusters output before mode=local and after mode=local.)
Verify the DNS entry
- Verify that the correct DNS name is used in the downstream app so that the Virtual IP Service lookup can occur.
- The correct format is <service>.virtual[.<namespace>.ns][.<partition>.ap][.<datacenter>.dc].<domain>, e.g. backend.virtual.default.ns.bar.ap.consul
Without correct DNS resolution, the upstream service is unreachable because Consul Transparent Proxy has no hint about where to route the request; Transparent Proxy makes its routing decision based on the service's virtual tagged address. With Transparent Proxy disabled, you must instead configure an explicit upstream annotation that follows the same addressing format and dial the service at localhost:<port>, as sketched below.
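As an illustration, with Transparent Proxy disabled the explicit upstream can be declared as an annotation on the downstream pod. This is a sketch only: the port 1234 is a hypothetical local listener port, and the labelled annotation format can vary between consul-k8s versions, so check the annotations reference for your release.

annotations:
  consul.hashicorp.com/connect-inject: "true"
  # partition-qualified upstream, exposed locally on port 1234
  consul.hashicorp.com/connect-service-upstreams: "backend.svc.default.ns.bar.ap:1234"

The downstream app would then dial the upstream at localhost:1234.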
- Verify that the Virtual IP address is returned by running an exec into the downstream pod:
kubectl exec -it frontend-86594c7c7f-qctpc -- sh
/ # nslookup backend.virtual.default.ns.bar.ap.consul
Server: 127.0.0.1
Address: 127.0.0.1:53
Name: backend.virtual.default.ns.bar.ap.consul
Address: 240.0.0.1
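As a final end-to-end check, you can attempt a request from the downstream pod to the upstream over the virtual address. This is a sketch that assumes backend is an HTTP service listening on port 80; adjust the scheme and port to your application.

kubectl exec -it frontend-86594c7c7f-qctpc -- curl -sv http://backend.virtual.default.ns.bar.ap.consul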
Next steps
If the issue persists after following the above setup and troubleshooting, please contact HashiCorp support and provide us with all of the troubleshooting data you have collected, as well as the following additional information:
- Consul Server version
- Consul Helm Chart version
- Consul Helm values file