Envoy's concurrency setting (`--concurrency`) determines the number of worker threads used to handle incoming connections. Because how well Envoy scales depends on its threading model, proper tuning is necessary to optimize resource usage while avoiding unnecessary memory consumption or suboptimal request distribution.
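For a quick sanity check, you can start Envoy with an explicit worker count and confirm it from the admin interface. This is a minimal sketch: the admin port (`19000`) and the use of per-worker stat prefixes for counting are assumptions based on common Consul Connect defaults, so adjust them to your deployment.

```bash
# Start Envoy with an explicit worker count (2 workers here).
envoy -c bootstrap.yaml --concurrency 2

# In another shell: count active workers via the admin endpoint.
# Each worker exposes per-worker stats (e.g., watchdog counters),
# so counting unique "server.worker_N" prefixes gives the worker count.
curl -s localhost:19000/stats | grep -o 'server\.worker_[0-9]*' | sort -u | wc -l
```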
## Sidecar Proxy Configuration

*(For workload-mesh use cases where Envoy is deployed as a sidecar proxy next to application services)*
### Initial Default Value

- The default value is `1` (`--concurrency 1`), which is generally a good starting point for lightweight services with relatively low request volumes.
- Why? Because:
  - Sidecar proxies often sit next to applications that may be CPU/memory constrained.
  - Higher concurrency does not necessarily improve performance unless the application itself can utilize more CPU.
### Scaling Considerations

**Primary Factors:**

- **Application Threading Model:** If the application is single-threaded (e.g., Python Flask), there is no benefit in running multiple Envoy threads. However, for multi-threaded services (e.g., Go, Java), additional concurrency may be beneficial.
- **Traffic Volume:** If the service handles a large volume of inbound/outbound connections, increasing concurrency can help distribute the load.
- **CPU Allocation:** Envoy's threading model assumes roughly one worker per available CPU thread. If a sidecar has only 250 MHz of CPU allocated (as in the Nomad example), increasing concurrency may not yield performance benefits.
### Recommended Tuning Strategy

- **Low traffic applications:** Stick with the default (`--concurrency 1`).
- **Moderate traffic (e.g., 100-500 RPS per instance):** Scale concurrency to `2-4`, but only if the CPU allocation is at least 500-1000 MHz.
- **High traffic (e.g., 500+ RPS per instance):** Consider increasing to `4-8`, ensuring sufficient CPU (>1 core). A shell sketch of this CPU-to-concurrency mapping follows.
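To make these tiers concrete, here is a small, hypothetical shell helper that derives a sidecar concurrency value from the Nomad CPU allocation, assuming the ~1000 MHz-per-vCPU convention used throughout this section and capping at 4 workers per the tiers above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a Nomad CPU allocation (MHz) to a sidecar
# --concurrency value. Assumes ~1000 MHz per vCPU and caps at 4 workers,
# matching the tuning tiers above.
suggest_sidecar_concurrency() {
  local cpu_mhz=$1
  local workers=$(( (cpu_mhz + 999) / 1000 ))  # ceiling division; never 0
  (( workers > 4 )) && workers=4               # sidecars rarely need more
  echo "$workers"
}

suggest_sidecar_concurrency 250   # -> 1 (the 250 MHz sidecar example)
suggest_sidecar_concurrency 750   # -> 1
suggest_sidecar_concurrency 2000  # -> 2
```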
### Additional Optimization Notes

- **CPU Pinning:** If running Nomad with cgroups-based resource isolation, ensure that Envoy is allocated enough CPU cores to justify increasing concurrency.
- **Memory Usage Consideration:** More concurrency means more memory usage per worker for connection pools. Be mindful if sidecar memory is limited (128 MB in your Nomad config); see the memory check below.
- **Connection Pool Efficiency:** Too many workers can result in fragmented connection pools, leading to reduced reuse of HTTP/2 or TCP connections.
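Since the memory cost per worker depends on your traffic pattern, a practical approach is to snapshot Envoy's heap before and after changing concurrency. This assumes the admin interface is reachable on port 19000 (a common Consul Connect default); adjust to your setup.

```bash
# Snapshot Envoy's heap via the admin /memory endpoint (port assumed).
# Run once before and once after raising --concurrency, then compare
# the "allocated" and "heap_size" figures to see the per-worker cost.
curl -s localhost:19000/memory
```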
### TL;DR for Sidecars

- Default (`--concurrency 1`) is fine for most workloads.
- Scale based on CPU/memory allocation and the service's threading model.
- If your sidecar is handling 500+ RPS and has sufficient CPU, increase to `2-4` or more.
## Edge Proxy Configuration (Ingress/Egress)

*(For Envoy acting as a gateway, e.g., an NGINX ingress replacement or a terminating gateway egress proxy)*
### Initial Default Value

- Unlike sidecars, edge proxies require much higher concurrency due to their role in handling many independent connections across multiple backend services.
- **Good initial default:** Set `--concurrency` equal to the number of CPU cores allocated (`NOMAD_CPU_LIMIT / 1000`).
### Explanation

- `NOMAD_CPU_LIMIT` gives the allocated CPU in MHz.
- Since Envoy's concurrency is typically set to the number of CPU cores, we divide by `1000` (treating roughly 1000 MHz as one vCPU) to get the number of vCPUs.
Example Usage in Nomad Task Configuration
If you want to dynamically set concurrency based on available CPU, modify your args
field in Nomad. You'll need to update the Default Envoy Configuration within the sidecar_task block of each application:
sidecar_task { args = [ "-c", "${NOMAD_SECRETS_DIR}/envoy_bootstrap.json", "-l", "${meta.connect.log_level}", "--concurrency", "$(($((${NOMAD_CPU_LIMIT} + 999)) / 1000))", "--disable-hot-restart" ] }
To ensure a minimum concurrency of 1, use:
--concurrency=$(($((${NOMAD_CPU_LIMIT} + 999)) / 1000))
This ensures that even with NOMAD_CPU_LIMIT=500
, concurrency will not be 0
, but at least 1
.
### Explanations

- With plain integer division, if Nomad allocates 500 MHz (`NOMAD_CPU_LIMIT=500`), then `${NOMAD_CPU_LIMIT} / 1000` would yield a `--concurrency` of `0` (not ideal).
- The `+ 999` rounds the result up, so 500 MHz yields `1`, while 2000 MHz (`NOMAD_CPU_LIMIT=2000`) yields `2`, as the quick check below confirms.
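One caveat, offered as an assumption to verify: Nomad interpolates `${NOMAD_CPU_LIMIT}` itself, but task `args` are not generally evaluated by a shell, so depending on your task driver the `$((...))` arithmetic may need to live in a wrapper script instead. The arithmetic itself is easy to sanity-check locally:

```bash
# Verify the ceiling division for a few representative allocations.
for NOMAD_CPU_LIMIT in 250 500 1000 2000 2500; do
  echo "${NOMAD_CPU_LIMIT} MHz -> concurrency $(( (NOMAD_CPU_LIMIT + 999) / 1000 ))"
done
# Prints:
# 250 MHz -> concurrency 1
# 500 MHz -> concurrency 1
# 1000 MHz -> concurrency 1
# 2000 MHz -> concurrency 2
# 2500 MHz -> concurrency 3
```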
### Scaling Considerations

**Primary Factors:**

- **CPU Allocation:** Since each worker thread runs its own event loop, a good rule of thumb is 1 worker per vCPU core. Setting concurrency higher than the number of CPU cores often wastes memory.
- **Throughput Needs:** If the proxy needs to handle thousands of RPS, concurrency should match the expected workload.
- **Connection Characteristics:** If many long-lived connections exist (e.g., gRPC or WebSockets), you may want slightly higher concurrency.
### Recommended Tuning Strategy

- **1 vCPU** → Set concurrency to `1`.
- **2 vCPUs** → Set concurrency to `2`.
- **4 vCPUs** → Set concurrency to `4`.
- **8+ vCPUs** → Consider keeping concurrency at `8`, unless traffic volume justifies more (see the launcher sketch below).
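Outside Nomad, the same mapping can be applied when launching an edge Envoy directly on a host. This is a minimal sketch under those assumptions; the config path is illustrative:

```bash
#!/usr/bin/env bash
# Match edge-proxy workers to the host's cores, capped at 8 per the
# tiers above. /etc/envoy/edge.yaml is a placeholder path.
CORES=$(nproc)
WORKERS=$(( CORES > 8 ? 8 : CORES ))
exec envoy -c /etc/envoy/edge.yaml --concurrency "$WORKERS"
```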
### Additional Optimization Notes

- **Autoscaling:** If deploying in Nomad, consider auto-scaling based on request latency, CPU load, and the number of active connections.
- **Performance Bottlenecks:** If CPU usage is consistently high, increasing concurrency will not help. Instead, optimize filters and connection pooling, and avoid blocking filters.
- **TLS Handshakes:** Edge proxies often terminate TLS, which is CPU-intensive. Ensure CPU is appropriately allocated if handling high numbers of HTTPS requests; the stats check below can help quantify handshake load.
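To estimate how much TLS handshake work the proxy is absorbing, you can sample Envoy's per-listener `ssl.handshake` counters over an interval (admin port 19000 assumed; the stat appears under each TLS listener's prefix):

```bash
# Sample TLS handshake counters twice, 10 seconds apart; the delta
# approximates the handshake rate hitting the edge proxy.
curl -s localhost:19000/stats | grep 'ssl\.handshake'
sleep 10
curl -s localhost:19000/stats | grep 'ssl\.handshake'
```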
### TL;DR for Edge Proxies

- Set concurrency to match CPU cores (e.g., 4 vCPUs → `--concurrency 4`).
- Setting concurrency too high wastes memory and hurts connection pool efficiency.
- Use Nomad autoscaling for dynamic concurrency tuning.
## Summary of Recommendations

| Proxy Type | Default (`--concurrency`) | When to Scale Up? |
|---|---|---|
| Sidecar Proxy | `1` | If handling 500+ RPS **and** CPU allocation is at least 500 MHz; scale to `2-4` based on load. |
| Edge Proxy | Match CPU cores | If handling high sustained traffic; optimize based on CPU/memory/latency. |
## References

- Medium: Envoy Proxy Threading Model
- Envoy Docs: Envoy Listener Threading Model