Introduction
Nomad Autoscaler is a horizontal application and cluster autoscaler for Nomad. The enterprise version of Nomad Autoscaler additionally supports Dynamic Application Sizing (DAS). While Nomad Autoscaler and Nomad work hand in hand, the Nomad Autoscaler is built and released separately from Nomad. Nomad Autoscaler supports several APMs for horizontal scaling; however, Prometheus is currently the only APM that can be used for Dynamic Application Sizing. Many companies, especially those using AWS, use Datadog as their default APM, so we will focus on Datadog and Prometheus. This article covers how to configure Nomad Autoscaler to utilize two APMs: Datadog for horizontal scaling and Prometheus for DAS.
Prerequisites
- A running Nomad and Consul cluster (Nomad Enterprise with the Multi-Cluster & Efficiency module is required for DAS)
- Datadog agent installed on each Nomad node
Overview
At a high level, the setup process includes:
- Adding telemetry configuration for Nomad
- Deploying Prometheus
- Deploying Nomad Autoscaler (Nomad Autoscaler Enterprise is needed for DAS)
- Deploying load test apps for validation testing
This article utilizes Nomad to deploy both Prometheus and the Nomad Autoscaler as apps within the cluster. While this is not a hard requirement, deploying the Nomad Autoscaler as a job within the cluster is the recommended approach.
Procedures
- Add telemetry configuration for Nomad
- On each of the Nomad nodes to be monitored, add the following telemetry stanza to your nomad.hcl file:

telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  datadog_address            = "localhost:8125"
  collection_interval        = "10s"
  prometheus_metrics         = true
}

- Restart Nomad for the change to take effect.
- Deploy Prometheus
- Deploy the Prometheus job below
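Before deploying Prometheus, you can confirm the telemetry change took effect by querying Nomad's metrics endpoint in Prometheus format (assuming Nomad's default API port 4646 on the node):

```shell
# Query the Nomad metrics endpoint in Prometheus exposition format.
curl -s "http://localhost:4646/v1/metrics?format=prometheus" | head

# On the Datadog side, the agent listens for DogStatsD on localhost:8125,
# so Nomad metrics should also begin appearing in your Datadog account.
```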
--- prometheus.nomad ---
job "prometheus" {
  datacenters = ["dc1"]

  group "prometheus" {
    count = 1

    network {
      port "prometheus_ui" {
        static = 9090
        to     = 9090
      }
    }

    task "prometheus" {
      driver = "docker"

      config {
        image = "prom/prometheus:v2.25.0"
        ports = ["prometheus_ui"]
        args = [
          "--config.file=/etc/prometheus/config/prometheus.yml",
          "--storage.tsdb.path=/prometheus",
          "--web.console.libraries=/usr/share/prometheus/console_libraries",
          "--web.console.templates=/usr/share/prometheus/consoles",
        ]
        volumes = [
          "local/config:/etc/prometheus/config",
        ]
      }

      template {
        data = <<EOH
---
global:
  scrape_interval:     1s
  evaluation_interval: 1s

scrape_configs:
  - job_name: nomad
    metrics_path: /v1/metrics
    params:
      format: ['prometheus']
    static_configs:
      - targets: ['{{ env "attr.unique.network.ip-address" }}:4646']

  - job_name: consul
    metrics_path: /v1/agent/metrics
    params:
      format: ['prometheus']
    static_configs:
      - targets: ['{{ env "attr.unique.network.ip-address" }}:8500']
EOH

        change_mode   = "signal"
        change_signal = "SIGHUP"
        destination   = "local/config/prometheus.yml"
      }

      resources {
        cpu    = 500
        memory = 512
      }

      service {
        name = "prometheus"
        port = "prometheus_ui"

        check {
          type     = "http"
          path     = "/-/healthy"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

- Deploy Nomad Autoscaler
- Update the Autoscaler job below with your dd_api_key and dd_app_key (and, if needed, update the Datadog site as well)
- Deploy the Autoscaler job
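Both deployments are standard job submissions; for example (assuming the job files in this article are saved locally under the names shown in the `---` markers):

```shell
# Submit the Prometheus job, then the autoscaler job.
nomad job run prometheus.nomad
nomad job run autoscaler.nomad

# Verify the autoscaler allocation is running and its health check passes.
nomad job status autoscaler
```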
--- autoscaler.nomad ---
job "autoscaler" {
  datacenters = ["dc1"]

  group "autoscaler" {
    count = 1

    network {
      port "http" {}
    }

    task "autoscaler" {
      driver = "docker"

      config {
        image   = "hashicorp/nomad-autoscaler-enterprise:0.3.6"
        command = "nomad-autoscaler"
        ports   = ["http"]
        args = [
          "agent",
          "-config",
          "local/autoscaler.hcl",
          "-http-bind-address",
          "0.0.0.0",
          "-http-bind-port",
          "${NOMAD_PORT_http}",
        ]
      }

      template {
        destination = "local/autoscaler.hcl"
        data        = <<EOH
nomad {
  // Use Consul service discovery for the Nomad client IP and port.
  address = "{{ with service "nomad-client" }}{{ with index . 0 }}http://{{.Address}}:{{.Port}}{{ end }}{{ end }}"

  // Use the splat operator so the autoscaler monitors scaling policies from
  // all Nomad namespaces. If you wish to have it only monitor a single
  // namespace, update this param to match the desired name.
  namespace = "*"

  // If Nomad ACLs are in use, the following line should be uncommented and
  // updated to include an ACL token.
  // token = ""
}

// Set up the Prometheus APM so that the autoscaler can pull historical and
// point-in-time metrics regarding task resource usage.
apm "prometheus" {
  driver = "prometheus"
  config = {
    // Use Consul service discovery for the Prometheus IP and port.
    address = "{{ with service "prometheus" }}{{ with index . 0 }}http://{{.Address}}:{{.Port}}{{ end }}{{ end }}"

    // If you wish to use this example without Consul, you can remove the above
    // Consul service discovery address section and replace it with the line
    // below uncommented, with an updated endpoint if required.
    // address = "http://localhost:9090"
  }
}

// This block contains configuration options specific to the Dynamic Application
// Sizing enterprise feature.
dynamic_application_sizing {
  // Lower the evaluation window so the autoscaler produces recommendations
  // after only 5 minutes, rather than the 24h default.
  evaluate_after = "5m"
}

apm "datadog" {
  driver = "datadog"
  config = {
    dd_api_key = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    dd_app_key = "YYYYYYYYYYYYYYYYYYYYYYYYYYYYY"
    site       = "datadoghq.com"
  }
}

policy_eval {
  // Disable the cluster scaling workers. This helps reduce log noise
  // during the demo.
  workers = {
    cluster = 0
    // horizontal = 0
  }
}

policy {
  default_evaluation_interval = "10s"
  default_cooldown            = "1m"
}

strategy "fixed-value" {
  driver = "fixed-value"
}

strategy "pass-through" {
  driver = "pass-through"
}

strategy "target-value" {
  driver = "target-value"
}

strategy "threshold" {
  driver = "threshold"
}

strategy "app-sizing-avg" {
  driver = "app-sizing-avg"
}

strategy "app-sizing-percentile" {
  driver = "app-sizing-percentile"
}
EOH
      }

      resources {
        cpu    = 512
        memory = 512
      }

      service {
        name = "autoscaler"
        port = "http"

        check {
          type     = "http"
          path     = "/v1/health"
          interval = "5s"
          timeout  = "2s"
        }
      }
    }
  }
}

- Deploy load test apps for validation testing
- There are two jobs used for testing.
- The first job, example.nomad, contains two task groups:
1. cache-lb: This task group is scaled by Dynamic Application Sizing.
2. cache: This task group is scaled by horizontal autoscaling.
- The second job, loadtest.nomad, is a parameterized job that can be dispatched at any time to generate load against the example.nomad tasks.
--- example.nomad ---
job "example_prom_dd" {
  datacenters = ["dc1"]

  group "cache-lb" {
    count = 1

    network {
      port "lb" {}
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx"
        ports = ["lb"]
        volumes = [
          # It's safe to mount this path as a file because it won't re-render.
          "local/nginx.conf:/etc/nginx/nginx.conf",
          # This path hosts files that will re-render with Consul Template.
          "local/nginx:/etc/nginx/conf.d",
        ]
      }

      # This template overwrites the embedded nginx.conf file so it loads
      # conf.d/*.conf files outside of the `http` block.
      template {
        data = <<EOF
user nginx;
worker_processes 1;

error_log /var/log/nginx/error.log warn;
pid       /var/run/nginx.pid;

events {
  worker_connections 1024;
}

include /etc/nginx/conf.d/*.conf;
EOF
        destination = "local/nginx.conf"
      }

      # This template creates a TCP proxy to Redis.
      template {
        data = <<EOF
stream {
  server {
    listen {{ env "NOMAD_PORT_lb" }};
    proxy_pass backend;
  }

  upstream backend {
  {{ range service "redis" }}
    server {{ .Address }}:{{ .Port }};
  {{ else }}server 127.0.0.1:65535; # force a 502
  {{ end }}
  }
}
EOF
        destination   = "local/nginx/nginx.conf"
        change_mode   = "signal"
        change_signal = "SIGHUP"
      }

      resources {
        cpu    = 100
        memory = 20
      }

      scaling "cpu" {
        policy {
          cooldown            = "1m"
          evaluation_interval = "1m"

          check "95pct" {
            source = "prometheus"
            strategy "app-sizing-percentile" {
              percentile = "95"
            }
          }
        }
      }

      scaling "mem" {
        policy {
          cooldown            = "1m"
          evaluation_interval = "1m"

          check "max" {
            source = "prometheus"
            strategy "app-sizing-max" {}
          }
        }
      }
    }

    service {
      name         = "redis-lb"
      port         = "lb"
      address_mode = "host"

      check {
        type     = "tcp"
        port     = "lb"
        interval = "10s"
        timeout  = "2s"
      }
    }
  }

  group "cache" {
    count = 3

    scaling {
      min     = 3
      max     = 6
      enabled = true

      policy {
        evaluation_interval = "30s"
        cooldown            = "1m"

        check "CPU_400" {
          source = "datadog"
          query  = "nomad.client.allocs.cpu.total_ticks{task:redis}"

          strategy "target-value" {
            target = 400
          }
        }
      }
    }

    network {
      port "db" {
        to = 6379
      }
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:6.0"
        ports = ["db"]
      }

      resources {
        cpu    = 500
        memory = 256
      }

      service {
        name         = "redis"
        port         = "db"
        address_mode = "host"

        check {
          type     = "tcp"
          port     = "db"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
--- loadtest.nomad ---
job "load-test" {
  datacenters = ["dc1"]
  type        = "batch"

  parameterized {
    payload       = "optional"
    meta_optional = ["requests", "clients"]
  }

  group "redis-benchmark" {
    task "redis-benchmark" {
      driver = "docker"

      config {
        image   = "redis:6.0"
        command = "redis-benchmark"
        args = [
          "-h",
          "${HOST}",
          "-p",
          "${PORT}",
          "-n",
          "${REQUESTS}",
          "-c",
          "${CLIENTS}",
        ]
      }

      template {
        destination = "secrets/env.txt"
        env         = true
        data        = <<EOF
{{ with service "redis-lb" }}{{ with index . 0 -}}
HOST={{.Address}}
PORT={{.Port}}
{{- end }}{{ end }}
REQUESTS={{ or (env "NOMAD_META_requests") "100000" }}
CLIENTS={{ or (env "NOMAD_META_clients") "50" }}
EOF
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}
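With both jobs submitted, load can be generated at any time by dispatching the parameterized job; the requests and clients meta values override the template defaults:

```shell
# Run the example workload and register the parameterized load-test job.
nomad job run example.nomad
nomad job run loadtest.nomad

# Dispatch a heavier-than-default load test against the redis-lb service.
nomad job dispatch -meta requests=200000 -meta clients=100 load-test

# Watch the cache group scale between 3 and 6 allocations.
nomad job status example_prom_dd
```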
Nomad Clusters with ACLs Enabled
If you have enabled ACLs in your Nomad cluster, you'll need to add a token to the autoscaler job before deploying it.
There are two token types that can be used with the autoscaler: a management token, which grants all permissions, or a client token, which is more constrained.
If you plan to use a client token with the autoscaler, ensure it contains the minimum capabilities the autoscaler needs. The policy file below grants those minimum capabilities.
namespace "*" {
  policy       = "read"
  capabilities = ["read-job", "scale-job", "list-scaling-policies", "read-scaling-policy", "read-job-scaling", "submit-job"]
}

operator {
  policy = "read"
}

node {
  policy = "read"
}

plugin {
  policy = "read"
}
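Assuming the policy above is saved as autoscaler-policy.hcl (the filename is illustrative), it can be registered and a matching client token minted with the Nomad CLI:

```shell
# Register the ACL policy, then create a client token that carries it.
nomad acl policy apply -description "Nomad Autoscaler" autoscaler autoscaler-policy.hcl
nomad acl token create -name="autoscaler" -type=client -policy=autoscaler
```

The Secret ID from the token create output is what goes into the token parameter of the nomad block in autoscaler.hcl.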