This article is a quick start guide with tips on using the command-line JSON processor jq to perform brief analysis and selections on Vault Audit log files.
It's recommended to use a recent version of jq (1.6 or higher), together with the documentation listed here, so that you can extend these examples in your own aliases or scripts.
Some useful jq links & documents include:
- Vault tutorials: Query audit device logs
- jq official docs: jq Manual (development version)
- Article: Reshaping JSON with jq
Download & installation instructions for jq on all operating systems (Linux, macOS, Windows, etc.) can be found on its official website: Download jq.
Examples in this guide refer to the variable VAFILE, which denotes the path of a Vault Audit log file; additional files may also be appended, like:
VAFILE=… ;
jq … $VAFILE # // single file
VAFILE1=… ; VAFILE2=… ;
jq … $VAFILE1 $VAFILE2 # // …… multiple
The CLI arguments passed to jq vary; those commonly used in this guide are noted in the square brackets below:
jq [-R | --arg | -s | -c | -r | -n] '…QUERY…' …file(s)
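As a rough illustration of how a few of these flags change the behavior of jq, the sketch below runs against a tiny made-up sample. The two entries and the temporary file are hypothetical stand-ins, not real Vault output.

```shell
# Hypothetical two-entry sample standing in for a real Vault Audit log.
VAFILE=$(mktemp)
cat > "$VAFILE" <<'EOF'
{"type":"request","request":{"id":"a1","path":"sys/health"}}
{"type":"response","request":{"id":"a1","path":"sys/health"}}
EOF

# -c: compact output, one JSON value per line.
COMPACT=$(jq -c '.request' "$VAFILE")
# -r: raw strings, without the surrounding JSON quotes.
RAW=$(jq -r '.type' "$VAFILE")
# -s: slurp every entry into a single array before running the query.
COUNT=$(jq -s 'length' "$VAFILE")
# --arg: pass a shell value into the query as the string variable $T.
ID=$(jq -r --arg T request 'select(.type==$T)|.request.id' "$VAFILE")

printf '%s\n' "$COMPACT" "$RAW" "$COUNT" "$ID"
rm -f "$VAFILE"
```

The remaining flags, -R (read each line as a raw string) and -n (do not read input automatically; use inputs instead), appear in the extraction and date range examples of this guide.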
Dates, Extracting Audits & Ranges
If the Audit logs are combined with Vault operational logs or other system-level events, they can be separated into a different file with jq using:
VLOG=/path/to/combined_file.log
VAFILE=vault_audit.json
jq -Rc 'select(contains("]: {"))|sub(".*?]: {";"{")|fromjson' $VLOG > $VAFILE
Once separated, the new VAFILE can be parsed with jq just as if the Audits had originally come from a file audit device.
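To see the separation on a small scale, here is a minimal sketch using a made-up combined log. The syslog-style prefix, host name and entries are assumptions for illustration only, and the regular expression escapes the bracket and brace explicitly rather than relying on them being read literally.

```shell
# Hypothetical combined log: one operational line and one prefixed audit line.
VLOG=$(mktemp)
VAFILE=$(mktemp)
cat > "$VLOG" <<'EOF'
Jul 18 14:34:22 host1 vault[812]: [INFO]  core: vault is unsealed
Jul 18 14:34:23 host1 vault[812]: {"time":"2023-07-18T14:34:23.101Z","type":"request"}
EOF

# -R reads each line as a raw string; only lines carrying a JSON object are kept,
# the prefix up to the first "]: {" is stripped, and the rest is parsed as JSON.
jq -Rc 'select(contains("]: {"))|sub("^.*?\\]: \\{";"{")|fromjson' "$VLOG" > "$VAFILE"

ENTRIES=$(jq -s 'length' "$VAFILE")
FIRST_TYPE=$(jq -r '.type' "$VAFILE")
echo "extracted $ENTRIES audit entry of type $FIRST_TYPE"
rm -f "$VLOG" "$VAFILE"
```

The operational line is dropped because it never contains the "]: {" marker, so only the audit entry reaches the output file.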
To narrow the time span of audits to a particular range, define the variables Date-Begin & Date-End (DB & DE) in ISO 8601 format with the mid (T) time separator and the ending Z character, for example:
# // RESPONSE & REQUESTS IN A PARTICULAR DATE RANGE
DB=2023-07-18T14:34:23Z
DE=2023-07-18T14:34:24Z
VAFILE_FILTERED=vault_audit_filtered.json # // example output file name
jq --arg DB $DB --arg DE $DE -nc '[inputs|select(.time|sub("\\..*Z";"Z")|([.,$DB,$DE|fromdate]|sort)[1]==fromdate)]' $VAFILE > $VAFILE_FILTERED
The event times recorded in the Audit logs include fractional seconds, which are not compatible with the date functions provided by jq such as fromdate; therefore a truncation is performed in the sub portion of the query above and wherever dates are handled throughout this guide.
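The truncation step can be tried in isolation. The sketch below assumes a made-up timestamp in the style of an audit entry's time field and shows the value before and after it is handed to fromdate.

```shell
# Hypothetical timestamp with fractional seconds, as found in an audit entry's .time field.
TS='2023-07-18T14:34:23.123456789Z'

# Drop everything from the first "." up to the trailing "Z", keeping whole seconds only.
TRIMMED=$(jq -rn --arg TS "$TS" '$TS|sub("\\..*Z";"Z")')

# The trimmed value is now in the form fromdate expects and converts to epoch seconds.
EPOCH=$(jq -rn --arg TS "$TS" '$TS|sub("\\..*Z";"Z")|fromdate')

echo "$TRIMMED -> $EPOCH"
```

Because the fraction is discarded rather than rounded, every comparison and duration in this guide has a resolution of one second.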
Rates, Frequency & Totals
The total number of request entries can be used as a measure of activity for the entire timespan of the Audit logs.
# // TOTAL NUMBER OF REQUESTS USING REQUEST ENTRIES AS COUNTER
jq -s 'map(select(.type=="request"))|length' $VAFILE
A brief summary of the Audit log span can be obtained by performing:
# // START & END DATES AS WELL AS BASIC RATES
jq -s '(del(.[]|select(.type!="request"))|length) as $L|sort_by(.time)|(.[-1].time|sub("\\..*Z";"Z")|fromdate) as $d1|(.[0].time|sub("\\..*Z";"Z")|fromdate) as $d2|($d1-$d2) as $d3|{time_start: .[0].time, time_ended: .[-1].time, span_seconds: $d3, span_minutes: ($d3/60), span_hours: ($d3/3600), span_days: ($d3/86400), requests: $L, qps_average: ($L/$d3)}' $VAFILE
You can use the examples below to calculate the rate of Queries Per Second (QPS) by truncating the logged time of each request entry to whole seconds and counting the entries that fall within the same second.
# // QPS IN CHRONOLOGICAL ORDER OF AUDITS IN SECONDS (GROUPED)
jq -sr 'map(select(.type=="request").time|sub("\\..*Z";"Z"))|group_by(.)[]|"\(first): \(length)"' $VAFILE
# // QPS LOW TO HIGH SORTED BY BUSIEST SECONDS (GROUPED)
jq -sr 'map(select(.type=="request").time|sub("\\..*Z";"Z"))|group_by(.)|sort_by(length)[]|"\(first): \(length)"' $VAFILE
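When only the single busiest second is of interest, the same grouping can be reduced with max_by. This is a sketch on a made-up four-entry sample rather than real audit data.

```shell
# Hypothetical sample: three requests (two in the same second) and one response.
VAFILE=$(mktemp)
cat > "$VAFILE" <<'EOF'
{"time":"2023-07-18T14:34:23.10Z","type":"request"}
{"time":"2023-07-18T14:34:23.90Z","type":"request"}
{"time":"2023-07-18T14:34:24.20Z","type":"request"}
{"time":"2023-07-18T14:34:24.30Z","type":"response"}
EOF

# Group request times by whole second, then keep only the largest group.
PEAK=$(jq -sr 'map(select(.type=="request").time|sub("\\..*Z";"Z"))|group_by(.)|max_by(length)|"\(first): \(length)"' "$VAFILE")

echo "$PEAK"
rm -f "$VAFILE"
```

The response entry is ignored, so the peak reflects incoming requests only.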
Other useful totals & frequency examples below make use of the optional CLI sed & column tools to format tabular summaries of interest such as:
# // SUMMARY OF OPERATION TYPES
jq -sr 'map(select(.type=="request")|{type: .request.operation})|group_by(.type)|sort_by(length)[]|"\(first): \(length)"' $VAFILE|sed 's/\}:/\}+/'|column -ts'+'
# // TOTAL NUMBER OF TOKEN TYPES USED
jq -sr 'map(select(.type=="request")|{token_type: .auth.token_type, type: .request.operation})|group_by(.token_type)| sort_by(length)[]| "\(first): \(length)"' $VAFILE | sed 's/,/,+/g'|sed 's/\}:/\}+/'|column -ts'+'
# // BUSIEST PATHS & THEIR TYPES
jq -sr 'map(select(.type=="request")|{mtype: .request.mount_type, path: .request.path})|group_by(.path)|sort_by(length)[]|"\(first): \(length)"' $VAFILE|sed 's/,/,+/g'|sed 's/\}:/\}+/'|column -ts'+'
# // BUSIEST OPERATION TYPES BY PATHS
jq -sr 'map(select(.type=="request")|{mtype: .request.mount_type, path: .request.path, type: .request.operation})|group_by(.path)|sort_by(length)[]|"\(first): \(length)"' $VAFILE|sed 's/,/,+/g'|sed 's/\}:/\}+/'|column -ts'+'
# // BUSIEST CLIENTS / REQUEST PATHS BY CLIENT_DISPLAY_NAME CDN
jq -sr 'map(select(.type=="request")|{path: .request.path, cdn: .auth.display_name})|group_by(.cdn)|sort_by(length)[]|"\(first): \(length)"' $VAFILE|sed 's/,/,+/g'|sed 's/\}:/\}+/'|column -ts'+'
# // SPECIFIC PATHS '/ui/' & 'XYZ' & WHO'S CALLING THEM?
jq -sr 'map(select(.type=="request")|select(.request.path|contains("/ui/","XYZ"))|{cdn: .auth.display_name, path: .request.path})|group_by(.path)|sort_by(length)[]|"\(first): \(length)"' $VAFILE|sed 's/,/,+/g'|sed 's/\}:/\}+/'|column -ts'+'
Time calculations, Error Responses & Other Selectors
It's possible to calculate the estimated duration of requests that received a response using:
# // DURATION OF REQUESTS (RESPONSE + REQUEST PAIRS WITH MATCHING ID), LOW TO HIGH
jq -sc '[group_by(.request.id)|map(select(length==2))|.[]|((.[1].time|sub("\\..*Z";"Z")|fromdate)-(.[0].time|sub("\\..*Z";"Z")|fromdate)) as $delta|{duration: $delta, rid: .[0].request.id, time: .[0].time}]|sort_by(.duration)[]' $VAFILE
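The pairing logic can be checked on a small made-up sample. In this sketch the request IDs and times are hypothetical, and an extra sort_by(.time) inside each pair is an added assumption so that the result does not depend on the order of entries in the file.

```shell
# Hypothetical sample: one answered request (r1) and one request without a response (r2).
VAFILE=$(mktemp)
cat > "$VAFILE" <<'EOF'
{"time":"2023-07-18T14:34:23.100Z","type":"request","request":{"id":"r1"}}
{"time":"2023-07-18T14:34:25.400Z","type":"response","request":{"id":"r1"}}
{"time":"2023-07-18T14:34:26.000Z","type":"request","request":{"id":"r2"}}
EOF

# Keep only groups with exactly two entries, order each pair by time,
# then subtract the earlier epoch second from the later one.
DURATIONS=$(jq -sc '[group_by(.request.id)[]|select(length==2)|sort_by(.time)|{rid: .[0].request.id, duration: ((.[1].time|sub("\\..*Z";"Z")|fromdate)-(.[0].time|sub("\\..*Z";"Z")|fromdate))}]' "$VAFILE")

echo "$DURATIONS"
rm -f "$VAFILE"
```

The pair is 2.3 seconds apart but is reported as 2, because the fractional part of each time is discarded before the subtraction.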
Getting average response times as well as minimum and maximum duration for all request & response pairs is also possible with the following query:
# // FASTEST, SLOWEST & AVERAGE RESPONSE TIMES
jq -s '[group_by(.request.id)|map(select(length==2))| .[] | ((.[1].time|sub("\\..*Z"; "Z")|fromdate) - (.[0].time|sub("\\..*Z"; "Z")|fromdate)) as $delta|{duration: $delta}]|sort_by(.duration)|length as $L|.[0].duration as $d1|.[-1].duration as $d2|([.[].duration]|add) as $d3|{rtime_fastest: $d1, rtime_slowest: $d2, rtime_average: ($d3/$L)}' $VAFILE
To list only responses containing an error and the types of authorised users that invoked them, do:
# // ERROR RESPONSES & REQUEST PATHS
jq -s 'map(select(.type=="response")|select(.error!=null)|{error: .error, path: .request.path, cdn: .auth.display_name})' $VAFILE
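A possible follow-up is to count how often each error message occurs. The sketch below does this on a made-up sample; the paths, display names and error text are hypothetical.

```shell
# Hypothetical sample: two failed responses and one successful response.
VAFILE=$(mktemp)
cat > "$VAFILE" <<'EOF'
{"type":"response","error":"permission denied","request":{"path":"secret/data/a"},"auth":{"display_name":"token-x"}}
{"type":"response","error":"permission denied","request":{"path":"secret/data/b"},"auth":{"display_name":"token-y"}}
{"type":"response","request":{"path":"sys/health"},"auth":{"display_name":"token-x"}}
EOF

# Keep responses that carry an error, group them by message and print a count per message.
ERRORS=$(jq -sr 'map(select(.type=="response" and .error!=null))|group_by(.error)|sort_by(length)[]|"\(.[0].error): \(length)"' "$VAFILE")

echo "$ERRORS"
rm -f "$VAFILE"
```

Entries without an error field evaluate to null and are skipped, so successful responses do not affect the counts.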
To scope results to request & response pairs for a specific path of interest, execute a query similar to:
# // REQUEST & RESPONSE PAIRS FOR PARTICULAR PATH 'pki/roles/example9':
jq -sr '[group_by(.request.id)|map(select(length==2))|.[]|select(.[1].request.path|contains("pki/roles/example9"))]|.[]' $VAFILE
Resources
- Vault tutorials: Query audit device logs
- jq official: jq Manual (development version)
- Article: Reshaping JSON with jq
- Vault Support KB: Audit Device Notes