# PromQL Cheatsheet
Usage and examples of basics, aggregations & functions in PromQL (Prometheus query language)
Work in progress!
Consider the metric node_filesystem_size_bytes from the Node exporter, which reports the size of each mounted filesystem and has device, fstype, and mountpoint labels.
→ Sums all time series into a single value, giving the total filesystem size across all monitored machines:
sum(node_filesystem_size_bytes)
→ Sums the series, keeping only the labels listed in by; gives the total filesystem size for each device on each machine:
sum by(instance, device)(node_filesystem_size_bytes)
→ Sums the series, dropping the labels listed in without and keeping all others; gives the total filesystem size of each machine (the instance label is kept because it is not listed in without):
sum without(device, fstype, mountpoint)(node_filesystem_size_bytes)
→ Gets size of the biggest mounted filesystem on each machine:
# Using by
max by(instance)(node_filesystem_size_bytes)
# Using without
max without(device, fstype, mountpoint)(node_filesystem_size_bytes)
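→ The two queries above return the same values, but not necessarily the same labels: by(instance) drops every other label, while without(...) keeps everything it does not list. A minimal sketch, assuming the only other label on these series is job (an assumption; your targets may carry more labels):

```promql
# Keeps instance and job, matching the output labels of the without() form
# (assumes instance and job are the only labels besides device, fstype, mountpoint)
max by(instance, job)(node_filesystem_size_bytes)
```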
→ Gets the change in memory usage of the Node exporter over the past hour:
process_resident_memory_bytes{job="node"}
-
process_resident_memory_bytes{job="node"} offset 1h
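→ As a sketch of an alternative, the delta() function computes a similar change for a gauge over a range. It extrapolates to cover the full window, so its result may differ slightly from the subtraction above:

```promql
# Change between the start and end of the 1h window, per series
delta(process_resident_memory_bytes{job="node"}[1h])
```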
→ Gets the amount of network traffic received per second, averaged over the last 5 minutes:
rate(node_network_receive_bytes_total[5m])
→ Gets total bytes received per machine per second:
sum by(instance)(rate(node_network_receive_bytes_total[5m]))
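→ A possible variation on the query above: skip the loopback interface and convert bytes to bits. The interface name lo is an assumption (the usual Linux loopback name); adjust it to your machines:

```promql
# Received bits per second per machine, excluding loopback traffic
sum by(instance)(rate(node_network_receive_bytes_total{device!="lo"}[5m])) * 8
```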
Consider Prometheus 2.2.1 exposing a histogram metric called prometheus_tsdb_compaction_duration_seconds, which tracks how many seconds compaction takes for the time series database. Under the hood, a histogram is exposed as three sets of counter series, with the suffixes _bucket, _sum, and _count.
→ Gets the number of compactions per second for each instance:
sum by(instance)(rate(prometheus_tsdb_compaction_duration_seconds_count[5m]))
→ Gets the average compaction duration in seconds for each instance over the last 5m:
sum by(instance)(rate(prometheus_tsdb_compaction_duration_seconds_sum[5m]))
/
sum by(instance)(rate(prometheus_tsdb_compaction_duration_seconds_count[5m]))
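→ A sketch of the same average across all instances combined. Summing the numerator and denominator separately before dividing avoids averaging the per-instance averages. If no compaction finished inside the window, the division is 0/0 and the result is NaN:

```promql
# Overall average compaction duration across every instance
sum(rate(prometheus_tsdb_compaction_duration_seconds_sum[5m]))
/
sum(rate(prometheus_tsdb_compaction_duration_seconds_count[5m]))
```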
→ Gets the 90th percentile of compaction duration in seconds over the past 1d:
histogram_quantile(
0.90,
rate(prometheus_tsdb_compaction_duration_seconds_bucket[1d]))
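→ With several Prometheus instances, the query above returns one quantile per instance. A sketch for a single quantile across all of them: aggregate the bucket rates first, making sure the le label survives the aggregation, since histogram_quantile needs it:

```promql
# 90th percentile across all instances; without(instance) keeps the le label
histogram_quantile(
  0.90,
  sum without(instance)(rate(prometheus_tsdb_compaction_duration_seconds_bucket[1d])))
```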
→ Both without & by can be applied directly to gauge metrics, as in many of the examples above; counters are wrapped in rate() first.
→ sum Adds up all the values in a group and returns the total as the value for the group.
→ count Counts the number of time series in a group, and returns it as the value for the group.
→ avg Returns the average of the values of the time series in the group as the value for the group.
→ stddev, stdvar Return the population standard deviation and population variance of the values in a group.
→ min, max The min and max aggregators return the minimum or maximum value within a group as the value of the group.
→ topk Returns the K time series with the biggest values
topk without(device, fstype, mountpoint)(K, node_filesystem_size_bytes)
→ bottomk Returns the K time series with the lowest values
→ quantile Returns the specified quantile of the values of the group as the group’s return value. Gets the 90th percentile of the system mode CPU usage across the different CPUs on each machine:
quantile without(cpu)(0.9, rate(node_cpu_seconds_total{mode="system"}[5m]))
→ count_values Builds a frequency histogram of the values of the time series in the group, with the count of each value as the value of the output time series and the original value as a new label. E.g. given the following time series:
software_version{instance="a",job="j"} 7
software_version{instance="b",job="j"} 4
software_version{instance="c",job="j"} 8
software_version{instance="d",job="j"} 4
software_version{instance="e",job="j"} 7
software_version{instance="f",job="j"} 4
The following query:
count_values without(instance)("version", software_version)
Returns:
{job="j",version="7"} 2
{job="j",version="8"} 1
{job="j",version="4"} 3
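→ A few more sketches for the aggregators listed above that have no example of their own. They reuse the metrics from earlier in this sheet; the K value of 2 is arbitrary. Each query is meant to be run on its own:

```promql
# count: number of mounted filesystems reported by each machine
count without(device, fstype, mountpoint)(node_filesystem_size_bytes)

# avg: average filesystem size per machine
# (equivalent to sum / count over the same grouping)
avg without(device, fstype, mountpoint)(node_filesystem_size_bytes)

# stddev: spread of system mode CPU usage across the CPUs of each machine
stddev without(cpu)(rate(node_cpu_seconds_total{mode="system"}[5m]))

# bottomk: the 2 smallest mounted filesystems on each machine
bottomk without(device, fstype, mountpoint)(2, node_filesystem_size_bytes)
```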