Exporter for machine metrics
prometheus/node_exporter
Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.
The Windows exporter is recommended for Windows users. To expose NVIDIA GPU metrics, prometheus-dcgm can be used.
If you are new to Prometheus and `node_exporter`, there is a simple step-by-step guide.
The `node_exporter` listens on HTTP port 9100 by default. See the `--help` output for more options.
For automated installs with Ansible, there is the Prometheus Community role.
The `node_exporter` is designed to monitor the host system. Deploying in containers requires extra care in order to avoid monitoring the container itself.
For situations where containerized deployment is needed, some extra flags must be used to allow the `node_exporter` access to the host namespaces.
Be aware that any non-root mount points you want to monitor will need to be bind-mounted into the container.
If you start a container for host monitoring, specify the `path.rootfs` argument. This argument must match the path in the bind-mount of the host root. The `node_exporter` will use `path.rootfs` as a prefix to access the host filesystem.

    docker run -d \
      --net="host" \
      --pid="host" \
      -v "/:/host:ro,rslave" \
      quay.io/prometheus/node-exporter:latest \
      --path.rootfs=/host

For Docker compose, similar flag changes are needed.

    ---
    version: '3.8'

    services:
      node_exporter:
        image: quay.io/prometheus/node-exporter:latest
        container_name: node_exporter
        command:
          - '--path.rootfs=/host'
        network_mode: host
        pid: host
        restart: unless-stopped
        volumes:
          - '/:/host:ro,rslave'

On some systems, the `timex` collector requires an additional Docker flag, `--cap-add=SYS_TIME`, in order to access the required syscalls.
There is varying support for collectors on each operating system. The tables below list all existing collectors and the supported systems.
Collectors are enabled by providing a `--collector.<name>` flag. Collectors that are enabled by default can be disabled by providing a `--no-collector.<name>` flag. To enable only some specific collector(s), use `--collector.disable-defaults --collector.<name> ...`.
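
The three flag forms above can be assembled programmatically, for example when generating a service unit. The following is a minimal sketch in Python; the helper name `collector_args` and its parameters are hypothetical and not part of `node_exporter`.

```python
def collector_args(enable=(), disable=(), only=False):
    """Build node_exporter collector flags (hypothetical helper).

    enable:  collectors to switch on with --collector.<name>
    disable: default collectors to switch off with --no-collector.<name>
    only:    if True, start with --collector.disable-defaults so that
             only the collectors listed in `enable` are active
    """
    args = ["--collector.disable-defaults"] if only else []
    args += [f"--collector.{name}" for name in enable]
    args += [f"--no-collector.{name}" for name in disable]
    return args


if __name__ == "__main__":
    # Run only the cpu and meminfo collectors.
    print(" ".join(collector_args(enable=["cpu", "meminfo"], only=True)))
```

The returned list can be appended to the `node_exporter` command line as-is.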
A few collectors can be configured to include or exclude certain patterns using dedicated flags. The exclude flags are used to indicate "all except", while the include flags are used to say "none except". Note that these flags are mutually exclusive on collectors that support both.
Example:
--collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
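
Before deploying a pattern like this, it can help to check which names it would filter out. The sketch below mimics the "all except" / "none except" semantics in Python. It is an illustration only: `node_exporter` itself uses Go regular expressions, whose syntax differs from Python's `re` for some advanced constructs (the pattern here uses only syntax common to both), and the function name `is_ignored` is hypothetical.

```python
import re

# Pattern copied from the example flag above.
EXCLUDE = r"^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)"


def is_ignored(name, include=None, exclude=None):
    """Return True if `name` would be filtered out.

    exclude means "all except": names that match are dropped.
    include means "none except": names that do not match are dropped.
    """
    if include and exclude:
        raise ValueError("include and exclude are mutually exclusive")
    if exclude:
        return re.search(exclude, name) is not None
    if include:
        return re.search(include, name) is None
    return False


if __name__ == "__main__":
    for mount in ["/", "/home", "/proc", "/sys/fs/cgroup", "/system"]:
        print(mount, is_ignored(mount, exclude=EXCLUDE))
```

Note that `/system` is kept: the trailing `($|/)` group makes `sys` match only as a complete path component.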
List:
| Collector | Scope | Include Flag | Exclude Flag |
|---|---|---|---|
| arp | device | --collector.arp.device-include | --collector.arp.device-exclude |
| cpu | bugs | --collector.cpu.info.bugs-include | N/A |
| cpu | flags | --collector.cpu.info.flags-include | N/A |
| diskstats | device | --collector.diskstats.device-include | --collector.diskstats.device-exclude |
| ethtool | device | --collector.ethtool.device-include | --collector.ethtool.device-exclude |
| ethtool | metrics | --collector.ethtool.metrics-include | N/A |
| filesystem | fs-types | --collector.filesystem.fs-types-include | --collector.filesystem.fs-types-exclude |
| filesystem | mount-points | --collector.filesystem.mount-points-include | --collector.filesystem.mount-points-exclude |
| hwmon | chip | --collector.hwmon.chip-include | --collector.hwmon.chip-exclude |
| hwmon | sensor | --collector.hwmon.sensor-include | --collector.hwmon.sensor-exclude |
| interrupts | name | --collector.interrupts.name-include | --collector.interrupts.name-exclude |
| netdev | device | --collector.netdev.device-include | --collector.netdev.device-exclude |
| qdisc | device | --collector.qdisc.device-include | --collector.qdisc.device-exclude |
| slabinfo | slab-names | --collector.slabinfo.slabs-include | --collector.slabinfo.slabs-exclude |
| sysctl | all | --collector.sysctl.include | N/A |
| systemd | unit | --collector.systemd.unit-include | --collector.systemd.unit-exclude |

Enabled by default:

| Name | Description | OS |
|---|---|---|
| arp | Exposes ARP statistics from `/proc/net/arp`. | Linux |
| bcache | Exposes bcache statistics from `/sys/fs/bcache/`. | Linux |
| bonding | Exposes the number of configured and active slaves of Linux bonding interfaces. | Linux |
| btrfs | Exposes btrfs statistics | Linux |
| boottime | Exposes system boot time derived from the `kern.boottime` sysctl. | Darwin, Dragonfly, FreeBSD, NetBSD, OpenBSD, Solaris |
| conntrack | Shows conntrack statistics (does nothing if no `/proc/sys/net/netfilter/` present). | Linux |
| cpu | Exposes CPU statistics | Darwin, Dragonfly, FreeBSD, Linux, Solaris, OpenBSD |
| cpufreq | Exposes CPU frequency statistics | Linux, Solaris |
| diskstats | Exposes disk I/O statistics. | Darwin, Linux, OpenBSD |
| dmi | Expose Desktop Management Interface (DMI) info from `/sys/class/dmi/id/` | Linux |
| edac | Exposes error detection and correction statistics. | Linux |
| entropy | Exposes available entropy. | Linux |
| exec | Exposes execution statistics. | Dragonfly, FreeBSD |
| fibrechannel | Exposes fibre channel information and statistics from `/sys/class/fc_host/`. | Linux |
| filefd | Exposes file descriptor statistics from `/proc/sys/fs/file-nr`. | Linux |
| filesystem | Exposes filesystem statistics, such as disk space used. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
| hwmon | Expose hardware monitoring and sensor data from `/sys/class/hwmon/`. | Linux |
| infiniband | Exposes network statistics specific to InfiniBand and Intel OmniPath configurations. | Linux |
| ipvs | Exposes IPVS status from `/proc/net/ip_vs` and stats from `/proc/net/ip_vs_stats`. | Linux |
| loadavg | Exposes load average. | Darwin, Dragonfly, FreeBSD, Linux, NetBSD, OpenBSD, Solaris |
| mdadm | Exposes statistics about devices in `/proc/mdstat` (does nothing if no `/proc/mdstat` present). | Linux |
| meminfo | Exposes memory statistics. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
| netclass | Exposes network interface info from `/sys/class/net/` | Linux |
| netdev | Exposes network interface statistics such as bytes transferred. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
| netisr | Exposes netisr statistics | FreeBSD |
| netstat | Exposes network statistics from `/proc/net/netstat`. This is the same information as `netstat -s`. | Linux |
| nfs | Exposes NFS client statistics from `/proc/net/rpc/nfs`. This is the same information as `nfsstat -c`. | Linux |
| nfsd | Exposes NFS kernel server statistics from `/proc/net/rpc/nfsd`. This is the same information as `nfsstat -s`. | Linux |
| nvme | Exposes NVMe info from `/sys/class/nvme/` | Linux |
| os | Expose OS release info from `/etc/os-release` or `/usr/lib/os-release` | any |
| powersupplyclass | Exposes Power Supply statistics from `/sys/class/power_supply` | Linux |
| pressure | Exposes pressure stall statistics from `/proc/pressure/`. | Linux (kernel 4.20+ and/or `CONFIG_PSI`) |
| rapl | Exposes various statistics from `/sys/class/powercap`. | Linux |
| schedstat | Exposes task scheduler statistics from `/proc/schedstat`. | Linux |
| selinux | Exposes SELinux statistics. | Linux |
| sockstat | Exposes various statistics from `/proc/net/sockstat`. | Linux |
| softnet | Exposes statistics from `/proc/net/softnet_stat`. | Linux |
| stat | Exposes various statistics from `/proc/stat`. This includes boot time, forks and interrupts. | Linux |
| tapestats | Exposes statistics from `/sys/class/scsi_tape`. | Linux |
| textfile | Exposes statistics read from local disk. The `--collector.textfile.directory` flag must be set. | any |
| thermal | Exposes thermal statistics like `pmset -g therm`. | Darwin |
| thermal_zone | Exposes thermal zone & cooling device statistics from `/sys/class/thermal`. | Linux |
| time | Exposes the current system time. | any |
| timex | Exposes selected adjtimex(2) system call stats. | Linux |
| udp_queues | Exposes UDP total lengths of the rx_queue and tx_queue from `/proc/net/udp` and `/proc/net/udp6`. | Linux |
| uname | Exposes system information as provided by the uname system call. | Darwin, FreeBSD, Linux, OpenBSD |
| vmstat | Exposes statistics from `/proc/vmstat`. | Linux |
| watchdog | Exposes statistics from `/sys/class/watchdog` | Linux |
| xfs | Exposes XFS runtime statistics. | Linux (kernel 4.4+) |
| zfs | Exposes ZFS performance statistics. | FreeBSD, Linux, Solaris |

`node_exporter` also implements a number of collectors that are disabled by default. Reasons for this vary by collector, and may include:
- High cardinality
- Prolonged runtime that exceeds the Prometheus `scrape_interval` or `scrape_timeout`
- Significant resource demands on the host

You can enable additional collectors as desired by adding them to your init system's or service supervisor's startup configuration for `node_exporter`, but caution is advised. Enable at most one at a time, testing first on a non-production system, then by hand on a single production node. When enabling additional collectors, you should carefully monitor the change by observing the `scrape_duration_seconds` metric to ensure that collection completes and does not time out. In addition, monitor the `scrape_samples_post_metric_relabeling` metric to see the changes in cardinality.
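
When testing a newly enabled collector by hand, it can also be useful to see which collector accounts for the scrape time. The sketch below is one possible approach, not something this document prescribes: it reads the exporter's own `/metrics` output and picks out the per-collector `node_scrape_collector_duration_seconds` samples. That metric is not mentioned elsewhere in this document, and the function name and threshold parameter are hypothetical.

```python
import re

# Matches one sample line of the Prometheus text format, such as
#   node_scrape_collector_duration_seconds{collector="cpu"} <seconds>
SAMPLE = re.compile(
    r'^node_scrape_collector_duration_seconds\{collector="([^"]+)"\}\s+(\S+)'
)


def slow_collectors(metrics_text, threshold_seconds):
    """Return {collector: seconds} for collectors slower than the threshold."""
    slow = {}
    for line in metrics_text.splitlines():
        match = SAMPLE.match(line)
        if match:
            seconds = float(match.group(2))
            if seconds > threshold_seconds:
                slow[match.group(1)] = seconds
    return slow
```

The input text could come from a manual request to the exporter's `/metrics` endpoint on the test node.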
| Name | Description | OS |
|---|---|---|
| buddyinfo | Exposes statistics of memory fragments as reported by /proc/buddyinfo. | Linux |
| cgroups | A summary of the number of active and enabled cgroups | Linux |
| cpu_vulnerabilities | Exposes CPU vulnerability information from sysfs. | Linux |
| devstat | Exposes device statistics | Dragonfly, FreeBSD |
| drm | Expose GPU metrics using sysfs / DRM, `amdgpu` is the only driver which exposes this information through DRM | Linux |
| drbd | Exposes Distributed Replicated Block Device statistics (to version 8.4) | Linux |
| ethtool | Exposes network interface information and network driver statistics equivalent to `ethtool`, `ethtool -S`, and `ethtool -i`. | Linux |
| interrupts | Exposes detailed interrupts statistics. | Linux, OpenBSD |
| ksmd | Exposes kernel and system statistics from `/sys/kernel/mm/ksm`. | Linux |
| lnstat | Exposes stats from `/proc/net/stat/`. | Linux |
| logind | Exposes session counts from `logind`. | Linux |
| meminfo_numa | Exposes memory statistics from `/sys/devices/system/node/node[0-9]*/meminfo`, `/sys/devices/system/node/node[0-9]*/numastat`. | Linux |
| mountstats | Exposes filesystem statistics from `/proc/self/mountstats`. Exposes detailed NFS client statistics. | Linux |
| network_route | Exposes the routing table as metrics | Linux |
| pcidevice | Exposes pci devices' information including their link status and parent devices. | Linux |
| perf | Exposes perf based metrics (Warning: Metrics are dependent on kernel configuration and settings). | Linux |
| processes | Exposes aggregate process statistics from `/proc`. | Linux |
| qdisc | Exposes queuing discipline statistics | Linux |
| slabinfo | Exposes slab statistics from `/proc/slabinfo`. Note that permission of `/proc/slabinfo` is usually 0400, so set it appropriately. | Linux |
| softirqs | Exposes detailed softirq statistics from `/proc/softirqs`. | Linux |
| sysctl | Expose sysctl values from `/proc/sys`. Use `--collector.sysctl.include(-info)` to configure. | Linux |
| swap | Expose swap information from `/proc/swaps`. | Linux |
| systemd | Exposes service and system status from `systemd`. | Linux |
| tcpstat | Exposes TCP connection status information from `/proc/net/tcp` and `/proc/net/tcp6`. (Warning: the current version has potential performance issues in high load situations.) | Linux |
| wifi | Exposes WiFi device and station statistics. | Linux |
| xfrm | Exposes statistics from `/proc/net/xfrm_stat` | Linux |
| zoneinfo | Exposes NUMA memory zone metrics. | Linux |
These collectors are deprecated and will be removed in the next major release.
| Name | Description | OS |
|---|---|---|
| ntp | Exposes local NTP daemon health to check time | any |
| runit | Exposes service status from `runit`. | any |
| supervisord | Exposes service status from `supervisord`. | any |

The `perf` collector may not work out of the box on some Linux systems due to kernel configuration and security settings. To allow access, set the following `sysctl` parameter:

    sysctl -w kernel.perf_event_paranoid=X

- 2 allow only user-space measurements (default since Linux 4.6).
- 1 allow both kernel and user measurements (default before Linux 4.6).
- 0 allow access to CPU-specific data but not raw tracepoint samples.
- -1 no restrictions.

Depending on the configured value, different metrics will be available; for most cases `0` will provide the most complete set. For more information see `man 2 perf_event_open`.
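
To check what the current setting permits before enabling the collector, the value can be read back from procfs. The following Python sketch maps a value to the descriptions listed above. The function names are hypothetical, and clamping values outside the listed range to the nearest listed level is an assumption of this sketch.

```python
# Meanings taken from the list of perf_event_paranoid values above.
PARANOID_LEVELS = {
    2: "user-space measurements only",
    1: "kernel and user measurements",
    0: "CPU-specific data, but not raw tracepoint samples",
    -1: "no restrictions",
}


def describe_paranoid(level):
    """Describe a kernel.perf_event_paranoid value."""
    # Assumption: out-of-range values are clamped to the nearest listed level.
    clamped = max(-1, min(2, level))
    return PARANOID_LEVELS[clamped]


def read_paranoid(path="/proc/sys/kernel/perf_event_paranoid"):
    """Read the current setting from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    level = read_paranoid()
    print(f"kernel.perf_event_paranoid={level}: {describe_paranoid(level)}")
```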
By default, the `perf` collector will only collect metrics of the CPUs that `node_exporter` is running on (i.e. `runtime.NumCPU`). If this is insufficient (e.g. if you run `node_exporter` with its CPU affinity set to specific CPUs), you can specify a list of alternate CPUs by using the `--collector.perf.cpus` flag. For example, to collect metrics on CPUs 2-6, you would specify: `--collector.perf --collector.perf.cpus=2-6`. The CPU configuration is zero indexed and can also take a stride value; e.g. `--collector.perf --collector.perf.cpus=1-10:5` would collect on CPUs 1, 5, and 10.
The `perf` collector is also able to collect tracepoint counts when using the `--collector.perf.tracepoint` flag. Tracepoints can be found using `perf list` or from debugfs. An example usage of this would be `--collector.perf.tracepoint="sched:sched_process_exec"`.
The `sysctl` collector can be enabled with `--collector.sysctl`. It supports exposing numeric sysctl values as metrics using the `--collector.sysctl.include` flag and string values as info metrics by using the `--collector.sysctl.include-info` flag. The flags can be repeated. For sysctl with multiple numeric values, an optional mapping can be given to expose each value as its own metric. Otherwise an `index` label is used to identify the different fields.
Using `--collector.sysctl.include=vm.user_reserve_kbytes`: `vm.user_reserve_kbytes = 131072` -> `node_sysctl_vm_user_reserve_kbytes 131072`
A sysctl can contain multiple values, for example:

    net.ipv4.tcp_rmem = 4096	131072	6291456

Using `--collector.sysctl.include=net.ipv4.tcp_rmem` the collector will expose:

    node_sysctl_net_ipv4_tcp_rmem{index="0"} 4096
    node_sysctl_net_ipv4_tcp_rmem{index="1"} 131072
    node_sysctl_net_ipv4_tcp_rmem{index="2"} 6291456

If the indexes have defined meaning like in this case, the values can be mapped to multiple metrics by appending the mapping to the `--collector.sysctl.include` flag:
Using `--collector.sysctl.include=net.ipv4.tcp_rmem:min,default,max` the collector will expose:

    node_sysctl_net_ipv4_tcp_rmem_min 4096
    node_sysctl_net_ipv4_tcp_rmem_default 131072
    node_sysctl_net_ipv4_tcp_rmem_max 6291456

String values need to be exposed as an info metric. The user selects them by using the `--collector.sysctl.include-info` flag.
`kernel.core_pattern = core` -> `node_sysctl_info{key="kernel.core_pattern_info", value="core"} 1`
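
The naming scheme for numeric values can be summarized in a few lines of Python. This is only a sketch that reproduces the numeric examples in this section; it is not the collector's actual implementation, and the function name `sysctl_metrics` is hypothetical.

```python
def sysctl_metrics(include, raw_value):
    """Render numeric sysctl values as metric lines.

    include:   the value given to --collector.sysctl.include, with an
               optional ":name1,name2,..." mapping suffix
    raw_value: the sysctl value as whitespace-separated numbers
    """
    name, _, mapping = include.partition(":")
    metric = "node_sysctl_" + name.replace(".", "_")
    values = raw_value.split()

    if mapping:
        # One metric per value, named after the mapping entries.
        suffixes = mapping.split(",")
        if len(suffixes) != len(values):
            raise ValueError("mapping does not match the number of values")
        return [f"{metric}_{s} {v}" for s, v in zip(suffixes, values)]
    if len(values) == 1:
        # A single value needs neither a suffix nor an index label.
        return [f"{metric} {values[0]}"]
    # Several values without a mapping are told apart by an index label.
    return [f'{metric}{{index="{i}"}} {v}' for i, v in enumerate(values)]


if __name__ == "__main__":
    for line in sysctl_metrics("net.ipv4.tcp_rmem:min,default,max",
                               "4096 131072 6291456"):
        print(line)
```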
Given the following sysctl:

    kernel.seccomp.actions_avail = kill_process kill_thread trap errno trace log allow

Setting `--collector.sysctl.include-info=kernel.seccomp.actions_avail` will yield:

    node_sysctl_info{key="kernel.seccomp.actions_avail", index="0", value="kill_process"} 1
    node_sysctl_info{key="kernel.seccomp.actions_avail", index="1", value="kill_thread"} 1
    ...

The `textfile` collector is similar to the Pushgateway, in that it allows exporting of statistics from batch jobs. It can also be used to export static metrics, such as what role a machine has. The Pushgateway should be used for service-level metrics. The `textfile` module is for metrics that are tied to a machine.
To use it, set the `--collector.textfile.directory` flag on the `node_exporter` command line. The collector will parse all files in that directory matching the glob `*.prom` using the text format. Note: Timestamps are not supported.
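
Because the collector may read the directory at any moment, a job should never leave a half-written `*.prom` file in place. One way to achieve this is to write to a temporary name that does not match the glob and then rename it. The following is a minimal Python sketch of that pattern; the function names are hypothetical and the directory is whatever was passed to `--collector.textfile.directory`.

```python
import os
import tempfile
import time


def write_prom_file(directory, filename, lines):
    """Atomically write `lines` to directory/filename (should end in .prom)."""
    # Create the temporary file inside the target directory so the final
    # rename stays on one filesystem. Its name gets a random suffix after
    # ".prom.", so it does not match the *.prom glob while being written.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=filename + ".")
    try:
        with os.fdopen(fd, "w") as f:
            f.write("\n".join(lines) + "\n")
        os.chmod(tmp_path, 0o644)  # mkstemp creates the file with mode 0600
        os.replace(tmp_path, os.path.join(directory, filename))
    except BaseException:
        os.unlink(tmp_path)
        raise


def record_completion(directory):
    """Publish the completion time of a batch job as a Unix timestamp."""
    write_prom_file(
        directory,
        "my_batch_job.prom",
        [f"my_batch_job_completion_time {int(time.time())}"],
    )
```

The `chmod` call matters when `node_exporter` runs as a different user than the job, since the file must be readable by the exporter.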
To atomically push completion time for a cron job:

    echo my_batch_job_completion_time $(date +%s) > /path/to/directory/my_batch_job.prom.$$
    mv /path/to/directory/my_batch_job.prom.$$ /path/to/directory/my_batch_job.prom

To statically set roles for a machine using labels:

    echo 'role{role="application_server"} 1' > /path/to/directory/role.prom.$$
    mv /path/to/directory/role.prom.$$ /path/to/directory/role.prom

The `node_exporter` will expose all metrics from enabled collectors by default. This is the recommended way to collect metrics to avoid errors when comparing metrics of different families.
For advanced use the `node_exporter` can be passed an optional list of collectors to filter metrics. The parameters `collect[]` and `exclude[]` can be used multiple times (but cannot be combined). In Prometheus configuration you can use this syntax under the scrape config.
Collect only `cpu` and `meminfo` collector metrics:

      params:
        collect[]:
          - cpu
          - meminfo

Collect all enabled collector metrics but exclude `netdev`:

      params:
        exclude[]:
          - netdev

This can be useful for having different Prometheus servers collect specific metrics from nodes.
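
The same parameters can be passed when querying the exporter by hand. As a sketch, the Python helper below builds such a URL. The host `localhost` and the helper name `metrics_url` are assumptions for illustration; port 9100 is the default mentioned earlier and `/metrics` is the default metrics path.

```python
from urllib.parse import urlencode


def metrics_url(base="http://localhost:9100/metrics", collect=(), exclude=()):
    """Build a metrics URL that filters by collector."""
    if collect and exclude:
        raise ValueError("collect[] and exclude[] cannot be combined")
    # Each parameter may be repeated, once per collector name.
    pairs = [("collect[]", name) for name in collect]
    pairs += [("exclude[]", name) for name in exclude]
    return f"{base}?{urlencode(pairs)}" if pairs else base


if __name__ == "__main__":
    print(metrics_url(collect=["cpu", "meminfo"]))
    print(metrics_url(exclude=["netdev"]))
```

`urlencode` percent-encodes the square brackets, which is equivalent to the literal `collect[]` form once the server decodes the query string.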
Prerequisites:
- Go compiler
- RHEL/CentOS: `glibc-static` package.
Building:

    git clone https://github.com/prometheus/node_exporter.git
    cd node_exporter
    make build
    ./node_exporter <flags>

To see all available configuration flags:

    ./node_exporter -h

To run the tests:

    make test

EXPERIMENTAL
The exporter supports TLS via a new web configuration file.

    ./node_exporter --web.config.file=web-config.yml

See the exporter-toolkit web-configuration for more details.