CheckCPU
CheckCPU is part of the CheckSystem module.
This check calculates the average CPU usage over a specified period of time. The data is always collected in the background; the buffer size and the sampling interval are configured with the CPUBufferSize and CheckResolution options. A request takes one or more of the options described in the table below.
| Option | Values | Description |
| warn | load in % | Load to go above to generate a warning. |
| crit | load in % | Load to go above to generate a critical state. |
| Time | time with optional suffix | The time to calculate the average over. Multiple time= entries can be given, generating multiple CPU usage summaries and multiple warn/crits. |
| nsclient | | Flag to make the plugin run in NSClient compatibility mode. |
| ShowAll | none, long | Add this option to show info even if no errors are detected. Set it to long to show detailed information. |
Time can use any of the following suffixes: w=week, d=day, h=hour, m=minute and s=second.
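As a rough illustration of the suffix handling described above, the following Python sketch converts a time value to seconds. The function name parse_time and the default_unit parameter are hypothetical. How the module treats a bare number such as time=4 is not stated on this page, so the default of seconds is only an assumption.

```python
import re

# Seconds per unit, using the suffixes listed above.
UNIT_SECONDS = {"w": 7 * 24 * 3600, "d": 24 * 3600, "h": 3600, "m": 60, "s": 1}


def parse_time(value, default_unit="s"):
    """Convert a value such as '20m' or '10s' to seconds.

    default_unit is an assumption used for bare numbers such as '4';
    the real module may interpret them differently.
    """
    match = re.fullmatch(r"(\d+)([wdhms]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit or default_unit]
```

With this sketch, parse_time("20m") gives 1200 and parse_time("10s") gives 10.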
Configuration
The size and frequency of the sampled CPU data can be configured; for details, refer to the configuration section of the CheckSystem module.
FAQ
- Q: "NSClient - ERROR: Could not get data for 60 perhaps we don't collect data this far back?"
- A: See the configuration section for how to set "CPUBufferSize"; it has to be larger than the collection time requested here.
- Q: How does it handle multi CPU machines?
- A: The returned value is the average value of the CPU load of all the processors.
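The two answers above can be pictured with a small Python sketch of a sample buffer. It is not the module's implementation: the class CpuHistory and its methods are hypothetical, and the only behaviour taken from this page is that samples are collected in the background, that the buffer must be larger than the requested time, and that the value is averaged over all processors.

```python
from collections import deque


class CpuHistory:
    """Hypothetical stand-in for the background CPU data collector."""

    def __init__(self, buffer_seconds, resolution_seconds):
        self.buffer_seconds = buffer_seconds
        self.resolution = resolution_seconds
        # Old samples fall out automatically once the buffer is full.
        self.samples = deque(maxlen=buffer_seconds // resolution_seconds)

    def add_sample(self, per_cpu_loads):
        # Store one value per tick: the mean load of all processors.
        self.samples.append(sum(per_cpu_loads) / len(per_cpu_loads))

    def average(self, window_seconds):
        # The buffer has to be larger than the requested time.
        if window_seconds >= self.buffer_seconds:
            raise ValueError(
                f"Could not get data for {window_seconds}: buffer too small"
            )
        needed = max(1, window_seconds // self.resolution)
        recent = list(self.samples)[-needed:]
        if not recent:
            raise ValueError("no samples collected yet")
        return sum(recent) / len(recent)
```

In this sketch, a request for a 60 second average against a 60 second buffer fails, which mirrors the error in the first question; enlarging the buffer resolves it.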
Examples
Sample Command
Check that the CPU load for various times is below 80%:
Sample Command:
CheckCPU warn=80 crit=90 time=20m time=10s time=4
OK: CPU Load ok.
Nagios Configuration:
define command {
command_name <<CheckCPU>>
command_line check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckCPU -a warn=$ARG1$ crit=$ARG2$ time=20m time=10s time=4
}
<<CheckCPU>> 80!90
From Commandline (with NRPE):
check_nrpe -H IP -p 5666 -c CheckCPU -a warn=80 crit=90 time=20m time=10s time=4
Multiple Time entry
Shows the use of multiple time entries and the data returned.
Sample Command:
CheckCPU warn=2 crit=80 time=5m time=1m time=10s
WARNING: WARNING: 5m: average load 8% > warning
Nagios Configuration:
define command {
command_name <<CheckCPU>>
command_line check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckCPU -a warn=2 crit=$ARG1$ time=5m time=1m time=10s
}
<<CheckCPU>> 80
From Commandline (with NRPE):
check_nrpe -H IP -p 5666 -c CheckCPU -a warn=2 crit=80 time=5m time=1m time=10s
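To make the threshold logic in this example concrete, here is a minimal Python sketch that compares a set of averages against warn and crit. The function evaluate is hypothetical and the message text is only loosely modelled on the sample output above; it is not the plugin's exact format. The one rule taken from the options table is that a load must go above a threshold to trigger it.

```python
def evaluate(averages, warn, crit):
    """Return (status, message) for a dict of {time label: load in %}.

    Illustrative only: the real plugin's message format may differ.
    """
    status = "OK"
    problems = []
    for label, load in averages.items():
        if load > crit:
            status = "CRITICAL"
            problems.append(f"{label}: average load {load}% > critical")
        elif load > warn:
            # A warning never downgrades an earlier critical result.
            if status == "OK":
                status = "WARNING"
            problems.append(f"{label}: average load {load}% > warning")
    if not problems:
        return status, "OK: CPU Load ok."
    return status, f"{status}: " + ", ".join(problems)
```

Called with averages of 8% for 5m, 1% for 1m and 0% for 10s and with warn=2 crit=80, this sketch reports a WARNING for the 5m entry only.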
check_load
Check CPU load over several intervals, similar to check_load on Linux/Unix (with example thresholds):
Sample Command:
CheckCPU warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15
OK: ...
Nagios Configuration:
define command {
command_name <<CheckCPU>>
command_line check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15
}
<<CheckCPU>>
From Commandline (with NRPE):
check_nrpe -H IP -p 5666 -c CheckCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15
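One way to read the argument list in this example is that each time= entry is paired with the most recent warn= and crit= values that precede it. That pairing rule is an assumption inferred from the examples on this page, not a documented guarantee, and the helper group_checks below is hypothetical. A short Python sketch of that reading:

```python
def group_checks(args):
    """Pair each time= entry with the latest warn= and crit= seen so far.

    Assumption: thresholds carry forward until they are overridden.
    """
    warn = crit = None
    checks = []
    for arg in args:
        key, _, value = arg.partition("=")
        key = key.lower()
        if key == "warn":
            warn = int(value)
        elif key == "crit":
            crit = int(value)
        elif key == "time":
            # Record the time value with the thresholds in effect.
            checks.append((value, warn, crit))
    return checks
```

Under this reading, the command above yields three checks, (1, 100, 100), (5, 95, 99) and (15, 90, 95), while a command with a single warn/crit pair applies the same thresholds to every time entry.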








