check_proc: Check Process
This quick guide assumes you already have NRPE set up and working. If not, follow the getting started guide and the NRPE getting started guide before you start looking into this.
0. Introduction
To check for the existence of a process there is one built-in check we can use. There are also external scripts, but this guide focuses on the CheckProcState command.
The first thing to do is to figure out which process you want to check. I will use firefox in this guide but you can choose whatever you want. During testing and debugging of the command it is a good idea if you are able to stop and start the process on a whim so choose something noncritical (you can always change it after everything is working properly).
So first off we fire up the Task Manager (either [Ctrl]+[Shift]+[Esc], Windows+R -> "taskmgr", or whatever you fancy) and go to the Applications tab.
Then find the program you want to check, right-click it, and select "Go to Process" (I use a Swedish Windows in the screenshot, so don't be confused).
Then you can read the process name from the "Name" column (in this case firefox.exe).
Now we know the process name of the program we want to check, so we can continue on with actually checking it. But first it might be good to note that it is often easier to find things using the Sysinternals Process Explorer tool than with Task Manager; the upside of Task Manager is that it is included by default on most (all?) Windows versions.
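If you prefer the command line over Task Manager, the same process names can be read from the output of the built-in Windows tasklist command. The following is a minimal Python sketch and not part of NSClient++; the helper name process_names is made up for illustration, and it assumes the text comes from `tasklist /FO CSV /NH`, where the first column is the image name.

```python
import csv
import io


def process_names(tasklist_csv: str) -> list[str]:
    """Return the distinct image names found in `tasklist /FO CSV /NH` output."""
    names = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # The first CSV column is the image name, e.g. "firefox.exe".
        if row and row[0] not in names:
            names.append(row[0])
    return names


# Sample text in the shape tasklist prints; the values are invented.
sample = (
    '"System Idle Process","0","Services","0","24 K"\n'
    '"firefox.exe","4312","Console","1","210,532 K"\n'
    '"firefox.exe","5120","Console","1","98,120 K"\n'
)
print(process_names(sample))
```

On a Windows machine the input could come from subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout instead of the sample string.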
Now off we go with adding the check. As I said initially, we will use the CheckProcState command, so the next step is to figure out which module the command resides in. The simplest way to do this is to read the reference page for the command (CheckProcState). In this case it is CheckSystem, so we need to load that module for this to work.
Now we have (in theory) a working command and we know what we are looking for so the only thing left is to check it right? (almost)
1. Playing with NSClient++
Let's start off by adding the CheckSystem module to the modules section of NSClient++ like so:
[/modules]
CheckSystem=enabled
Then let's fire up NSClient++ in test mode like so:
nscp test
And watch it boot up. Important here is to make sure it actually loads the CheckSystem module.
D:\source\NSCP-stable\stage\Win32\binaries>nscp test
Launching test mode - client mode
d NSClient++.cpp(1106) Enabling debug mode...
d NSClient++.cpp(494) Attempting to start NSCLient++ - 0.3.7.82 2009-07-19
d NSClient++.cpp(897) Loading plugin: CheckSystem...
l NSClient++.cpp(600) NSCLient++ - 0.3.7.82 2009-07-19 Started!
d \PDHCollector.cpp(66) Autodetected w2k or later, using w2k PDH counters.
l NSClient++.cpp(402) Using settings from: INI-file
d \PDHCollector.cpp(103) Using index to retrive counternames
l NSClient++.cpp(403) Enter command to inject or exit to terminate...
d \PDHCollector.cpp(123) Found countername: CPU: \Processor(_total)\% processortid
d \PDHCollector.cpp(124) Found countername: UPTIME: \System\Tid sedan systemstart
d \PDHCollector.cpp(125) Found countername: MCL: \Minne\Dedikationsgrõns
d \PDHCollector.cpp(126) Found countername: MCB: \Minne\Dedicerade byte
And the key here is the following line:
d NSClient++.cpp(897) Loading plugin: CheckSystem...
This tells us that it attempted to load the CheckSystem module, and unless we see any errors (lines prefixed with e) we are good to go.
Now let's try the CheckProcState command by typing the following:
CheckProcState
CheckProcState
d NSClient++.cpp(1034) Injecting: CheckProcState:
d NSClient++.cpp(1070) Injected Result: WARNING 'ERROR: Missing argument exception.
d NSClient++.cpp(1071) Injected Performance Result: ''
WARNING:ERROR: Missing argument exception.
It told us we are missing an argument (not very helpful really; it could at least have told us which one, right?).
Anyway, off to the CheckProcState page to read up on the arguments.
| Option | Values | Description |
| process=state | started, stopped | A process name and a state the process should have. The state can be either started or stopped. If no state is given started is assumed. The name is the name of the executable. |
This means we should enter a process name and the state we want to find it in.
In our case (if you remember from the introduction) the process is called firefox.exe. The state is either stopped or started depending on what we want to check; we want to check that it is started, so our choice is started. Thus:
- process=firefox.exe
- state=started
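To make the process=state form concrete, here is a small Python sketch of how such an argument could be split into its two parts, including the default of started when no state is given. This only illustrates the rule from the option description; it is not how NSClient++ itself parses arguments, and the function name parse_process_arg is hypothetical.

```python
def parse_process_arg(arg: str) -> tuple[str, str]:
    """Split 'name=state' into (name, state); state defaults to 'started'."""
    name, sep, state = arg.partition("=")
    if not sep:
        # No '=' present: the option description says started is assumed.
        state = "started"
    if state not in ("started", "stopped"):
        raise ValueError(f"unknown state: {state!r}")
    return name, state


print(parse_process_arg("firefox.exe=started"))
print(parse_process_arg("firefox.exe"))
```

Both calls describe the same check: firefox.exe is expected to be started.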
So let's try entering the following:
CheckProcState firefox.exe=started
d NSClient++.cpp(1034) Injecting: CheckProcState: firefox.exe=started
d \CheckSystem.cpp(809) PROC>>> enumerate_processes
d \CheckSystem.cpp(809) PROC>>> enable_token_privilege
d \CheckSystem.cpp(812) PROC<<<enable_token_privilege
d \CheckSystem.cpp(809) PROC>>> FEnumProcesses
d \CheckSystem.cpp(812) PROC<<<FEnumProcesses
d \CheckSystem.cpp(809) PROC>>> describe_pid
d \CheckSystem.cpp(812) PROC<<<describe_pid
...
d \CheckSystem.cpp(812) PROC<<<enumerate_processes
d \checkHelpers.hpp(683) Missing bounds for check: firefox.exe
d \checkHelpers.hpp(683) Missing bounds for check: firefox.exe
d NSClient++.cpp(1070) Injected Result: OK 'OK: All processes are running.'
d NSClient++.cpp(1071) Injected Performance Result: ''
OK:OK: All processes are running.
Whoa!
What happened here?
Well, the "good" thing about test mode is the debug output (and sometimes, as here, it is also the bad thing). No matter; the end is what we need, so let's just ignore the rest:
d NSClient++.cpp(1070) Injected Result: OK 'OK: All processes are running.'
d NSClient++.cpp(1071) Injected Performance Result: ''
OK:OK: All processes are running.
Seems the check works?
Just to verify it, let's try a process which is not currently running. You can of course go ahead and kill firefox, but in my case that would not be a good idea as I am using it to type this, so I will use another process instead.
CheckProcState foo_bar_is_not_running=started
This time we get:
...
d NSClient++.cpp(1070) Injected Result: CRITICAL 'CRITICAL: foo_bar_is_not_running: stopped (critical)'
d NSClient++.cpp(1071) Injected Performance Result: ''
CRITICAL:CRITICAL: foo_bar_is_not_running: stopped (critical)
Sounds good. In this case we get a critical return.
2. Configuring NSClient++
The main thing in NSClient++ is to load the CheckSystem module, but we also need the NRPE module (NRPEServer in the configuration here; older versions call it NRPEListener) to be able to call this (via check_nrpe) from Nagios. In addition, the FileLogger is a nice way to see if things go wrong.
[/modules]
CheckSystem=1
NRPEServer=1
Then we have two options:
- Use CheckExternalScripts alias function to define the command locally
- Enable allow arguments for the NRPE module.
I will in this example use the allow arguments option since it is simpler. To enable it, find the NRPE server section and set allow arguments to 1 like so:
[/settings/NRPE/server]
allow arguments=1
; here you will probably have more settings
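Before moving on it can be handy to double-check that the settings from this section really ended up in the configuration file. The sketch below uses Python's standard configparser on a sample string that mirrors the settings shown in this guide; the function name missing_settings is hypothetical, and treating only "1" and "enabled" as switched on is an assumption based on the two spellings used in this guide. A real nsclient.ini may use constructs that configparser does not accept as-is.

```python
import configparser

# Example settings mirroring this guide; a real nsclient.ini will contain more.
SAMPLE_INI = """
[/modules]
CheckSystem=1
NRPEServer=1

[/settings/NRPE/server]
allow arguments=1
"""


def missing_settings(ini_text: str) -> list[str]:
    """List the settings from this guide that are absent or not enabled."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case, e.g. "CheckSystem"
    parser.read_string(ini_text)
    wanted = [
        ("/modules", "CheckSystem"),
        ("/modules", "NRPEServer"),
        ("/settings/NRPE/server", "allow arguments"),
    ]
    # Assumption: only these two spellings (both used in this guide) mean "on".
    enabled = ("1", "enabled")
    return [
        f"[{section}] {key}"
        for section, key in wanted
        if parser.get(section, key, fallback="0").strip().lower() not in enabled
    ]


print(missing_settings(SAMPLE_INI))
```

An empty list means all three settings were found; anything else names the section and key that still needs attention.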
Now we are off to testing the command from the Nagios server.
3. Testing the command
First off before we head over to the Nagios box make sure you have NSClient++ running in /test mode (as before).
Now from the Nagios server find your check_nrpe location and start it to make sure NSClient++ communication is "working" like so:
mickem@gotrek:~/nsc$ /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.104
I (0.3.7.82 2009-07-19) seem to be doing fine...
Now we repeat the same things we did before but from Nagios so first off is the "missing arguments" command:
mickem@gotrek:~/nsc$ /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.104 -c CheckProcState
ERROR: Missing argument exception.
And no big surprise, we got "ERROR: Missing argument exception." just as we expected. To check the actual returned status you can (if you are using bash) run the following command immediately after check_nrpe exits, before running anything else.
mickem@gotrek:~/nsc$ echo $?
3
In this case 3 means "unknown":
| 0 | OK |
| 1 | WARNING |
| 2 | CRITICAL |
| 3 | UNKNOWN |
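The same mapping can be applied in a script instead of reading echo $? by hand. This is a minimal Python sketch; the names status_name and run_check are made up for illustration, and treating codes outside the table as UNKNOWN is an assumption of the sketch.

```python
import subprocess

# Plugin exit codes as listed in the table above.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def status_name(exit_code: int) -> str:
    """Translate a plugin exit code into its Nagios status name."""
    # Assumption: any code not in the table is reported as UNKNOWN.
    return STATUS_NAMES.get(exit_code, "UNKNOWN")


def run_check(argv: list[str]) -> tuple[str, str]:
    """Run a plugin command line and return (status name, first output line)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return status_name(result.returncode), lines[0] if lines else ""


print(status_name(3))
```

With the paths and address used in this guide, a call would look like run_check(["/usr/lib/nagios/plugins/check_nrpe", "-H", "192.168.0.104", "-c", "CheckProcState"]).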
Next up is the "proper command":
mickem@gotrek:~/nsc$ /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.104 -c CheckProcState -a firefox.exe=started
OK: All processes are running.
mickem@gotrek:~/nsc$ echo $?
0
Hmm, almost there?
Let's add another quite nice option: ShowAll
| Option | Values | Description |
| ShowAll | | A flag to toggle if all process states should be listed. |
What this does is list all process states even if they are ok (the default is to only list "broken" states).
mickem@gotrek:~/nsc$ /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.104 -c CheckProcState -a ShowAll firefox.exe=started
OK: firefox.exe: 1
The 1 here is the number of processes running.
So now that we are done playing around, let's head over to Nagios and add the configuration there to get it all up and running.
4. Configuring Nagios
On the Nagios side we will start with a service definition for the host (assumed to be already defined). If you have followed the guide you should have something along the following lines:
define service{
use generic-service
host_name windowshost
service_description CPU Load
check_command check_nrpe!alias_cpu
}
So let's start off by copying that with some slight modifications.
define service{
use generic-service
host_name windowshost
service_description CPU Firefox
check_command check_nrpe_proc!firefox.exe=started
}
The modifications were on the following two lines:
service_description CPU Firefox
check_command check_nrpe_proc!firefox.exe=started
Here service_description is used as a name, alias, or description in Nagios, and check_command is the command to run. Since we use Nagios-based command definitions (the other option would be to define them all inside NSClient++), we need to add a new command for each check we want to run (or end up with a lot of duplicate configuration). Exactly where to draw the line between reuse and duplication is hard to say in a generic context, so you will have to decide this as you go along. Another option in this case would have been a dedicated check_nrpe_proc_firefox command instead of a generic check_nrpe_proc.
Anyway, off to define the actual command. Again we copy-paste, but this time from your commands.cfg file.
define command{
command_name check_nrpe_proc
command_line $USER1$/check_nrpe -H $HOSTNAME$ -c CheckProcState -a ShowAll $ARG1$
}
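To see how the service and command definitions fit together, the following Python sketch mimics the substitution Nagios performs: the text after each "!" in check_command becomes $ARG1$, $ARG2$ and so on in command_line. It is a simplified illustration and not Nagios code; the function name expand_check_command is hypothetical, and the $USER1$ value is an assumption taken from the plugin path used earlier in this guide.

```python
def expand_check_command(check_command: str, command_line: str,
                         macros: dict[str, str]) -> str:
    """Substitute $ARGn$ and the given macros into a command_line template."""
    # "check_nrpe_proc!firefox.exe=started" -> command name, then ARG1, ARG2, ...
    _name, *args = check_command.split("!")
    values = dict(macros)
    for index, value in enumerate(args, start=1):
        values[f"ARG{index}"] = value
    for key, value in values.items():
        command_line = command_line.replace(f"${key}$", value)
    return command_line


template = "$USER1$/check_nrpe -H $HOSTNAME$ -c CheckProcState -a ShowAll $ARG1$"
print(expand_check_command(
    "check_nrpe_proc!firefox.exe=started",
    template,
    # Assumed values: plugin directory and host name as used in this guide.
    {"USER1": "/usr/lib/nagios/plugins", "HOSTNAME": "windowshost"},
))
```

The printed line has the same shape as the check_nrpe command that was run by hand in the testing section, which is a quick way to confirm that the definitions produce the intended call.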
And that's all we need to do.
5. Done?
That was not so hard now, was it?
This was of course just a very basic guide and there are a lot more things to do with even such a simple check. But this will be all for now!