NSClient++ Help (#1) - Roundtrip response time from CheckCounter seems slower than other commands [SOLVED] (#583) - Message List
I'm trying to determine why NSClient++ takes about a second to run a CheckCounter command. When I call NSClient++ from check_nrpe, the command typically takes about 1.2 seconds to run. Here are some of the response times I'm getting.
time check_nrpe -n -H myhost -c CheckCounter -a "Counter:read=\\System\\File Read Bytes/sec"
OK all counters within bounds.|'read=200.124296;0;0;
real 0m1.077s
user 0m0.000s
sys 0m0.003s
time check_nrpe -n -H myhost -c CheckCounter -a "Counter:proc=\\LogicalDisk(D:)\\% Free Space"
OK all counters within bounds.|'proc'=96.959773;0;0;
real 0m1.088s
user 0m0.001s
sys 0m0.002s
Other commands seem to run much quicker, even with SSL enabled:
time check_nrpe -H myhost -c CheckCPU -a time=5m
OK CPU Load ok.|'5m'=35%;0;0;
real 0m0.349s
user 0m0.004s
sys 0m0.001s
time check_nrpe -H myhosts -u -t 60 -c CheckWMI -a "Query:name=SELECT Name FROM Win32_PerfRawData_PerfDisk_LogicalDisk"
Name=C:Name=D:Name=_Total|'name'=3;0;0;
real 0m0.238s
user 0m0.003s
sys 0m0.004s
Any help would be appreciated.
-
Message #1775
Some counters require a "second to pass" to get average values, and I am (simply) too lazy to treat them differently ...
So code is like so:
pdh.open();
if (bCheckAverages) {
    pdh.collect();
    Sleep(1000);
}
pdh.gatherData();
pdh.close();
Michael Medin
mickem 05/25/10 22:25:47 (3 years ago)
-
Message #1776
Might wanna clarify: there is an option to disable it... but you have to add it...
MAP_OPTIONS_BOOL_EX(_T("Averages"), bCheckAverages, _T("true"), _T("false"))
Defaults to true so: Averages=false should speed things up... but again... won't always work...
Michael Medin
mickem 05/25/10 22:27:47 (3 years ago)
-
Message #1778
OK, so "Defaults to true so: Averages=false should speed things up... but again... won't always work...". Is "Averages=false" something that needs to be set in nsc.ini?
swright@… 05/25/10 22:57:32 (3 years ago)
-
Message #1812
Can I ask what you meant by "won't always work" ?
We're finding we sometimes get this error when using Averages=false. On one Windows XP PC it's fine, but on one Windows 2003 PC it fails (both PCs are OK without Averages=false). Using version 0.3.7.
checkCounter "\Processor(_Total)\% Idle Time" MinCrit=-1 ShowAll Averages=false
d NSClient++.cpp(1073) Injecting: checkCounter: \Processor(_Total)\% Idle Time, MinCrit=-1, ShowAll, Averages=false
e \CheckSystem.cpp(1091) ERROR: \Processor(_Total)\% Idle Time: PdhGetFormattedCounterValue failed: -1073738810: The data is not valid.
(\Processor(_Total)\% Idle Time|\Processor(_Total)\% Idle Time)
d NSClient++.cpp(1109) Injected Result: WARNING 'ERROR: \Processor(_Total)\% Idle Time: PdhGetFormattedCounterValue failed: -1073738810: The data is not valid.
(\Processor(_Total)\% Idle Time|\Processor(_Total)\% Idle Time)'
lovedada 06/09/10 12:18:17 (3 years ago)
-
Message #1813
Hmm... sounds doubtful. The counter should be a rate, I think, which % Idle Time is not, right?
This is what I refer to:
"Obtaining the value of rate counters such as Page faults/sec requires that PdhCollectQueryData be called twice, with a specific time interval between the two calls, before calling PdhGetFormattedCounterValue. Call Sleep to implement the waiting period between the two calls to PdhCollectQueryData."
Michael Medin
mickem 06/09/10 17:34:33 (3 years ago)
-
Message #1814
Thank you for that and sorry about the excessive length of this post.
I got there myself after some confusion. One problem is that I can't find any documentation on which counters require the two calls with a sleep in the middle and which do not.
One comment I would make on NSClient++ is that although the windows API and indeed the PDHQuery class allow for multiple counters to be retrieved in one query, the CheckSystem::checkCounter() method always creates separate queries (with separate sleeps) even when retrieving multiple counters in one request.
Although the scope for retrieving multiple counters in one request is severely limited by the 1024 byte limit in any case.
There also seem to be platform-specific differences. I ran this C++ test program on Windows XP and Windows 2003, trying to collect a counter value with only one call to PdhCollectQueryData. I'm assuming a single call is invalid for this counter (even though it is not a rate): on 2003 the call failed, while on XP it succeeded but returned garbage (which is nasty).
$ cat DLTest.cpp
#include <iostream>
#include <pdh.h>
#using <mscorlib.dll>
#using <System.dll>
using namespace System;
using namespace std;

void main(int argc, char** argv)
{
    PDH_STATUS status;
    HQUERY hQuery_;
    if ((status = PdhOpenQuery(NULL, 0, &hQuery_)) != ERROR_SUCCESS)
        printf("PdhOpenQuery failed, status=%d\n", status);

    HCOUNTER hCounter_;
    if ((status = PdhAddCounter(hQuery_, "\\Processor(_Total)\\% Idle Time", 0, &hCounter_)) != ERROR_SUCCESS) {
    //if ((status = PdhAddCounter(hQuery_, "\\Memory\\Available Mbytes", 0, &hCounter_)) != ERROR_SUCCESS) {
        hCounter_ = NULL;
        printf("PdhAddCounter failed, status=%d\n", status);
    }
    if (hCounter_ == NULL)
        printf("Counter is null!\n");

    if ((status = PdhCollectQueryData(hQuery_)) != ERROR_SUCCESS)
        printf("PdhCollectQueryData failed: %d\n", status);
    Sleep(1000);
    if ((status = PdhCollectQueryData(hQuery_)) != ERROR_SUCCESS)
        printf("PdhCollectQueryData failed: %d\n", status);

    PDH_FMT_COUNTERVALUE data_;
    if ((status = PdhGetFormattedCounterValue(hCounter_, PDH_FMT_DOUBLE, NULL, &data_)) != ERROR_SUCCESS) {
        printf("PdhGetFormattedCounterValue failed, status=%d\n", status);
    }
    printf("Data is %f\n", data_.doubleValue);

    if ((status = PdhCloseQuery(hQuery_)) != ERROR_SUCCESS)
        printf("PdhCloseQuery failed, status=%d\n", status);
}
$
Result on Windows XP (correct value is close to 50%)
Two calls plus sleep as in NSClient with "Averages=true"
=============================================
$ ./DLTestAvg.exe
Data is 45.332229
$ ./DLTestAvg.exe
Data is 45.404375
$ ./DLTestAvg.exe
Data is 47.676999
$
One call, no sleep as in NSClient with "Averages=false"
=============================================
$ ./DLTestNoAvg.exe
Data is 0.000161
$ ./DLTestNoAvg.exe
Data is 0.000161
$ ./DLTestNoAvg.exe
Data is 0.000161
$
With two calls and a sleep, the data looks valid. With just the one call and no sleep, the call succeeds but the data is WRONG.
Result on Windows 2003 (correct value is close to 100%)
Two calls plus sleep as in NSClient with "Averages=true"
=============================================
C:\Documents and Settings\Administrator>DLTestAvg.exe
Data is 96.872520
C:\Documents and Settings\Administrator>DLTestAvg.exe
Data is 95.310060
C:\Documents and Settings\Administrator>DLTestAvg.exe
Data is 90.622680
C:\Documents and Settings\Administrator>
One call, no sleep as in NSClient with "Averages=false"
=============================================
C:\Documents and Settings\Administrator>DLTestNoAvg.exe
PdhGetFormattedCounterValue failed, status=-1073738810
Data is 0.000000
C:\Documents and Settings\Administrator>DLTestNoAvg.exe
PdhGetFormattedCounterValue failed, status=-1073738810
Data is 0.000000
C:\Documents and Settings\Administrator>
lovedada 06/09/10 18:49:16 (3 years ago)
-
Message #1815
Ok, well at least we know what is the cause then...
Michael Medin
mickem 06/09/10 19:22:07 (3 years ago)
-
Message #1824
Yes.
Ideally, using the NRPE listener I'd be able to run one checkCounter command passing a list of say 100 counters over the socket and have NSClient++ collect the first pass for all 100, then sleep for 1 second, then collect the second pass for all 100, then return all the data in one big response.
We're currently collecting 20 or so counters once every 5 minutes, sending them over one at a time and having to spend at least 20 seconds doing it.
lovedada 06/10/10 00:01:06 (3 years ago)
-
Message #1825
That is doable I guess...
Please add this "feature request" as a ticket. As for the length of "NRPE" you could use alias to get around it... but! (big but here) internally NSClient++ also has limitations.
Michael Medin
mickem 06/10/10 07:32:35 (3 years ago)
-
Message #1827
Please see 387 and 388
lovedada 06/10/10 14:35:32 (3 years ago)
-
Message #1828
Not sure how alias helps BTW - we seem to be using it already but the issue is the packet size I think ?
lovedada 06/10/10 14:36:18 (3 years ago)
-
Message #1829
Aliases (in nscp.ini) can be used to map short NRPE command names to longer commands, but that means you have to configure the counters in nsc.ini.
Michael Medin
mickem 06/10/10 18:20:31 (3 years ago)