NSClient++ Help (#1) - CheckExternalScripts > Invalid return code: 128 (#849)
Hi, I am really confused by this issue.
I am using NSCA and passive checks, but I randomly get wrong results from an external batch file:
2011-08-26 10:32:19: debug:NSClient++.cpp:1144: Injecting: check_ibm_raid:
2011-08-26 10:32:21: error:modules\CheckExternalScripts\CheckExternalScripts.cpp:214: The command (scripts\check_ibm_array.bat) returned an invalid return code: 128
2011-08-26 10:32:21: debug:NSClient++.cpp:1180: Injected Result: WARNING 'OK - IBM ServeRAID 8k-l Okay/Not Installed (LD1[1/Okay(0,0-OK 0,1-OK)])'
2011-08-26 10:32:21: debug:NSClient++.cpp:1181: Injected Performance Result: ''
The same situation for VBScript:
2011-08-26 10:10:36: debug:NSClient++.cpp:1144: Injecting: check_hp_smartarray:
2011-08-26 10:10:37: error:modules\CheckExternalScripts\CheckExternalScripts.cpp:214: The command (cscript.exe) returned an invalid return code: 128
2011-08-26 10:10:37: debug:NSClient++.cpp:1180: Injected Result: WARNING 'OK - Smart Array E200i in Slot 0 OK/OK/OK (LD 1: OK [(1I:1:1 OK) (1I:1:2 OK)], LD 2: OK [(1I:1:4 OK) (2I:1:5 OK) (2I:1:6 OK) (2I:1:7 OK)])'
2011-08-26 10:10:37: debug:NSClient++.cpp:1181: Injected Performance Result: ''
Nagios reports:
Service status: UNKNOWN
Service info: OK - Smart Array E200i...
So the script output is correct, but the exit code is wrong.
- Only bat/vbs commands are affected. Built-in commands like CheckDriveSpace, CheckCPU and CheckMem work fine. (I have never tried PowerShell scripts.)
- "Allow service to interact with desktop" is enabled for the NSClient++ service
- NSClient++ version 0.3.9.322 (upgraded from 0.3.6). Behaviour is the same for both versions
- Platform: Windows Server 2003 SP2 / Windows Server 2003 x64 SP2
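The "invalid return code: 128" error in the logs above can be pictured with a small sketch. Nagios plugins are expected to exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function names below are hypothetical and are not taken from the NSClient++ source; the assumption is simply that any exit code outside 0..3 is rejected and ends up as UNKNOWN, which would match what Nagios shows here.

```cpp
#include <string>

// Standard Nagios plugin exit codes.
const int kOk = 0;
const int kWarning = 1;
const int kCritical = 2;
const int kUnknown = 3;

// Hypothetical helper: true only for the four exit codes Nagios defines.
bool is_nagios_return_code(int code) {
    return code >= kOk && code <= kUnknown;
}

// Hypothetical helper: map a raw process exit code to a state name.
// Anything outside 0..3 (such as the 128 seen in the log) becomes UNKNOWN.
std::string state_name(int code) {
    switch (code) {
        case kOk:       return "OK";
        case kWarning:  return "WARNING";
        case kCritical: return "CRITICAL";
        default:        return "UNKNOWN";
    }
}
```

With these helpers, an exit code of 0 maps to OK, while 128 fails the validity check and is reported as UNKNOWN even though the script's text output says OK.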
My check interval is set to 5 minutes, and on average 4 results per hour are wrong. With 12 checks per hour, that means about 33% of the results are wrong. :-O
Any idea how to prevent this? Can I use another method to run external scripts and send the results to the NSCA daemon?
Many Thanks
-
Message #2287
Sounds like the script is not ending "correctly". It could be timeouts, or it could be errors in the script...
Michael Medin
mickem 08/29/11 08:41:45 (22 months ago)
Message #2289
But the scripts are OK..
As I mentioned above, the scripts' output is OK; only the exit code is wrong.
I tried to debug my scripts: I wrote two timestamps (start, end) and the exit code to an external file just before exiting with Wscript.Quit(exitcode).
And the result? Execution time is less than 2 seconds, and the exit code is always 0. Yet the event log in Nagios shows numerous UNKNOWN/OK state changes.
Can I change the debug level to see more details? Or how else can I get more data about this behaviour?
kiklop 08/29/11 15:09:15 (22 months ago)
Message #2290
Hmm, strange... Could you provide the script (or a script) that does this, as well as the "frequency" and environment information?
Michael Medin
mickem 08/29/11 15:31:03 (22 months ago)
Message #2294
Sure, the script is here: http://pc24.sk/files/nsc/check_hp_smartarray.vb_
External command in NSC.INI:
check_hp_smartarray =cscript.exe //T:30 //NoLogo scripts\check_hp_smartarray.vbs
The NSCA check interval is set to 300s.
Here is a piece of the event log from Centreon for one of the affected hosts:
2011/08/30 08:13:04 INFUAKEVWS00001 HP_Smart_Array OK HARD 4 OK - Smart Array P410i in Slot 0
2011/08/30 08:03:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN HARD 4 OK - Smart Array P410i in Slot 0
2011/08/30 07:58:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 3 OK - Smart Array P410i in Slot 0
2011/08/30 07:53:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 2 OK - Smart Array P410i in Slot 0
2011/08/30 07:48:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 1 OK - Smart Array P410i in Slot 0
2011/08/30 07:43:04 INFUAKEVWS00001 HP_Smart_Array OK SOFT 3 OK - Smart Array P410i in Slot 0
2011/08/30 07:38:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 2 OK - Smart Array P410i in Slot 0
2011/08/30 07:33:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 1 OK - Smart Array P410i in Slot 0
- The rest of the monitored services work fine (no flapping)
Environment:
- Windows server 2003 x64 SP2 EN
- NSClient++ x64 0.3.9.322
kiklop 08/30/11 09:05:10 (22 months ago)
Message #2308
Michael,
Can you help me?
I would like to modify the CheckExternalScripts source code, but I don't know how to compile it.
I would like to add a retry loop for when a wrong exit code is returned:
CheckExternalScripts.cpp, from line 209:
    if (isAlias) {
        return NSCModuleHelper::InjectSplitAndCommand(cd.command, args, ' ', message, perf, true);
    } else {
        // Run the command up to 3 times; stop as soon as the exit code is a valid Nagios code.
        for (int attempt = 1; attempt <= 3; attempt++) {
            int result = process::executeProcess(root_, cd.command + _T(" ") + args, message, perf, timeout);
            if (NSCHelper::isNagiosReturnCode(result)) {
                return NSCHelper::int2nagios(result);
            }
            NSC_LOG_ERROR_STD(_T("The command (") + cd.command + _T(") returned an invalid return code: ") + strEx::itos(result) + _T(" Try no.: ") + strEx::itos(attempt));
        }
        // All attempts returned an invalid exit code.
        return NSCAPI::returnUNKNOWN;
    }
I am not sure if this code is OK, but I hope you will understand what I want :)
kiklop 09/08/11 12:03:07 (22 months ago)
Message #2311
Hi, we are facing the same behaviour. This problem only affects our 64-bit servers (Win2003 & R2, Win2008 & R2). At first I suspected the scripts might be wrong, but that is obviously not the problem. Probably a Windows update caused this, because the behaviour appeared on different servers at about the same time. If I find some time, I will try to investigate which updates were installed right before these UNKNOWN states started.
thanks
jauer 09/16/11 10:25:43 (21 months ago)
Message #2313
It seems to me that this is related to the script failing. I think the best approach would be to run the script from the command line and see what it returns...
Michael Medin
mickem 09/20/11 12:31:02 (21 months ago)
Message #2319
I already did that.
All the scripts run as they should.
So the script's return code is being read or passed on incorrectly by nsc++. I really think this is caused by some Windows updates. I don't know whether, or which, Windows components are used by nsc++, so I'm not sure.
jauer 09/26/11 09:48:56 (21 months ago)
Message #2321
I'm not sure, but could it be caused by the C++ redistributable package? Microsoft has released many packages and fixes..
kiklop 09/29/11 13:14:51 (21 months ago)
Message #2322
Hmm... do you know which version causes it? Since many have this, I need to figure out what it is...
Also, is there any 64->32 thing going on (i.e., is a 32-bit program being launched from 64-bit NSClient++)?
Michael Medin
mickem 09/30/11 12:22:19 (21 months ago)
Message #2326
Hi Michael,
Yes, I have x64 NSC++, and the external applications are 32-bit (for example the HP ACU CLI utility, hpacucli.exe). I double-checked, and all x86 hosts work fine.
For me, the best "quick" solution is to add a condition to the source code: when the exit code is 128, run the check again (max attempts should be 3). If it still fails after 3 attempts, the result will be CRITICAL (exit code 2).
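The retry idea described here can be sketched independently of NSClient++. This is a minimal illustration, not the project's code: `run_with_retry` and `is_valid_code` are hypothetical names, the check is passed in as a callable returning the process exit code, and the fallback is assumed to be CRITICAL (Nagios exit code 2) as proposed in the post; UNKNOWN (3) would be the other obvious choice.

```cpp
#include <cassert>
#include <functional>

// Nagios exit code used as the assumed fallback when every attempt fails.
const int kCritical = 2;

// Hypothetical helper: valid Nagios plugin exit codes are 0..3.
bool is_valid_code(int code) { return code >= 0 && code <= 3; }

// Hypothetical helper: run `check` up to `max_attempts` times and return the
// first valid exit code. If every attempt yields an invalid code (e.g. 128),
// return `fallback` instead.
int run_with_retry(const std::function<int()>& check,
                   int max_attempts = 3,
                   int fallback = kCritical) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        int code = check();
        if (is_valid_code(code)) {
            return code;  // trust the first well-formed result
        }
    }
    return fallback;  // all attempts returned an invalid exit code
}
```

Note that a retry only hides the symptom: the script is executed again, so the total check time can grow to `max_attempts` times the script's runtime, and the cause of the 128 exit code remains unexplained.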
kiklop 10/05/11 09:02:15 (21 months ago)
Message #2323
It shouldn't, since NSClient++ is statically linked...
Michael Medin
mickem 09/30/11 12:22:51 (21 months ago)
Message #2328
What version do you mean? We have multiple versions of nsc++ running, and the behaviour is the same. I am trying to find out which Windows updates may have caused this behaviour, and I will post them once I know.
On some of our servers older 32-bit nsc++ versions are installed, and some have 64-bit versions installed. But all servers run the same VBS scripts, so the only program being launched by nsc++ is cscript.exe. This is a 32-bit program, but it doesn't seem to be the reason for this problem.
jauer 10/06/11 16:50:42 (21 months ago)
Message #2329
OK, I have probably found a "temporary" solution.
Because installing the x86 version on an x64 system fails, I installed the x64 version, stopped the service, and then replaced all files with the x86 version (copied from an x86 server). My flapping service (UNKNOWN/OK) has now been stable for more than 10 hours.
kiklop 10/07/11 08:50:26 (21 months ago)
Message #2330
Thanks, kiklop, for your workaround; I'll try it on our systems.
Meanwhile I found out when this UNKNOWN issue started: it was 4/8/2011 when those UNKNOWN states first appeared in the logs.
Next I figured out which Windows patches concerned x64 server systems at that time: KB2393802, KB2479943, KB2483614, KB2481109, KB971029
Maybe one of the developers will find something useful.
jauer 10/07/11 10:37:19 (21 months ago)








