CheckExternalScripts > Invalid return code: 128

Hi, I am really confused by this issue.

I am using NSCA and passive checks, but I randomly get wrong results from an external batch file:

2011-08-26 10:32:19: debug:NSClient++.cpp:1144: Injecting: check_ibm_raid:
2011-08-26 10:32:21: error:modules\CheckExternalScripts\CheckExternalScripts.cpp:214: The command (scripts\check_ibm_array.bat) returned an invalid return code: 128
2011-08-26 10:32:21: debug:NSClient++.cpp:1180: Injected Result: WARNING 'OK - IBM ServeRAID 8k-l Okay/Not Installed (LD1[1/Okay(0,0-OK 0,1-OK)])'
2011-08-26 10:32:21: debug:NSClient++.cpp:1181: Injected Performance Result: ''

The same situation for VBScript:

2011-08-26 10:10:36: debug:NSClient++.cpp:1144: Injecting: check_hp_smartarray:
2011-08-26 10:10:37: error:modules\CheckExternalScripts\CheckExternalScripts.cpp:214: The command (cscript.exe) returned an invalid return code: 128
2011-08-26 10:10:37: debug:NSClient++.cpp:1180: Injected Result: WARNING 'OK - Smart Array E200i in Slot 0 OK/OK/OK (LD 1: OK [(1I:1:1 OK) (1I:1:2 OK)], LD 2: OK [(1I:1:4 OK) (2I:1:5 OK) (2I:1:6 OK) (2I:1:7 OK)])'
2011-08-26 10:10:37: debug:NSClient++.cpp:1181: Injected Performance Result: ''

Nagios reports: Service status: UNKNOWN Service info : OK - Smart Array E200i...

So, the script output is correct but the exit code is wrong.
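For context, here is a minimal sketch of why a value such as 128 is rejected. It assumes only the standard Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The names `isValidNagiosCode` and `stateName` are hypothetical and are not NSClient++ functions, and mapping every invalid code to UNKNOWN is an assumption made for illustration.

```cpp
#include <string>

// Standard Nagios plugin exit codes.
enum NagiosState { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

// Hypothetical helper: true only for the four codes Nagios defines.
bool isValidNagiosCode(int code) {
    return code >= STATE_OK && code <= STATE_UNKNOWN;
}

// Hypothetical helper: readable name for an exit code.
// Assumption: anything outside 0..3 is reported as UNKNOWN.
std::string stateName(int code) {
    switch (code) {
        case STATE_OK:       return "OK";
        case STATE_WARNING:  return "WARNING";
        case STATE_CRITICAL: return "CRITICAL";
        default:             return "UNKNOWN";
    }
}
```

With these definitions, `isValidNagiosCode(128)` is false, so the status no longer matches the "OK - ..." text that the script printed.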

  • Only bat/vbs commands are affected. Built-in commands like CheckDriveSpace, CheckCPU and CheckMem work fine. (I never tried PS scripts.)
  • "Allow service to interact with desktop" is enabled for the NSClient++ service
  • NSClient++ version 0.3.9.322 (upgraded from 0.3.6); the behaviour is the same for both versions
  • Platform: Windows Server 2003 SP2 / Windows Server 2003 x64 SP2

My check interval is set to 5 min (12 checks per hour), and on average 4 results per hour are wrong. So 33% of results are wrong. :-O

Any idea how to prevent this? Can I use another method to run external scripts and send the results to the NSCA daemon?

Many Thanks

  • Message #2287

    Sounds like the script is not ending "correctly". It could perhaps be timeouts, or it could be errors in the script...

    Michael Medin

    • Message #2289

      But the scripts are OK.

      As I mentioned above, the script output is OK. Only the exit code is wrong.

      I tried to debug my scripts: I wrote two timestamps (start, end) and the exit code to an external file just before exiting with Wscript.Quit(exitcode).

      The result? Execution time is under 2 seconds and the exit code is always 0. The event log in Nagios still reports a number of UNKNOWN/OK state changes.

      Can I raise the debug level to see more details? Or how else can I get more data about this behaviour?

      • Message #2290

        Hmm, strange... Could you provide the script (or any script) that does this, as well as the "frequency" and environment information?

        Michael Medin

        • Message #2294

          Sure, the script is here: http://pc24.sk/files/nsc/check_hp_smartarray.vb_

          External command in NSC.INI:

          check_hp_smartarray	=cscript.exe //T:30 //NoLogo scripts\check_hp_smartarray.vbs
          

          The NSCA check interval is set to 300s.

          Here is a piece of the event log from Centreon for one of the affected hosts:

          2011/08/30 08:13:04 INFUAKEVWS00001 HP_Smart_Array OK HARD 4 OK - Smart Array P410i in Slot 0
          2011/08/30 08:03:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN HARD 4 OK - Smart Array P410i in Slot 0
          2011/08/30 07:58:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 3 OK - Smart Array P410i in Slot 0
          2011/08/30 07:53:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 2 OK - Smart Array P410i in Slot 0
          2011/08/30 07:48:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 1 OK - Smart Array P410i in Slot 0
          2011/08/30 07:43:04 INFUAKEVWS00001 HP_Smart_Array OK SOFT 3 OK - Smart Array P410i in Slot 0
          2011/08/30 07:38:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 2 OK - Smart Array P410i in Slot 0
          2011/08/30 07:33:04 INFUAKEVWS00001 HP_Smart_Array UNKNOWN SOFT 1 OK - Smart Array P410i in Slot 0
          
          • The rest of the monitored services work fine (no flapping)

          Environment:

          • Windows server 2003 x64 SP2 EN
          • NSClient++ x64 0.3.9.322
          • Message #2308

            Michael,

            Can you help me?

            I would like to modify the CheckExternalScripts source code, but I don't know how to compile it.

            I would like to add a retry loop for when an invalid exit code is returned:

            CheckExternalScripts.cpp, from line 209:

            if (isAlias) {
            		return NSCModuleHelper::InjectSplitAndCommand(cd.command, args, ' ', message, perf, true);
            	} else {
            		// Run the command up to 3 times; retry only when the return code is invalid.
            		// Assumes executeProcess overwrites message and perf on each call.
            		const int maxTries = 3;
            		for (int attempt = 1; attempt <= maxTries; attempt++) {
            			int result = process::executeProcess(root_, cd.command + _T(" ") + args, message, perf, timeout);
            			if (NSCHelper::isNagiosReturnCode(result)) {
            				return NSCHelper::int2nagios(result);
            			}
            			NSC_LOG_ERROR_STD(_T("The command (") + cd.command + _T(") returned an invalid return code: ") + strEx::itos(result) + _T(" Try no.: ") + strEx::itos(attempt));
            		}
            		// Every attempt returned an invalid code.
            		return NSCAPI::returnUNKNOWN;
            	}
            

            I am not sure whether this code is correct, but I hope you understand what I want :)
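The retry idea can also be tried outside of NSClient++. The following is a minimal sketch and not the module's actual code: `Runner` and `runWithRetry` are hypothetical names, and a `std::function` stands in for the call that launches the external script and returns its exit code.

```cpp
#include <functional>

// Hypothetical stand-in for launching the external script; returns its exit code.
using Runner = std::function<int()>;

const int kNagiosUnknown = 3;  // Nagios UNKNOWN

// Calls run() up to maxTries times and returns the first exit code in 0..3.
// If every attempt yields an invalid code, falls back to UNKNOWN.
int runWithRetry(const Runner& run, int maxTries) {
    for (int attempt = 1; attempt <= maxTries; ++attempt) {
        const int code = run();
        if (code >= 0 && code <= 3) {
            return code;
        }
    }
    return kNagiosUnknown;
}
```

Note that a retry runs the script again each time, so it hides the invalid code rather than explaining where it comes from.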

  • Message #2311

    Hi, we are facing the same behaviour. This problem only affects our 64-bit servers (Win2003 & R2, Win2008 & R2). At first I suspected the scripts might be wrong, but that is obviously not the problem. Probably a Windows update caused this, because the behaviour appeared on different servers at about the same time. If I find some time, I will try to investigate which updates were installed right before these UNKNOWN states started.

    thanks

    • Message #2313

      It seems to me that this is related to the script failing. I think the best approach would be to run the script on the command line and see what it returns...

      Michael Medin

      • Message #2319

        I already did that.

        All the scripts run as they should.

        So the script's return code is being read or passed on incorrectly by nsc++. I really think this is because of some Windows updates. I don't know whether nsc++ uses any Windows components, or which ones, so I'm not sure.

        • Message #2321

          I'm not sure, but could it be caused by a C++ redistributable package? Microsoft has released many packages and fixes.

        • Message #2322

          Hmm... do you know which version causes it? Since many people have this, I need to figure out what it is...

          Also, is there any 64->32 thing going on? (i.e., is it a 32-bit program being launched from 64-bit nsclient++?)

          Michael Medin

          • Message #2326

            Hi Michael,

            Yes, I have x64 NSC++ and the external applications are 32-bit (for example the HP ACU CLI utility, hpacucli.exe). I double-checked, and all x86 hosts work fine.

            For me, the best "quick" solution is to add a condition to the source code: when the exit code is 128, run the check again (at most 3 attempts). If it still fails after 3 tries, the exit code will be 2 (critical).

          • Message #2323

            It shouldn't, since nsclient++ is statically linked...

            Michael Medin

            • Message #2325

              We also see the same behaviour on 64-bit Windows servers. The scripts are definitely fine. I changed one script to return just the exit code for OK, and even then I get the UNKNOWN status.

          • Message #2328

            What version do you mean? We have multiple versions of nsc++ running, and the behaviour is the same. I am trying to find out which Windows updates may have caused this behaviour, and I will post them as soon as I know.
            On some of our servers older 32-bit nsc++ versions are installed, and some have 64-bit versions. But all servers run the same VBS scripts, so the only program being launched by nsc++ is cscript.exe. This is a 32-bit program, but it doesn't seem to be the reason for this problem.

            • Message #2329

              OK, I have probably found a "temporary" solution.

              Because installing the x86 version on an x64 system fails, I installed x64, stopped the service, and then replaced all files with the x86 version (copied from an x86 server). My flapping service (UNKNOWN, OK) has now been stable for more than 10 hours.

              • Message #2330

                Thanks, kiklop, for your workaround. I'll try it on our systems.
                Meanwhile I found out when this UNKNOWN issue started: the first UNKNOWN states were logged on 4/8/2011.
                Next I figured out which Windows patches applied to x64 server systems at that time: KB2393802 KB2479943 KB2483614 KB2481109 KB971029

                Maybe one of the developers will find something useful here.
