Opened 2 years ago
Last modified 6 months ago
#464 new defect
NSClient++ 0.3.8 and 0.3.9rc3 crash on Exchange server.
| Reported by: | mariog | Owned by: | mickem |
|---|---|---|---|
| Priority: | 1 | Milestone: | 0.4.2 |
| Component: | CheckSystem | Version: | 0.3.9 |
| Severity: | Bugs | Keywords: | |
| Cc: | | | |
Description (last modified by mickem)
Hello,
This has started happening very often. We monitor a large number of performance counters on a 64-bit Windows 2008 server running Exchange 2010.
NSClient++ crashes, and we found this in the log:
011-06-24 00:05:28: error:modules\CheckSystem\CheckSystem.cpp:1115: ERROR: Failed to get mutex for PdhValidatePath (\MSExchangeAB\NSPI RPC Requests Average Latency|\MSExchangeAB\NSPI RPC Requests Average Latency)
2011-06-24 00:05:28: debug:NSClient++.cpp:1180: Injected Result: WARNING 'ERROR: Failed to get mutex for PdhValidatePath (\MSExchangeAB\NSPI RPC Requests Average Latency|\MSExchangeAB\NSPI RPC Requests Average Latency)'
2011-06-24 00:05:28: debug:NSClient++.cpp:1181: Injected Performance Result: ''
Then the NSClient++ service stops.
The Event Viewer shows this:
Log Name: Application
Source: Application Error
Date: 26/06/2011 23:59:52
Event ID: 1000
Task Category: (100)
Level: Error
Keywords: Classic
User: N/A
Computer: MAILSRV-01.snba.be
Description:
Faulting application name: nsclient++.exe, version: 0.0.0.0, time stamp: 0x4df77982
Faulting module name: CheckSystem.dll, version: 0.0.0.0, time stamp: 0x4df77a0d
Exception code: 0x40000015
Fault offset: 0x00000000000be22e
Faulting process id: 0x26e4
Faulting application start time: 0x01cc344c51c5cf27
Faulting application path: C:\Program Files\NSClient++\nsclient++.exe
Faulting module path: C:\Program Files\NSClient++\modules\CheckSystem.dll
Report Id: 9dbe14a7-a03f-11e0-8ea3-005056aa19a5
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Application Error" />
<EventID Qualifiers="0">1000</EventID>
<Level>2</Level>
<Task>100</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-06-26T21:59:52.000000000Z" />
<EventRecordID>225116</EventRecordID>
<Channel>Application</Channel>
<Computer>MAILSRV-01.snba.be</Computer>
<Security />
</System>
<EventData>
<Data>nsclient++.exe</Data>
<Data>0.0.0.0</Data>
<Data>4df77982</Data>
<Data>CheckSystem.dll</Data>
<Data>0.0.0.0</Data>
<Data>4df77a0d</Data>
<Data>40000015</Data>
<Data>00000000000be22e</Data>
<Data>26e4</Data>
<Data>01cc344c51c5cf27</Data>
<Data>C:\Program Files\NSClient++\nsclient++.exe</Data>
<Data>C:\Program Files\NSClient++\modules\CheckSystem.dll</Data>
<Data>9dbe14a7-a03f-11e0-8ea3-005056aa19a5</Data>
</EventData>
</Event>
Change History (6)
comment:1 Changed 15 months ago by mickem
- Milestone set to 0.4.1
comment:2 Changed 10 months ago by RandyJames
comment:3 Changed 10 months ago by mickem
- Description modified (diff)
Off the top of my head, I could imagine this being related to broken performance counters (I know some HP counters have caused this in the past).
Can someone confirm if it is always a given counter or a given set of counters which causes this?
comment:4 Changed 10 months ago by mickem
Note that a potential workaround for this would be to externalize the counter checking and run the checks "outside" of NSClient++ (which can be done with both 0.3.9 and 0.4.0).
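One way to read "outside" is process isolation: the counter is queried by a separate child process, so a fault in the counter code takes down only that child. The sketch below illustrates that idea in plain C++; it is not NSClient++ code and not its external-scripts mechanism. `run_external` is a hypothetical helper, and whatever command line is passed to it is an assumption.

```cpp
#include <array>
#include <cstdio>
#include <stdexcept>
#include <string>

// popen/pclose are spelled _popen/_pclose in the Microsoft C runtime.
#ifdef _WIN32
#define POPEN _popen
#define PCLOSE _pclose
#else
#define POPEN popen
#define PCLOSE pclose
#endif

// Hypothetical helper: runs `command` in a child process and returns
// everything the child wrote to its standard output.
std::string run_external(const std::string& command) {
    FILE* pipe = POPEN(command.c_str(), "r");
    if (!pipe) {
        throw std::runtime_error("failed to start: " + command);
    }
    std::string output;
    std::array<char, 256> buffer;
    // Read the child's output in chunks until the pipe is closed.
    while (std::fgets(buffer.data(), static_cast<int>(buffer.size()), pipe)) {
        output += buffer.data();
    }
    PCLOSE(pipe);
    return output;
}
```

On Windows the command could be something like `typeperf` with a counter path and `-sc 1` to take a single sample, but that exact invocation is untested here. The sketch has no timeout, so a child that hangs would still block the caller.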
comment:5 Changed 10 months ago by RandyJames
I'm not certain that the crash is related to the NSClient++ log entries.
While we have the same entries, the logs happen far more often than the crash.
For instance...
2012-07-28 03:57:50: error:modules\CheckSystem\PDHCollector.cpp:215: Failed to query performance counters: Failed to get mutex for PdhCollectQueryData
2012-07-28 03:57:56: error:modules\CheckSystem\PDHCollector.cpp:215: Failed to query performance counters: Failed to get mutex for PdhCollectQueryData
2012-07-28 03:58:02: error:modules\CheckSystem\PDHCollector.cpp:215: Failed to query performance counters: Failed to get mutex for PdhCollectQueryData
But the event log for the crash is a few minutes later.
7/28/2012 4:02:07 AM
And there are plenty of other identical entries in the NSClient++ log that don't result in a crash.
Also... if there is a piece of code that could result in an uncaught exception which is known... could it be moved inside of a try that would allow NSClient++ to catch it and auto-restart while dumping the debug information? Because... the important thing is that it keeps running... right?
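The kind of guard being asked for could look roughly like the sketch below. It is a minimal illustration in standard C++, not code from NSClient++; `CheckResult` and `run_guarded` are hypothetical names. It only intercepts C++ exceptions. Exception code 0x40000015 from the event log corresponds to STATUS_FATAL_APP_EXIT, which is typically what the Microsoft C runtime raises when `abort()` is called, and a `try` block cannot catch that, so a guard like this may not be enough for this particular crash.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical result type for illustration; not part of NSClient++.
struct CheckResult {
    bool ok;
    std::string message;
};

// Runs `check` and turns any C++ exception into an error result, so the
// caller can log the failure and keep its loop running instead of dying.
CheckResult run_guarded(const std::function<std::string()>& check) {
    try {
        return {true, check()};
    } catch (const std::exception& e) {
        return {false, std::string("check failed: ") + e.what()};
    } catch (...) {
        // Anything thrown that does not derive from std::exception.
        return {false, "check failed: unknown exception"};
    }
}
```

A caller would wrap each counter query, for example `run_guarded([] { return std::string("value"); })`, and log `message` whenever `ok` is false. Access violations and `abort()` would still need a separate mechanism, such as a crash handler or a service-level restart.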
comment:6 Changed 6 months ago by mickem
- Milestone changed from 0.4.1 to 0.4.2
Please provide crash dump files from NSClient++.
Similar log information here.
Version: 0.3.9.330 64-bit
I have NSClientpp set up to restart on crash...
But the restart doesn't happen and there are no crash dumps.
It seems there is a fault outside of the try{}.
I can post more information if need be. The servers we have are all VMs under VMware ESX. Not sure if that would be useful or not.