- Timestamp: 01/19/09 08:48:09 (4 years ago)
- Author: mickem
- Comment: --
| v1 | v2 | |
| 3 | 3 | Nagios is a very powerful platform because it is easy to extend. A great feature that Nagios offers is the ability for third-party software or other Nagios instances to report information on the status of services or hosts. This way, Nagios does not need to schedule and run checks by itself, but other applications can report information as it is available to them. This means that your applications can send problem reports directly to Nagios, instead of just logging them. In this way, your applications can benefit from powerful notification systems as well as dependency tracking. In this article by <b>Wojciech Kocjan</b>, we will see how this mechanism can also be used to receive failure notifications from other services or machines—for example, SNMP traps. |
| 4 | 4 | |
| 5 | | <p>Nagios also offers a tool for sending passive check results for hosts and services over a network. It is called <b>NSCA (Nagios Service Check Acceptor)</b>. It can be used to send results from one Nagios instance to another. It can also be used by third-party applications running on different machines to send passive check results to a central Nagios server.</p><p>This mechanism includes password protection, along with encryption, to prevent injection of false results into Nagios. In this way, NSCA communication sent over the Internet is more secure.</p><h1>What are Passive Checks?</h1><p>Previous parts of this book often mentioned Nagios performing checks on various software and machines. In such cases, Nagios decides when a check is to be performed, runs the check, and stores the result. These types of checks are called <b>Active Checks</b>.</p><p>Nagios also offers another way to work with the statuses of hosts and services. It is possible to configure Nagios so that it will receive status information sent over a command pipe. In such a case, checks are done by other programs, and their results are sent to Nagios. Nagios will still handle all notifications, event handlers, and dependencies between hosts and services.</p><p>Active checks are the most common in the Nagios world. They have many advantages and some disadvantages. One of the problems is that such checks have only a short time in which to complete: a typical timeout for an active check is 10 or 30 seconds. In many cases, this is not enough time, as some checks need to be performed over a longer period to give satisfactory results. 
A good example might be running a check that takes several hours to complete—in this case, it does not make sense to raise the global <i>service_check_timeout</i> option, but rather to schedule these checks outside of Nagios and only report the results back to it.</p><p>There are also different types of checks including external applications or devices that want to report information directly to Nagios. This can be done to gather all critical errors to a single, central place. These types of checks are called <b>Passive Checks</b>.</p><p>For example, when a web application cannot connect to the database, it will let Nagios know about it immediately. It can also send reports after a database recovery, or periodically, even if connectivity to the database has been consistently available, so that Nagios has an up-to-date status. This can be done in addition to active checks, to identify critical problems earlier.</p><p>Another example is where an application already processes information such as network bandwidth utilization. In such a case, adding a module that reports current utilization along with the <i>OK/WARNING/CRITICAL</i> state to Nagios seems much easier than using active checks for the same job.</p><p>Often, there are situations where active checks obviously fit better. In other cases, passive checks are the way to go. In general, if a check can be done quickly and does not require long running processes, it should definitely be done as an active service. If the situation involves reporting problems from other applications or machines, it is definitely a use case for a passive check. In cases where the checks require the deployment of long-running processes or monitoring information constantly, this should be done as a passive service.</p><p>Another difference is that active checks require much less effort to be set up when compared to passive checks. 
In the first case, Nagios takes care of the scheduling, and the command only needs to perform the actual checks and mark the results as <i>OK/WARNING/CRITICAL</i> based on how a check command is configured. Passive checks require all the logic related to what should be reported and when it should be checked to be put in an external application. This usually calls for some effort.</p><p>The following diagram shows how both active and passive checks are performed by Nagios. It shows what is performed by Nagios in both cases and what needs to be done by the check command or an external application for passive checks.</p><p style="text-align: center;"><img src="http://www.packtpub.com/files/images/nagios-article-image1.PNG"></p><p>Nagios also offers a way of combining the benefits of both active and passive checks. Often, you have situations where other applications can report if a certain service is working properly or not. But if the monitoring application is not running or some other issue prevents it from reporting, Nagios can use active checks to keep the service status up-to-date.</p><p>A good example would be a server that is a part of an application, processing job queues using a database. It can report each problem when accessing the database. We want Nagios to monitor this database, and as the application is already using it, we can add a module that reports this to Nagios.</p><p>The application can also periodically let Nagios know if it succeeded in using the database without problems. 
However, if there are no jobs to process and the application is not using it, Nagios will not have up-to-date information about the database.</p><h1>Configuring Passive Checks</h1><p>The first thing that needs to be done in order to use passive checks for your Nagios setup is to make sure that you have the following options in your main Nagios configuration file:</p><pre style="margin-left: 40px;">accept_passive_service_checks=1<br>accept_passive_host_checks=1<br></pre><p>It would also be good to enable the logging of incoming passive checks—this makes determining the problem of not processing a passive check much easier. The following directive allows it:</p><pre style="margin-left: 40px;">log_passive_checks=1<br></pre><p>Setting up hosts or services for passive checking requires an object to be defined and set up so as not to perform active checks. The object needs to have the <i>passive_checks_enabled</i> option set to 1 for Nagios to accept passive check results over the command pipe.</p><p>The following is an example of the required configuration for a host that accepts passive checks and has active checks disabled:</p><pre style="margin-left: 40px;">define host<br>{<br>use generic-host<br>host_name linuxbox01<br>address 10.0.2.1<br>active_checks_enabled 0<br>passive_checks_enabled 1 <br>}<br></pre><p>Configuring services is exactly the same as with hosts. 
For example, to set up a very similar service, all we need to do is to use the same parameters as those for the hosts:</p><pre style="margin-left: 40px;">define service<br>{<br>use ping-template<br>host_name linuxbox01<br>service_description PING<br>active_checks_enabled 0<br>passive_checks_enabled 1<br>}<br></pre><p>In this case, Nagios will never perform any active checks on its own and will only rely on the results that are passed to it.</p><p>We can also configure Nagios so that if no new information has been provided within a certain period of time, it will use active checks to get the current status of the host or service. If up-to-date information has been provided by a passive check during this period, then it will not perform active checks.</p><p>In order to do this, we need to enable active checks by setting the <i>active_checks_enabled</i> option to 1 without specifying the <i>normal_check_interval</i> directive. For Nagios to perform active checks when there is no up-to-date result from passive checks, you need to set the <i>check_freshness</i> directive to 1 and set <i>freshness_threshold</i> to the time period after which a check should be performed. The time period is specified in seconds.</p><p>The first parameter tells Nagios that it should check whether the results from the checks are up-to-date. The next parameter specifies the number of seconds after which Nagios should consider the results to be out of date. 
These attributes can be used for both hosts and services.</p><p>The following is a sample definition for a host that runs an active check if no result has been provided within the last two hours:</p><pre style="margin-left: 40px;">define host<br>{<br>use generic-host<br>host_name linuxbox02<br>address 10.0.2.2<br>check_command check-host-alive<br>check_freshness 1<br>freshness_threshold 7200<br>active_checks_enabled 1<br>passive_checks_enabled 1<br>}<br></pre><p>The following is an illustration showing when Nagios would invoke active checks:</p><p style="text-align: center;"><img src="http://www.packtpub.com/files/images/nagios-article-image2.PNG"></p><p>As long as there is at least one passive check result that is still valid (i.e., was received within the past two hours), Nagios will not perform any active checks. However, two hours after the last passive or active check result was received, Nagios will perform an active check to keep the results up-to-date.</p><h1>Passive Checks—Hosts</h1><p>Nagios allows applications and event handlers to send out passive check results for host objects. In order to use them, Nagios needs to be configured to accept passive check results globally, and each host object needs to be set to accept them.</p><p>Submitting passive host check results to Nagios requires sending a command to the Nagios external command pipe. This way, the other applications on your Nagios server can report the status of the hosts.</p><p>The command to submit passive checks is <i>PROCESS_HOST_CHECK_RESULT</i> (visit <i>http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=115</i>). This command accepts the host name, status code, and the textual output from a check. 
The host status code should be 0 for an <i>UP</i> state, 1 for <i>DOWN</i>, and 2 for an <i>UNREACHABLE</i> state.</p><p>The following is a sample script that will accept the host name, status code, and output from a check and will submit these to Nagios:</p><pre style="margin-left: 40px;">#!/bin/sh<br><br># current time as a UNIX timestamp<br>NOW=`date +%s`<br>HOST=$1<br>STATUS=$2<br>OUTPUT=$3<br><br># write the command to the Nagios external command pipe<br>echo "[$NOW] PROCESS_HOST_CHECK_RESULT;$HOST;$STATUS;$OUTPUT" \<br>  >/var/nagios/rw/nagios.cmd<br><br>exit 0<br></pre><p>As an example of the use of this script, the command that is sent to Nagios for <i>host01</i>, status code 2 (<i>UNREACHABLE</i>) and output <i>router 192.168.1.2 down</i> would be as follows:</p><pre style="margin-left: 40px;">[1206096000] PROCESS_HOST_CHECK_RESULT;host01;2;router 192.168.1.2 down<br></pre><p>When submitting results, it is worth noting that Nagios might take some time to process them, depending on the intervals between Nagios' checks of the external command pipe.</p><p>Unlike with active checks, Nagios does not take network topology into consideration for passive check results by default. This is very important in situations where a host behind a router is reported to be down because the router is actually down.</p><p>By default, Nagios handles results from active and passive checks differently. When Nagios plans and receives results from active checks, it takes the actual network topology into consideration and performs a translation of the states based on this. This means that if Nagios receives a result indicating that a host is <i>DOWN</i>, it assumes that all child hosts are in an <i>UNREACHABLE</i> state.</p><p>When a passive check result comes in, Nagios expects that the result already takes the network topology into account. When a host is reported to be <i>DOWN</i> as a passive check result, Nagios does not perform a translation from <i>DOWN</i> to <i>UNREACHABLE</i>. 
Even if its parent host is currently <i>DOWN</i>, the child host state is also stored as <i>DOWN</i>.</p><p>The following illustration shows how results from active and passive checks are treated differently by Nagios:</p><p style="text-align: center;"><img src="http://www.packtpub.com/files/images/nagios-article-image3.PNG"></p><br><br><hr size="1" color="#ff9933" noshade="noshade"><br><div class="header">Learning Nagios 3.0</div><div style="line-height: 0.4em;"> </div> <table width="100%" cellpadding="0" cellspacing="0"><tbody><tr><td valign="top" width="99"> <a href="http://www.packtpub.com/guide-for-learning-nagios-3/book/bh/nagios-abr/1108"><img title="Learning Nagios 3.0" class="left" alt="Learning Nagios 3.0" src="http://images.packtpub.com/images/100x123/1847195180.png" width="99" border="0" height="123"></a> </td> <td valign="top">A comprehensive configuration guide to monitor and maintain your network and systems <ul><li>Secure and monitor your network system with open-source Nagios version 3</li><li>Set up, configure, and manage the latest version of Nagios</li><li>In-depth coverage for both beginners and advanced users</li></ul> <a href="http://www.packtpub.com/guide-for-learning-nagios-3/book">http://www.PacktPub.com/passive-checks-nsca-nagios-service-check-acceptor</a></td></tr></tbody></table><br><hr size="1" color="#ff9933" noshade="noshade"><p>In both cases, a check result stating that the host is down is received by Nagios. When it comes in as a passive check, no state translation is done and Nagios stores the host and all child nodes as being down. When it is an active check result, Nagios takes the fact that <i>switch1</i> is down into account and maps the child nodes' results into an <i>UNREACHABLE</i> state.</p><p>How the Nagios process handles passive check results can be defined in the main Nagios configuration file. 
In order to make Nagios treat passive host check results in the same way as active check results, we need to enable the following option:</p><pre style="margin-left: 40px;">translate_passive_host_checks=1</pre><p>By default, Nagios treats host results from passive checks as hard results. This is because, very often, passive checks are used to report host and service statuses from other Nagios instances. In such cases, only reports regarding hard state changes are propagated across Nagios servers.</p><p>If you want Nagios to treat all passive check results for hosts as if they were soft results, you need to enable the following option in the main Nagios configuration file:</p><pre style="margin-left: 40px;">passive_host_checks_are_soft=1<br></pre><h1>Passive Checks—Services</h1><p>Passive service checks are very similar to passive host checks. In both cases, the idea is that Nagios receives status information over the external command pipe.</p><p>As with passive checks of hosts, all that is needed is to enable the global Nagios option to accept passive check results, and also enable this option for each service that should accept passive check results.</p><p>The results are passed to Nagios in the same way as they are passed for hosts. The command to submit passive checks is <i>PROCESS_SERVICE_CHECK_RESULT</i> (visit <a target="_blank" href="http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=114">http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=114</a>). This command accepts the host name, service description, status code, and the textual output from a check. 
Service status codes are the same as those for active checks: 0 for <i>OK</i>, 1 for <i>WARNING</i>, 2 for <i>CRITICAL</i>, and 3 for an <i>UNKNOWN</i> state.</p><p>The following is a sample script that will accept the host name, service description, status code, and output from a check and will submit these to Nagios:</p><pre style="margin-left: 40px;">#!/bin/sh<br><br># current time as a UNIX timestamp<br>CLOCK=`date +%s`<br>HOST=$1<br>SVC=$2<br>STATUS=$3<br>OUTPUT=$4<br><br># write the command to the Nagios external command pipe<br>echo "[$CLOCK] PROCESS_SERVICE_CHECK_RESULT;$HOST;$SVC;$STATUS;$OUTPUT" \<br>  >/var/nagios/rw/nagios.cmd<br><br>exit 0<br></pre><p>As an example of the use of this script, the command that is sent to Nagios for <i>host01</i>, service <i>PING</i>, status code 0 (<i>OK</i>) and output <i>RTT=57 ms</i> is as follows:</p><pre style="margin-left: 40px;">[1206096000] PROCESS_SERVICE_CHECK_RESULT;host01;PING;0;RTT=57 ms<br></pre><p>A very common scenario for using passive checks is a check that takes a very long time to complete.</p><p>As with submitting host check results, it is worth mentioning that Nagios will take some time to process passive check results as they are polled periodically from the external command pipe.</p><p>A major difference between passive host checks and passive service checks is that service checks differentiate between soft and hard states. When new information regarding a service gets passed to Nagios via the external command pipe, Nagios treats it the same way as if it had been received by an active check.</p><p>If a service is set up with a <i>max_check_attempts</i> directive of 5, then the same number of passive check results would need to be passed in order for Nagios to treat the new status as a hard state change.</p><p>Passive service checks are often used to report the results of long-running tests asynchronously. A good example of such a test is checking whether there are bad blocks on a disk. This requires trying to read the entire disk directly from the block device (such as <i>/dev/sda1</i>) and checking if the attempt has failed. 
This can't be done as an active check, because reading the device takes a long time; larger disks might require several hours.</p><p>For this reason, the only way to perform such a check is to schedule it from the system, for example using the <i>cron</i> daemon (visit <i>http://man.linuxquestions.org/index.php?query=cron</i>). The script should then post results to the Nagios daemon.</p><p>The following is a script that runs the <i>dd</i> system command (visit <i>http://man.linuxquestions.org/index.php?query=dd</i>) to read an entire block device. Based on whether the read was successful or not, the appropriate status code, along with plugin output, is sent out.</p><pre style="margin-left: 40px;">#!/bin/sh<br><br>SVC=$1<br>DEVICE=$2<br>TMPFILE=/tmp/ddlog.$$<br>NOW=`date +%s`<br>PREFIX="[$NOW] PROCESS_SERVICE_CHECK_RESULT;localhost;$SVC"<br><br># try to read the device, saving the dd report to a temporary file<br>dd if="$DEVICE" of=/dev/null >$TMPFILE 2>&1<br>CODE=$?<br>RESULT=`grep copied <$TMPFILE`<br>rm $TMPFILE<br><br># print the external command with the matching status code<br>if [ $CODE -eq 0 ] ; then<br>  echo "$PREFIX;0;$RESULT"<br>else<br>  echo "$PREFIX;2;Error while checking device $DEVICE"<br>fi<br><br>exit 0<br></pre> |
| | 5 | <p>Nagios also offers a tool for sending passive check results for hosts and services over a network. It is called <b>NSCA (Nagios Service Check Acceptor)</b>. It can be used to send results from one Nagios instance to another. It can also be used by third-party applications running on different machines to send passive check results to a central Nagios server.</p><p>This mechanism includes password protection, along with encryption, to prevent injection of false results in to Nagios. In this way, NSCA communication sent over the Internet is more secure.</p><h1>What are Passive Checks?</h1><p>Previous parts of this book often mentioned Nagios performing checks on various software and machines. In such cases, Nagios decides when a check is to be performed, runs the check and stores the result. These types of checks are called <b>Active Checks</b>.</p><p>Nagios also offers another way to work with the statuses of hosts and services. It is possible to configure Nagios so that it will receive status information sent over a command pipe. In such a case, checks are done by other programs, and their results are sent to Nagios. Nagios will still handle all notifications, event handlers, and dependencies between hosts and services.</p><p>Active checks are most common in the Nagios world. They have a lot of advantages and some disadvantages. One of the problems is that such checks can take only a couple of seconds to complete—a typical timeout for an active check to complete is 10 or 30 seconds. In many cases, the time taken is not enough, as some checks need to be performed over a longer period of time to have satisfactory results. 
A good example might be running a check that takes several hours to complete—in this case, it does not make sense to raise the global <i>service_check_timeout</i> option, but rather to schedule these checks outside of Nagios and only report the results back to it.</p><p>There are also different types of checks including external applications or devices that want to report information directly to Nagios. This can be done to gather all critical errors to a single, central place. These types of checks are called <b>Passive Checks</b>.</p><p>For example, when a web application cannot connect to the database, it will let Nagios know about it immediately. It can also send reports after a database recovery, or periodically, even if connectivity to the database has been consistently available, so that Nagios has an up-to-date status. This can be done in addition to active checks, to identify critical problems earlier.</p><p>Another example is where an application already processes information such as network bandwidth utilization. In such a case, adding a module that reports current utilization along with the <i>OK/WARNING/CRITICAL</i> state to Nagios seems much easier than using active checks for the same job.</p><p>Often, there are situations where active checks obviously fit better. In other cases, passive checks are the way to go. In general, if a check can be done quickly and does not require long running processes, it should definitely be done as an active service. If the situation involves reporting problems from other applications or machines, it is definitely a use case for a passive check. In cases where the checks require the deployment of long-running processes or monitoring information constantly, this should be done as a passive service.</p><p>Another difference is that active checks require much less effort to be set up when compared to passive checks. 
In the first case, Nagios takes care of the scheduling, and the command only needs to perform the actual checks and mark the results as <i>OK/WARNING/CRITICAL</i> based on how a check command is configured. Passive checks require all the logic related to what should be reported and when it should be checked to be put in an external application. This usually calls for some effort.</p><p>The following diagram shows how both active and passive checks are performed by Nagios. It shows what is performed by Nagios in both cases and what needs to be done by the check command or an external application for passive checks.</p><p style="text-align: center;"><img src="http://www.packtpub.com/files/images/nagios-article-image1.PNG"></p><p>Nagios also offers a way of combining the benefits of both active and passive checks. Often, you have situations where other applications can report if a certain service is working properly or not. But if the monitoring application is not running or some other issue prevents it from reporting, Nagios can use active checks to keep the service status up–to-date.</p><p>A good example would be a server that is a part of an application, processing job queues using a database. It can report each problem when accessing the database. We want Nagios to monitor this database, and as the application is already using it, we can add a module that reports this to Nagios.</p><p>The application can also periodically let Nagios know if it succeeded in using the database without problems. 
However, if there are no jobs to process and the application is not using it, Nagios will not have up-to-date information about the database.</p><h1>Configuring Passive Checks</h1><p>The first thing that needs to be done in order to use passive checks for your Nagios setup is to make sure that you have the following options in your main Nagios configuration file:</p><pre style="margin-left: 40px;">accept_passive_service_checks=1<br>accept_passive_host_checks=1<br></pre><p>It would also be good to enable the logging of incoming passive checks—this makes determining the problem of not processing a passive check much easier. The following directive allows it:</p><pre style="margin-left: 40px;">log_passive_checks=1<br></pre><p>Setting up hosts or services for passive checking requires an object to be defined and set up so as not to perform active checks. The object needs to have the <i>passive_checks_enabled</i> option set to 1 for Nagios to accept passive check results over the command pipe.</p><p>The following is an example of the required configuration for a host that accepts passive checks and has active checks disabled:</p><pre style="margin-left: 40px;">define host<br>{<br>use generic-host<br>host_name linuxbox01<br>address 10.0.2.1<br>active_checks_enabled 0<br>passive_checks_enabled 1 <br>}<br></pre><p>Configuring services is exactly the same as with hosts. 
For example, to set up a very similar service, all we need to do is to use the same parameters as those for the hosts:</p><pre style="margin-left: 40px;">define service<br>{<br>use ping-template<br>host_name linuxbox01<br>service_description PING<br>active_checks_enabled 0<br>passive_checks_enabled 1<br>}<br></pre><p>In this case, Nagios will never perform any active checks on its own and will only rely on the results that are passed to it.</p><p>We can also configure Nagios so that if no new information has been provided within a certain period of time, it will use active checks to get the current status of the host or service. If up-to-date information has been provided by a passive check during this period, then it will not perform active checks.</p><p>In order to do this, we need to enable active checks by setting the <i>active_checks_enabled</i> option to 1 without specifying the <i>normal_check_interval</i> directive. For Nagios to perform active checks when there is no up-to-date result from passive checks, you need to set the <i>check_freshness</i> directive to 1 and set <i>freshness_threshold</i> to the time period after which a check should be performed. The time performed is specified in seconds.</p><p>The first parameter tells Nagios that it should check whether the results from the checks are up-to-date. The next parameter specifies the number of seconds after which Nagios should consider the results to be out of date. 
Attributes can be used for both hosts and services.</p><p>A sample definition for a host that runs an active check if there has been no result provided within the last two hours:</p><pre style="margin-left: 40px;">define host<br>{<br>use generic-host<br>host_name linuxbox02<br>address 10.0.2.2<br>check_command check-host-alive<br>check_freshness 1<br>freshness_threshold 7200<br>active_checks_enabled 1<br>passive_checks_enabled 1<br>}<br></pre><p>The following is an illustration showing when Nagios would invoke active checks:</p><p style="text-align: center;"><img src="http://www.packtpub.com/files/images/nagios-article-image2.PNG"></p><p>Each time there is at least one passive check result that is still valid (i.e., was received within the past two hours), Nagios will not perform any active checks. However, two hours after the last passive or active check result was received, Nagios would perform an active check to keep the results up-to-date.</p><h1>Passive Checks—Hosts</h1><p>Nagios allows applications and event handlers to send out passive check results for host objects. In order to use them, the host needs to be configured to accept passive checks results.</p><p>In order to be able to submit passive check results, we need to configure Nagios to allow the sending of passive check results, and set the host objects to accept them.</p><p>Submitting passive host check results to Nagios requires sending a command to the Nagios external command pipe. This way, the other applications on your Nagios server can report the status of the hosts.</p><p>The command to submit passive checks is <i>PROCESS_HOST_CHECK_RESULT</i> (visit <i>http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=115</i>). This command accepts the host name, status code, and the textual output from a check. 
The host status code should be 0 for an <i>UP</i> state, 1 for <i>DOWN</i> and 2 for an <i>UNREACHABLE</i> state.</p><p>The following is a sample script that will accept the host name, status code, and output from a check and will submit these to Nagios:</p><pre style="margin-left: 40px;">#!/bin/sh<br><br>NOW='date +%s'<br>HOST=$1<br>STATUS=$2<br>OUTPUT=$3<br><br>echo "[$NOW] PROCESS_HOST_CHECK_RESULT;$HOST;$STATUS;$OUTPUT" <br> >/var/nagios/rw/nagios.cmd<br><br>exit 0<br></pre><p>As an example of the use of this script, the command that is sent to Nagios for <i>host01</i>, status code 2 (<i>UNREACHABLE</i>) and output <i>router 192.168.1.2</i> down would be as follows:</p><pre style="margin-left: 40px;">[1206096000] PROCESS_HOST_CHECK_RESULT;host01;2;router<br>192.168.1.2 down<br></pre><p>When submitting results, it is worth noting that Nagios might take some time to process them, depending on the intervals between Nagios' checks of the external command pipe.</p><p>Unlike active checks, Nagios will not take network topology into consideration by default. This is very important in situations where a host behind a router is reported to be down because the router is actually down.</p><p>By default, Nagios handles results from active and passive checks differently. When Nagios plans and receives results from active checks, it takes the actual network topology into consideration and performs a translation of the states based on this. This means that if Nagios receives a result indicating that a host is <i>DOWN</i>, it assumes that all child hosts are in an <i>UNREACHABLE</i> state.</p><p>When a passive result check comes in to Nagios, Nagios expects that the result already has a network topology included. When a host is reported to be <i>DOWN</i> as a passive check result, Nagios does not perform a translation from <i>DOWN</i> to <i>UNREACHABLE</i>. 
Even if its parent host is currently <i>DOWN</i>, the child host state is also stored as <i>DOWN</i>.</p><p>The following illustration shows how results from active and passive checks are treated differently by Nagios:</p><p style="text-align: center;"><img src="http://www.packtpub.com/files/images/nagios-article-image3.PNG"></p><br><br><hr size="1" color="#ff9933" noshade="noshade"><br><div class="header">Learning Nagios 3.0</div><div style="line-height: 0.4em;"> </div> <table width="100%" cellpadding="0" cellspacing="0"><tbody><tr><td valign="top" width="99"> <a href="http://www.packtpub.com/guide-for-learning-nagios-3/book/bh/nagios-abr/1108"><img title="Learning Nagios 3.0" class="left" alt="Learning Nagios 3.0" src="http://images.packtpub.com/images/100x123/1847195180.png" width="99" border="0" height="123"></a> </td> <td valign="top">A comprehensive configuration guide to monitor and maintain your network and systems <ul><li>Secure and monitor your network system with open-source Nagios version 3</li><li>Set up, configure, and manage the latest version of Nagios</li><li>In-depth coverage for both beginners and advanced users</li></ul> <a href="http://www.packtpub.com/guide-for-learning-nagios-3/book">http://www.PacktPub.com/passive-checks-nsca-nagios-service-check-acceptor</a></td></tr></tbody></table><br><hr size="1" color="#ff9933" noshade="noshade"><p>In both the cases, a check result stating that the host is down is received by Nagios. When it comes in as a passive check, no state translation is done and Nagios stores the host and all child nodes being down. When it is an active check result, Nagios takes the fact that <i>switch1</i> is down into account and maps the child node's result into an <i>UNREACHABLE</i> state.</p><p>How Nagios process handles passive check results can be defined in the main Nagios configuration file. 
In order to make Nagios treat passive host check results in the same way as active check results, we need to enable the following option:</p><pre style="margin-left: 40px;">translate_passive_host_checks=1</pre><p>By default, Nagios treats host results from passive checks as hard results. This is because, very often, passive checks are used to report host and service statuses from other Nagios instances. In such cases, only reports regarding hard state changes are propagated across Nagios servers.</p><p>If you want Nagios to treat all passive check results for hosts as if they were soft results, you need to enable the following option in the main Nagios configuration file:</p><pre style="margin-left: 40px;">passive_host_checks_are_soft=1<br></pre><h1>Passive Checks—Services</h1><p>Passive service checks are very similar to passive host checks. In both cases, the idea is that Nagios receives status information over the external commands pipe.</p><p>As with passive checks of hosts, all that is needed is to enable the global Nagios option to accept passive check results, and also to enable this option for each service that should accept passive check results.</p><p>The results are passed to Nagios in the same way as they are passed for hosts. The command to submit passive service check results is <i>PROCESS_SERVICE_CHECK_RESULT</i> (visit <a target="_blank" href="http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=114">http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=114</a>). This command accepts the host name, service description, status code, and the textual output from a check. 
Service status codes are the same as those for active checks: 0 for <i>OK</i>, 1 for <i>WARNING</i>, 2 for <i>CRITICAL</i>, and 3 for an <i>UNKNOWN</i> state.</p><p>The following is a sample script that accepts the host name, service description, status code, and output from a check and submits them to Nagios:</p><pre style="margin-left: 40px;">#!/bin/sh<br><br># current time as a UNIX timestamp<br>CLOCK=`date +%s`<br>HOST=$1<br>SVC=$2<br>STATUS=$3<br>OUTPUT=$4<br><br># write the command to the Nagios external command pipe<br>echo "[$CLOCK] PROCESS_SERVICE_CHECK_RESULT;$HOST;$SVC;$STATUS;$OUTPUT" \<br> >/var/nagios/rw/nagios.cmd<br><br>exit 0<br></pre><p>As an example of the use of this script, the command that is sent to Nagios for <i>host01</i>, service <i>PING</i>, status code 0 (<i>OK</i>) and output <i>RTT=57 ms</i> is as follows:</p><pre style="margin-left: 40px;">[1206096000] PROCESS_SERVICE_CHECK_RESULT;host01;PING;0;RTT=57 ms<br></pre><p>A very common scenario for using passive checks is a check that takes a very long time to complete.</p><p>As with submitting host check results, it is worth mentioning that Nagios will take some time to process passive check results, as they are polled periodically from the external commands pipe.</p><p>A major difference between hosts and services is that service checks differentiate between soft and hard states. When new information regarding a service gets passed to Nagios via the external commands pipe, Nagios treats it in the same way as if it had been received from an active check.</p><p>If a service is set up with a <i>max_check_attempts</i> directive of 5, then the same number of passive check results would need to be passed in order for Nagios to treat the new status as a hard state change.</p><p>Passive service checks are often used to report the results of long-lasting tests asynchronously. A good example of such a test is checking whether there are bad blocks on a disk. This requires trying to read the entire disk directly from the block device (such as <i>/dev/sda1</i>) and checking if the attempt has failed. 
This can't be done as an active check because reading the device takes a lot of time; larger disks might require several hours.</p><p>For this reason, the only way to perform such a check is to schedule it from the system, for example, using the <i>cron</i> daemon (visit <i>http://man.linuxquestions.org/index.php?query=cron</i>). The script should then post results to the Nagios daemon.</p><p>The following is a script that runs the <i>dd</i> system command (visit <i>http://man.linuxquestions.org/index.php?query=dd</i>) to read an entire block device. Based on whether the read was successful or not, the appropriate status code, along with the plugin output, is written out as an external command. As written, the script prints the command to standard output, so its output needs to be redirected to the external commands pipe.</p><pre style="margin-left: 40px;">#!/bin/sh<br><br>SVC=$1<br>DEVICE=$2<br>TMPFILE=/tmp/ddlog.$$<br><br># try to read the entire device<br>dd if="$DEVICE" of=/dev/null >"$TMPFILE" 2>&1<br>CODE=$?<br><br># dd reports the amount of data read on a line containing "copied"<br>RESULT=`grep copied <"$TMPFILE"`<br>rm "$TMPFILE"<br><br># take the timestamp after the read, which may have run for hours<br>NOW=`date +%s`<br>PREFIX="[$NOW] PROCESS_SERVICE_CHECK_RESULT;localhost;$SVC"<br><br>if [ "$CODE" -eq 0 ] ; then<br> echo "$PREFIX;0;$RESULT"<br>else<br> echo "$PREFIX;2;Error while checking device $DEVICE"<br>fi<br><br>exit 0<br></pre> |
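<p>The scripts in this section all build the same kind of command line for the external commands pipe. The following is a minimal sketch of a reusable shell helper for passive service checks. The function name <i>submit_service_result</i> and the <i>COMMAND_FILE</i> variable are assumptions made for this illustration, and the default path is simply the one used in the examples above; adjust it to match your installation.</p>

```shell
#!/bin/sh

# COMMAND_FILE is a hypothetical variable name; the default is the pipe path
# used in the article's examples and may differ on your system.
COMMAND_FILE=${COMMAND_FILE:-/var/nagios/rw/nagios.cmd}

# submit_service_result HOST SERVICE STATUS OUTPUT
# Appends one PROCESS_SERVICE_CHECK_RESULT command, stamped with the
# current UNIX time, to the external commands pipe.
submit_service_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4" >>"$COMMAND_FILE"
}
```

<p>A job run from <i>cron</i> could then call, for example, <i>submit_service_result localhost "Disk check" 0 "no read errors"</i> once its long-running test has finished, where <i>Disk check</i> stands for whatever service description is defined in your configuration.</p>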
| 6 | 6 | |
| 7 | 7 | |
| … |
… |
|
| 9 | 9 | |
| 10 | 10 | |
| 11 | | <p>You can install the binaries by running the following command:</p><pre style="margin-left: 40px;">make install<br></pre><p>You can also copy the binaries manually: copy the <i>send_nsca</i> client to the machines that will send the results to Nagios, and copy <i>nsca</i> to the machine where Nagios is running.</p><h1>Configuring the NSCA Server</h1><p>You now have working binaries for the NSCA server, either compiled from sources or installed from packages. We can now proceed with configuring the NSCA server to listen for incoming connections.</p><p>There are a couple of ways in which it can be set up: as a standalone process that handles incoming connections, as part of <i>inetd</i> (visit <i>http://en.wikipedia.org/wiki/inetd</i>), or as part of an <i>xinetd</i> setup (visit <i>http://www.xinetd.org/</i>). In either case, we will need a configuration file that tells it which encryption algorithm to use, and the password that will be used to authenticate NSCA client connections. NSCA also needs to know the path of the Nagios external command pipe.</p><p>The main difference between the two installation types is that the standalone version requires fewer resources to handle a larger number of incoming connections. On the other hand, <i>inetd</i> or <i>xinetd</i> based NSCA is much easier to set up and maintain. Several <i>inetd</i> implementations also allow connections to be accepted only from specific IP addresses or, on UNIX systems, only from specific users. There is no single best way in which NSCA should be set up.</p><p>The configuration file is similar to the main Nagios configuration file: each parameter is written in the form of ‹name› = ‹value›. If you compiled NSCA from the source, a default configuration can be found in the <i>sample-config/nsca.cfg</i> file.</p><p>The first parameter that should be set is <i>password</i>. 
This should be set to the same value for the NSCA server and all NSCA clients. It's best to set it to a random string. Using a dictionary-based password might leave your Nagios setup susceptible to attacks, as malicious users might send fake results that cause event handlers to perform specific actions.</p><p>Another option that needs to be set is <i>decryption_method</i>, which specifies the algorithm to be used for encryption. This is an integer value; a list of possible values and what they mean can be found in the sample configuration file. The <i>decryption_method</i> and <i>password</i> values on the server side need to match the <i>encryption_method</i> and <i>password</i> values on the client side.</p><p>A sample configuration is as follows:</p><pre style="margin-left: 40px;">server_address=192.168.1.1<br>server_port=5667<br>nsca_user=nagios<br>nsca_group=nagioscmd<br>command_file=/var/nagios/rw/nagios.cmd<br>password=ok1ij2uh3yg<br>decryption_method=1<br></pre><p>The <i>server_address</i> parameter is optional, and specifies the IP address that NSCA should listen on. If omitted, NSCA will listen on all available IP addresses for incoming connections. When it is specified, NSCA will only accept connections on the specified IP address.</p><p>The remainder of this section will assume that the NSCA server configuration file is located at <i>/etc/nagios/nsca.cfg</i>. At this point, it is good to create an NSCA configuration based on the example above or the sample NSCA configuration file.</p><p>The fastest way to start NSCA is to start it manually in standalone mode. In this mode, NSCA handles listening on the specified TCP port and changing the user/group by itself.</p><p>To do this, simply run the NSCA binary with the following parameters:</p><pre style="margin-left: 40px;">/opt/nagios/bin/nsca -c /etc/nagios/nsca.cfg --daemon<br></pre><p>If you plan to have NSCA start up along with Nagios, it is a good idea to add a line to the <i>/etc/init.d/nagios</i> script that starts Nagios at system boot. 
Running NSCA should go in the <i>start</i> section, and stopping NSCA (via the <i>killall</i> command (see http://en.wikipedia.org/wiki/killall) or using a PID file) should be put in the <i>stop</i> section of the <i>init</i> script. The NSCA source distribution also comes with a script that can be placed as <i>/etc/init.d/nsca</i> to start and stop the NSCA server.</p><p>Another possibility is to configure NSCA to run from the <i>inetd</i> or <i>xinetd</i> super-server daemons. This requires adding the definition of the NSCA server to the proper configuration files, and those daemons will handle accepting connections and spawning actual NSCA processes when needed.</p><p>In order to add the NSCA definition to <i>inetd</i> or <i>xinetd</i>, we first need to add a service definition for the TCP port used. In order to do that, we need to add the following line to the <i>/etc/services</i> file:</p><pre style="margin-left: 40px;">nsca 5667/tcp<br></pre><p>This will indicate that TCP port 5667 maps to the service name <i>nsca</i>. This information is used later by the super-server daemons to map port numbers to names in the configuration.</p><p>For <i>inetd</i>, we also need to add the service configuration to the <i>/etc/inetd.conf</i> file. A sample definition is as follows:</p><pre style="margin-left: 40px;">nsca stream tcp nowait nagios /opt/nagios/bin/nsca -c /etc/nagios/nsca.cfg --inetd<br></pre><p>The preceding entry must be written to the <i>inetd.conf</i> file as a single line. Next, we should make <i>inetd</i> reload its configuration by running:</p><pre style="margin-left: 40px;">/etc/init.d/inetd reload<br></pre><p>This will cause it to reload the service definitions. NSCA will now be run whenever a connection on port 5667 comes in.</p><p>Setting up NSCA using <i>xinetd</i> is very similar. 
All that's needed is to create a file, <i>/etc/xinetd.d/nsca</i>, with the following contents:</p><pre style="margin-left: 40px;">service nsca<br>{<br> flags = REUSE<br> socket_type = stream<br> wait = no<br> user = nagios<br> group = nagioscmd<br> server = /opt/nagios/bin/nsca<br> server_args = -c /etc/nagios/nsca.cfg --inetd<br> log_on_failure += USERID<br> disable = no<br>}<br></pre><p>Next, we need to reload <i>xinetd</i> by running:</p><pre style="margin-left: 40px;">/etc/init.d/xinetd reload<br></pre><p>After that, NSCA will also be run whenever a connection on port 5667 comes in. You can add the <i>only_from</i> statement to the <i>xinetd</i> service definition to limit the IP addresses from which connections are accepted. It works differently from <i>server_address</i> in the NSCA configuration: the <i>only_from</i> option specifies the addresses of the remote machines that will be allowed to connect, whereas the <i>server_address</i> option specifies the local IP address that NSCA will listen on.</p><p>When running under <i>inetd</i> or <i>xinetd</i>, the NSCA server ignores the <i>server_address</i>, <i>server_port</i>, <i>nsca_user</i>, and <i>nsca_group</i> parameters from the configuration file. These attributes are configured at the <i>inetd/xinetd</i> level and are only meaningful when running NSCA in standalone mode.</p><h1>Sending Results over NSCA</h1><p>Now that our NSCA server is up and running, we can continue with actually submitting results over the network. We will need the <i>send_nsca</i> client binary on all of the machines that will report passive check results to Nagios.</p><p>There are various prebuilt binaries available at NagiosExchange, including a native Win32 binary, which allows the sending of results from any check using NSCA. As it is a prebuilt version, there is no need to compile or install it. 
Simply copy the binary to a Windows machine, and it can be used with any valid NSCA client configuration.</p><p>As with the NSCA server, the client uses a configuration file. This requires the specification of the <i>password</i> and <i>encryption_method</i> parameters. A sample configuration that can be used in conjunction with the server configuration created earlier is as follows:</p><pre style="margin-left: 40px;">password=ok1ij2uh3yg<br>encryption_method=1<br></pre><p>The NSCA client reads the status results to be sent to the server from standard input. Each line indicates a single result from a check. The syntax of the host check result that should be passed to <i>send_nsca</i> is as follows:</p><pre style="margin-left: 40px;">&lt;hostname&gt;[TAB]&lt;return code&gt;[TAB]&lt;plugin output&gt;<br></pre><p>The return codes are the same as those used when submitting passive host checks directly: 0 for <i>UP</i>, 1 for <i>DOWN</i>, and 2 for <i>UNREACHABLE</i>.</p><p>Sending a passive service check result requires the specification of the service name as well:</p><pre style="margin-left: 40px;">&lt;hostname&gt;[TAB]&lt;service name&gt;[TAB]&lt;return code&gt;[TAB]&lt;plugin output&gt;<br></pre><p>In this case, the return codes are the same as the exit codes for checks, and are 0 for <i>OK</i>, 1 for <i>WARNING</i>, 2 for <i>CRITICAL</i>, and 3 for <i>UNKNOWN</i>. The command differentiates between host and service checks by the number of fields that are passed in a line.</p><p>The NSCA client command has the following syntax:</p><pre style="margin-left: 40px;">send_nsca -H &lt;host_address&gt; [-c config_file]<br>[-p port] [-to to_sec] [-d delim]<br></pre><p>The <i>-H</i> option specifies the name of the NSCA server that messages should be transmitted to. The <i>-p</i> option specifies the port to send messages on; the port defaults to 5667 if nothing is specified. The timeout in seconds is specified using the <i>-to</i> flag. 
A field delimiter can also be specified using the <i>-d</i> option; if this is omitted, it defaults to a tab character.</p><p>The easiest way to test if you can send data to NSCA correctly is to try to send a host status for a valid computer. As <i>send_nsca</i> accepts information on standard input, it is enough to print a single line and pipe it to the NSCA client.</p><p>A sample script is provided as follows:</p><pre style="margin-left: 40px;">#!/bin/sh<br><br>HOST=localhost<br>NSCAHOST=127.0.0.1<br><br># fields are separated by tabs; status code 1 means DOWN<br>printf "%s\t%s\t%s\n" "$HOST" 1 "Host temporarily down" | \<br> /opt/nagios/bin/send_nsca -H $NSCAHOST \<br> -c /etc/nagios/send_nsca.cfg<br><br>exit 0<br></pre><p>The script will send a report that the host, localhost, is currently down with the status description, <i>Host temporarily down</i>. The <i>NSCAHOST</i> variable specifies the NSCA server to which the messages should be sent. While the example above is set to <i>127.0.0.1</i>, it should be replaced with the actual IP address of your Nagios server.</p><p>A similar script can be written for sending service-related reports to Nagios. The only difference is that the return codes mean something different, and that the service name is sent along with the host name.</p><p>The following is an example that sends a warning state:</p><pre style="margin-left: 40px;">#!/bin/sh<br><br>HOST=localhost<br>SERVICE="NSCA test"<br>NSCAHOST=127.0.0.1<br><br># fields are separated by tabs; status code 1 means WARNING<br>printf "%s\t%s\t%s\t%s\n" "$HOST" "$SERVICE" 1 "Service in warning state" | \<br> /opt/nagios/bin/send_nsca -H $NSCAHOST \<br> -c /etc/nagios/send_nsca.cfg<br><br>exit 0<br></pre><p>This example sends out a <i>warning</i> status to Nagios over NSCA. The parameters are very similar and the main difference is in the return codes. 
Moreover, a service description also needs to be passed; in this case, it is <i>NSCA test</i>.</p><p style="margin-left: 40px; margin-right: 40px;"><i>If the service has max_check_attempts set to anything other than 1, the script above needs to send out multiple status messages to Nagios before the new status is treated as a hard state. This can be done by piping multiple result lines into a single send_nsca.</i></p><p>Applications that report multiple results over a short period of time can pass them without having to re-run <i>send_nsca</i> for each result. Instead, you can simply send multiple lines to the same <i>send_nsca</i> process, and it will send information on all of the statuses to Nagios. This approach reduces the overhead of spawning multiple new processes.</p><h1>Security Concerns</h1><p>Both passive checks and NSCA allow status information about machines and applications to be sent to Nagios. This raises several security concerns. If a malicious user is able to send reports to Nagios, he or she can force a change to the status of one or more objects by repeatedly sending false results for them. He or she can also flood Nagios or NSCA with a large number of invalid requests that might cause performance problems. This might stop Nagios from receiving actual passive check results. For example, SNMP traps may not be passed to Nagios and, therefore, an event handler will not be triggered to fix a problem when it should have been.</p><p>This is why being able to send results to Nagios should be made as secure as possible, so that only authorized applications can communicate with it. Securing passive checks that are sent directly over the external commands pipe is relatively easy. It only requires the external commands pipe to be accessible only to Nagios and to the applications that are allowed to send data to it.</p><p>Securing NSCA is a more complex issue and requires ensuring that every step of the communication is secure. 
The first step is to make sure that the NSCA configuration files have adequate access rights. They should be set so that the NSCA daemon and clients are able to read them, but other users cannot. On the client side, this means that every user who invokes <i>send_nsca</i> must be able to read its configuration file, and no one else should. This will ensure that your NSCA password and encryption methods cannot be read by unauthorized users.</p><p>Another thing that affects the security of your setup is whether the password used for communications is strong. It is recommended that you use a random password composed of lower case and upper case letters, as well as digits. It is also recommended that you use one of the MCrypt based algorithms, and not the simple XOR algorithm.</p><p>The next step is to make sure that only authorized IP addresses are allowed to send information to the NSCA server. This can be done either through <i>xinetd</i> configuration or by using a system firewall such as <b>netfilter</b>/<b>iptables</b> (<a target="_blank" href="http://www.netfilter.org">http://www.netfilter.org/</a>) for Linux. In both cases, it is best to define a list of allowed IPs and automatically reject connections from unknown hosts.</p><h1>Summary</h1><p>Nagios can both monitor hosts and services on its own and receive information about host and service statuses from other applications. Being able to send results directly to Nagios creates a lot of opportunities for extending how Nagios can be used. Pushing passive checks to Nagios also introduces security issues that should be addressed when implementing such a setup. Both the external commands pipe and the NSCA daemon that is used to send results to Nagios need to be set up in a secure manner to avoid issues such as unauthorized results being accepted by Nagios.</p> {/literal} |
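<p>The note about <i>max_check_attempts</i> in the section on sending results over NSCA can be illustrated with a minimal sketch that pipes several result lines into a single <i>send_nsca</i> process. The function names <i>service_result</i> and <i>send_repeated_result</i> and the <i>SEND_NSCA</i> variable are assumptions made for this illustration; the default command line only reuses the paths from the earlier examples.</p>

```shell
#!/bin/sh

# SEND_NSCA is a hypothetical variable holding the client command line;
# the default reuses the paths from the article's examples.
SEND_NSCA=${SEND_NSCA:-"/opt/nagios/bin/send_nsca -H 127.0.0.1 -c /etc/nagios/send_nsca.cfg"}

# service_result HOST SERVICE CODE OUTPUT
# Prints one tab-delimited service result line in the format send_nsca reads.
service_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# send_repeated_result COUNT HOST SERVICE CODE OUTPUT
# Sends the same result COUNT times through a single send_nsca process.
send_repeated_result() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        service_result "$@"
        i=$((i + 1))
    done | $SEND_NSCA
}
```

<p>For a service with <i>max_check_attempts</i> set to 5, calling <i>send_repeated_result 5 localhost "NSCA test" 1 "Service in warning state"</i> would submit five identical results over one connection instead of spawning five separate client processes.</p>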
| 12 | 12 | }}} |
|
|