Tag: nagios

Better nagios SMS and E-mail messages

Just for my own reference. I had already changed these commands before, but I have now also added the notification comment to them (for custom notifications and acknowledgements):

define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nComment: $NOTIFICATIONCOMMENT$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert ($SHORTDATETIME$): $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}
 
define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nComment: $NOTIFICATIONCOMMENT$\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert ($SHORTDATETIME$): $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}
 
define command{
        command_name    notify-host-by-sms
        command_line    /usr/local/sbin/send-sms.sh -n $CONTACTPAGER$ -m "$HOSTNAME$: $NOTIFICATIONTYPE$, $HOSTSTATE$ ($NOTIFICATIONCOMMENT$)"
}
 
define command{
        command_name    notify-service-by-sms
        command_line    /usr/local/sbin/send-sms.sh -n $CONTACTPAGER$ -m "$SERVICEDESC$ on $HOSTNAME$ $NOTIFICATIONTYPE$ ($NOTIFICATIONCOMMENT$): $SERVICEOUTPUT$"
}
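To see what such a mail looks like before wiring it into Nagios, you can run the same printf by hand. This is only a sketch: Nagios normally substitutes the $...$ macros itself, so here plain shell variables with made-up sample values stand in for them, and the function name preview_service_mail is just something I picked for this preview.

```shell
# Made-up sample values; in production Nagios fills in the macros itself.
NOTIFICATIONTYPE="ACKNOWLEDGEMENT"
NOTIFICATIONCOMMENT="looking into it"
SERVICEDESC="Root partition space"
HOSTALIAS="piet"
HOSTADDRESS="192.0.2.10"
SERVICESTATE="CRITICAL"
LONGDATETIME="Mon Jan 1 12:00:00 CET 2024"
SERVICEOUTPUT="DISK CRITICAL - free space: / 512 MB (5%)"

# Same layout as the notify-service-by-email body, with shell variables
# in place of the Nagios macros. %b expands the \n sequences.
preview_service_mail() {
    printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE\nComment: $NOTIFICATIONCOMMENT\nService: $SERVICEDESC\nHost: $HOSTALIAS\nAddress: $HOSTADDRESS\nState: $SERVICESTATE\n\nDate/Time: $LONGDATETIME\n\nAdditional Info:\n\n$SERVICEOUTPUT\n"
}

preview_service_mail
```

The Comment line is the part that carries the text typed into a custom notification or acknowledgement.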

Useful extra Nagios commands

Here are some useful extra nagios commands I often use:

define command{
        command_name    notify-host-by-sms
        command_line    /usr/local/sbin/send-sms.sh -n $CONTACTPAGER$ -m "$HOSTNAME$: $HOSTSTATE$"
}
 
define command{
        command_name    notify-service-by-sms
        # Don't use service state, otherwise you only ever get to see 'critical' and not the reason.
        command_line    /usr/local/sbin/send-sms.sh -n $CONTACTPAGER$ -m "$SERVICEDESC$ on $HOSTNAME$: $SERVICEOUTPUT$"
}
 
define command{
        command_name    check_imaps
        command_line    /usr/lib/nagios/plugins/check_imap -H '$HOSTADDRESS$' --ssl -p 993 --certificate 15
}
 
define command{
        command_name    check_rdp
        command_line    /usr/local/lib/nagios/plugins/check_x224 -H $HOSTADDRESS$ -p $ARG1$ -w 10 -c 50
}
 
define command{
  command_name  check_pops
  command_line  /usr/lib/nagios/plugins/check_pop -H '$HOSTADDRESS$' --ssl -p 995 --certificate 15
}
 
define command{
  command_name  check_smtps
  command_line  /usr/lib/nagios/plugins/check_tcp -H $HOSTADDRESS$ -p 465
}
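Before putting a check like these into the config, it helps to run the plugin by hand and look at its exit code, since that is what Nagios turns into a state (0 is OK, 1 is WARNING, 2 is CRITICAL, 3 is UNKNOWN). A small sketch of a wrapper for that; the names state_name and run_check are my own, not part of Nagios:

```shell
# Map a plugin exit code to the state name shown in Nagios.
state_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "out of range ($1)" ;;
    esac
}

# Run any check command by hand and report the resulting state.
run_check() {
    rc=0
    "$@" || rc=$?
    echo "state: $(state_name "$rc")"
    return "$rc"
}
```

For example, `run_check /usr/lib/nagios/plugins/check_tcp -H mail.example.org -p 465`, where mail.example.org is a placeholder host.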

Allowing apache to set Nagios cmd file

On Debian, to prevent this error in the web interface:

Error: Could not stat() command file ‘/var/lib/nagios3/rw/nagios.cmd’!

run:

/etc/init.d/nagios3 stop
dpkg-statoverride --update --add nagios www-data 2710 /var/lib/nagios3/rw
dpkg-statoverride --update --add nagios nagios 751 /var/lib/nagios3
/etc/init.d/nagios3 start
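About the 2710 mode: the leading 2 is the setgid bit, so files created in the directory (like nagios.cmd) get the directory's group, www-data, and the group only gets execute (traverse) permission on the directory itself. A sketch to see the mode on a scratch directory instead of the real one; it assumes GNU stat, and mode_of is just a helper name I made up:

```shell
# Print the octal mode of a file or directory (GNU stat syntax).
mode_of() {
    stat -c '%a' "$1"
}

# Try the mode on a throwaway directory, not on /var/lib/nagios3/rw.
demo_dir=$(mktemp -d)
chmod 2710 "$demo_dir"
demo_mode=$(mode_of "$demo_dir")
echo "mode: $demo_mode"
ls -ld "$demo_dir"
rmdir "$demo_dir"
```

On the real system, `dpkg-statoverride --list` should show the two overrides that were added.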

source.

Checking 3ware raid controllers over ssh with nagios

First check this to see how you enable a host to be checked with nagios over SSH (the post "Configuring nagios checks over SSH" covers the same ground).

Create a command in /etc/nagios3/commands.cfg:

# This command needs this in /etc/sudoers on the target:
# nagios ALL = NOPASSWD: /usr/local/sbin/check_3ware.sh
define command {
       command_name     check_3ware
       command_line     /usr/lib/nagios/plugins/check_by_ssh -H $HOSTADDRESS$ -i /etc/nagios3/id_rsa -l nagios -t 25 -C 'sudo /usr/local/sbin/check_3ware.sh'
}

Run visudo and add this line:

nagios ALL = NOPASSWD: /usr/local/sbin/check_3ware.sh

Then install the script from here, as /usr/local/sbin/check_3ware.sh so that it matches the sudoers line. The last time I did that I had to fix bugs in it, so beware. I submitted a patch, which I guess will be accepted.

Then go download the tw_cli tool.

Then create a hostgroup for your 3ware hosts:

define hostgroup {
        hostgroup_name  3ware-machines
        alias           3Ware machines
        members         boxen
}

Then a service:

define service {
        hostgroup_name                  3ware-machines
        service_description             3Ware status
        check_command                   check_3ware
        use                             generic-service
        notification_interval           0
}

That should be it.
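After adding definitions like these, it is worth letting nagios verify the config before restarting, so a typo doesn't take the monitoring down. A sketch, assuming the Debian nagios3 paths; the function name and the NAGIOS_* override variables are my own invention:

```shell
# Verify the Nagios config and only restart when the check passes.
# Defaults assume Debian's nagios3 package; override via the NAGIOS_* variables.
verify_and_restart() {
    nagios_bin=${NAGIOS_BIN:-/usr/sbin/nagios3}
    nagios_cfg=${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}
    init_script=${NAGIOS_INIT:-/etc/init.d/nagios3}

    # -v makes nagios parse and sanity-check the whole configuration.
    if "$nagios_bin" -v "$nagios_cfg"; then
        "$init_script" restart
    else
        echo "config check failed, not restarting" >&2
        return 1
    fi
}
```

Run `verify_and_restart` as root on the nagios server.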

Configuring nagios checks over SSH

I had to do a lot of fiddling before I got nagios over ssh working. I mostly used this article as a source, even though I did things differently.

First add some commands to commands.cfg:

define command{
        command_name    check_remote_disk
        command_line    /usr/lib/nagios/plugins/check_by_ssh -p $ARG1$ -l nagios -t 30 -o StrictHostKeyChecking=no -i /etc/nagios3/id_rsa -H $HOSTADDRESS$ -C '/usr/lib/nagios/plugins/check_disk -w $ARG2$ -c $ARG3$ -p $ARG4$'
}
 
define command{
        command_name    check_remote_load
        command_line    /usr/lib/nagios/plugins/check_by_ssh -p $ARG1$ -l nagios -t 30 -o StrictHostKeyChecking=no -i /etc/nagios3/id_rsa -H $HOSTADDRESS$ -C '/usr/lib/nagios/plugins/check_load -w $ARG2$ -c $ARG3$'
}
 
define command{
        command_name    check_remote_swap
        command_line    /usr/lib/nagios/plugins/check_by_ssh -p $ARG1$ -l nagios -t 30 -o StrictHostKeyChecking=no -i /etc/nagios3/id_rsa -H $HOSTADDRESS$ -C '/usr/lib/nagios/plugins/check_swap -w $ARG2$ -c $ARG3$'
}

The file referenced by -i is an SSH private key. You can create this key by running ssh-keygen and giving it the right path. You can’t store anything in the nagios home dir, because that is in /var/run, which is cleared on reboot. So, you can’t use the default key file location.
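Creating the key could look roughly like this. It is a sketch: the function name make_check_key is made up, and the empty passphrase is my assumption, based on nagios having to run the checks unattended.

```shell
# Generate a passphrase-less RSA key pair at the given path.
make_check_key() {
    key_path=$1
    # -N '' sets an empty passphrase; -C only sets a comment in the .pub file.
    ssh-keygen -q -t rsa -N '' -C 'nagios check_by_ssh' -f "$key_path"
    chmod 600 "$key_path"
}
```

On the nagios server you would then run `make_check_key /etc/nagios3/id_rsa` as root, followed by `chown nagios: /etc/nagios3/id_rsa /etc/nagios3/id_rsa.pub`, so the nagios user can read the key.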

The -t 30 is necessary because network lag sometimes makes the check time out, and the service is then reported as failed. The default timeout of 10 seconds is not enough.

Next you need to create a user nagios on the target machine and add the contents of the key’s .pub file to that user’s authorized_keys file. Create the user like this:

useradd --system --shell /bin/bash nagios
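Note that useradd does not create a home directory for a --system user unless you also pass --create-home, so the .ssh directory may have to be made by hand. A sketch of installing the public key; install_pubkey is a made-up helper, and /home/nagios in the usage line below is an assumption about where the home dir ends up:

```shell
# Append a public key to <home>/.ssh/authorized_keys with the
# permissions sshd expects.
install_pubkey() {
    home_dir=$1
    pubkey_file=$2
    mkdir -p "$home_dir/.ssh"
    chmod 700 "$home_dir/.ssh"
    cat "$pubkey_file" >> "$home_dir/.ssh/authorized_keys"
    chmod 600 "$home_dir/.ssh/authorized_keys"
}
```

As root on the target: `install_pubkey /home/nagios id_rsa.pub` and then `chown -R nagios: /home/nagios`.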

Then install the nagios plugins on the target host:

aptitude -P install nagios-plugins-basic

The nagios host needs to be able to log in as user nagios. On Debian, the user that runs nagios (also called ‘nagios’) doesn’t have a shell by default, so give it one.
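To check which shell a user currently has, something like this works. It is a sketch: login_shell_of is a made-up name, and the optional second argument only exists so you can point it at a file other than /etc/passwd.

```shell
# Print the login shell (7th passwd field) of a user.
login_shell_of() {
    awk -F: -v user="$1" '$1 == user { print $7 }' "${2:-/etc/passwd}"
}
```

If `login_shell_of nagios` prints something like /bin/false, `usermod --shell /bin/bash nagios` gives the user a real shell.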

Then you can create a hostgroup, for example:

define hostgroup {
        hostgroup_name  nagios-enabled
        alias           Nagios enabled
        members         host1, host2
}

Then create services:

define service {
        hostgroup_name                  nagios-enabled
        service_description             Root partition space
        check_command                   check_remote_disk!22!20%!10%!/
        use                             generic-service
        notification_interval           0
}
 
define service {
        hostgroup_name                  nagios-enabled
        service_description             Swap space
        check_command                   check_remote_swap!22!50%!30%
        use                             generic-service
        notification_interval           0
}
 
define service {
        hostgroup_name                  nagios-enabled
        service_description             Load
        check_command                   check_remote_load!22!5.0,4.0,3.0!10.0,6.0,4.0
        use                             generic-service
        notification_interval           0
}

This will check load, swap and root partition space on all hosts in the nagios-enabled hostgroup. Next you can define custom services:

define service {
        host_name                       piet
        service_description             Some partition
        check_command                   check_remote_disk!22!40%!30%!/mnt/dinklefat
        use                             generic-service
        notification_interval           0
}
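In these check_command lines, everything after the command name is split on ! and ends up in $ARG1$, $ARG2$ and so on in the command definition. A small sketch that prints that mapping, for when you lose track of which value goes where. The function name is made up, and it doesn't handle escaped \! characters:

```shell
# Show how a check_command value maps onto $ARGn$ macros.
explain_check_command() {
    rest=$1
    printf 'command_name: %s\n' "${rest%%!*}"
    # No "!" at all means the command takes no arguments.
    case $rest in
        *!*) rest=${rest#*!} ;;
        *) return 0 ;;
    esac
    i=1
    while :; do
        printf '$ARG%d$ = %s\n' "$i" "${rest%%!*}"
        case $rest in
            *!*) rest=${rest#*!} ;;
            *) break ;;
        esac
        i=$((i + 1))
    done
}

explain_check_command 'check_remote_disk!22!20%!10%!/'
```

So for check_remote_disk the SSH port is $ARG1$, the warning and critical thresholds are $ARG2$ and $ARG3$, and the partition is $ARG4$.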

Configuring Nagios to check an HTTP host

Nagios is an elaborate piece of software to monitor hosts and services. I will explain a bit how you can configure nagios to monitor an HTTP service. I’m assuming your nagios setup already has the default config files generic-host_nagios2.cfg and generic-service_nagios2.cfg, which tell nagios how to monitor hosts and services.

Most configuration is done in /etc/nagios3/conf.d. For some reason, the standard config files all end with _nagios2.cfg. I guess the name refers to older syntax, but I don’t really know why these files are named that way.

Nagios comes with a bunch of default files to which you can add your hosts, services, etc.

First you have to define a host. If you’re monitoring on the machine itself, you could add a host to localhost_nagios2.cfg. Using the default localhost doesn’t work, because you need to access the machine using the address of the virtual host.

define host{
        use                     generic-host            ; Name of host template to use
        host_name               my-site
        address                 www.halfgaar.net
}

Then you need to define a hostgroup for your HTTP servers. A default HTTP hostgroup is probably already defined, so you can add your host to http-servers in hostgroups_nagios2.cfg:

define hostgroup {
        hostgroup_name  http-servers
        members         localhost, my-site ; comma separated
}

Lastly, you need to configure a service. Nagios comes with a default one for the hostgroup http-servers so you should be done, but just in case:

define service {
        hostgroup_name                  http-servers
        service_description             HTTP
        check_command                   check_http
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}

© 2024 BigSmoke
