This is where the NRPE plugins really shine, because they let you monitor memory on a Windows host at a much finer granularity.
To start with, we need to create a new command definition. Add this to your commands.cfg (or equivalent):
# CheckWindowsPhysicalMem command definition
define command {
command_name CheckWindowsPhysicalMem
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckMEM -a MaxWarn=$ARG1$% MaxCrit=$ARG2$% ShowAll type=physical
}
In the above command definition we're using the check_nrpe executable to check the host's physical memory. The type can be changed to check just the page file or the entire virtual memory address space.
Next we need to add the physical memory check by adding a service definition to either your host or service configs (again, this depends on how you've structured your Nagios configuration).
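As a rough sketch of changing the type, the two definitions below reuse the same command line with a different type value. The command names CheckWindowsPageFile and CheckWindowsVirtualMem are made up for illustration, and the values type=page and type=virtual are assumptions about what your NSClient++ version accepts for CheckMEM, so check its documentation before relying on them.

```cfg
# Hypothetical command name; assumes CheckMEM accepts type=page for the page file
define command {
command_name CheckWindowsPageFile
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckMEM -a MaxWarn=$ARG1$% MaxCrit=$ARG2$% ShowAll type=page
}
# Hypothetical command name; assumes CheckMEM accepts type=virtual for the virtual address space
define command {
command_name CheckWindowsVirtualMem
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckMEM -a MaxWarn=$ARG1$% MaxCrit=$ARG2$% ShowAll type=virtual
}
```

Keeping one command per memory type means each can be attached to its own service with its own thresholds.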
# Service definition
# Add the service to the service definition
define service {
service_description Physical Memory
check_command CheckWindowsPhysicalMem!80!90
host_name << hostname >>
event_handler_enabled 0
active_checks_enabled 1
passive_checks_enabled 0
notifications_enabled 1
check_freshness 0
freshness_threshold 86400
use << service template >>
}
You will need to update the above snippet with the host name you are monitoring and the service template you are using. The !80!90 arguments fill $ARG1$ and $ARG2$ in the command definition, giving the usual warning at 80% usage and critical at 90% usage. These can be varied to suit your host and environment.
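For instance, a host that normally runs close to full memory might warrant looser thresholds. The fragment below is only an illustration: the 90 and 95 values are arbitrary, and the placeholders need filling in exactly as in the snippet above.

```cfg
# Illustrative variant: warn at 90% and go critical at 95% (arbitrary example values)
define service {
service_description Physical Memory
check_command CheckWindowsPhysicalMem!90!95
host_name << hostname >>
use << service template >>
}
```

Only the numbers after the exclamation marks change; the command definition itself stays the same.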
I added the command as you described, but it gives the following error:
COMMAND: /usr/local/nagios/libexec/check_nrpe -H 172.16.56.101 -p 5666 -c CheckMEM -a MaxWarn=80% MaxCrit=90% ShowAll type=physical
OUTPUT: Could not construct return packet in NRPE handler check client side (nsclient.log) logs...
When I check client side logs:
2013-07-23 16:38:21: error:modules\CheckSystem\CheckSystem.cpp:1084: ERROR: Counter not found: \Server\Logon Errors: The specified counter could not be found. (C0000BB9)
2013-07-23 16:38:21: error:modules\CheckSystem\CheckSystem.cpp:1086: ERROR: Counter not found: \Server\Logon Errors: The specified counter could not be found. (C0000BB9)
2013-07-23 16:38:21: error:modules\CheckSystem\CheckSystem.cpp:1115: ERROR: \Server\Logon Errors: PdhAddCounter failed: The specified counter could not be found. (C0000BB9) (\Server\Logon Errors|\Server\Logon Errors)
Hi everyone,
Any update/input on this issue?
Regards,
Avinash
Hi Avinash,
What happens when you run the check_nrpe command directly from the Nagios command line? For example:
/usr/lib/nagios/plugins/check_nrpe -H -p 5666 -c CheckMEM ShowAll type=physical -a MaxWarn=10% MaxCrit=20%