Things I use in this entry:
Dell OpenManage Server Administrator
Nagios plugin check_omsa_snmp.pl
Windows Server 2003
At present, the servers that I am responsible for are monitored from a 30,000' view. It's getting progressively closer to the ground, and services, or more precisely groups of services, are now being monitored. For instance, our Exchange server depends on several services, but I don't want to report on each one directly and don't think it is necessary. A better approach, IMO, is to use NC_Net on each Windows server to be monitored, then have one Nagios service that treats the whole group as a single point of failure.
I had been asked to look at monitoring the physical hardware of each server and the only real way to do this would be using SNMP and the Dell OpenManage Server Administrator tools.
1. Windows Server has SNMP available as an optional built-in component. It can be installed using Add/Remove Windows Components, under Management and Monitoring Tools.
Feel free to install all of these, as most of them are useful, but for our purposes only the Simple Network Management Protocol component is required.
2. Install the Dell OpenManage Server Administrator software found here
3. Open up services.msc and open the properties of the SNMP Service, click on the Security tab, then untick "Send authentication trap".
In the upper section, add an SNMP community with read-only rights. Remember this name, as you will need it later.
In the lower section, add the address of your Nagios server as one of the accepted hosts. Make sure to leave localhost in, or if you aren't paranoid (AND WHY NOT????) you can tick the box that accepts SNMP packets from any host.
4. Go to Nagios Exchange and download check_omsa_snmp.pl to your Nagios server into /usr/local/nagios/libexec (default config assumed)
5. chmod +x that file, then test your connection to the SNMP-enabled server using
snmpwalk -v 2c -c COMMUNITYNAME HOSTNAME .1.3.6.1.4.1.674.10892.1.700.20.1.8.1
It should return
SNMPv2-SMI::enterprises.674.10892.1.700.20.1.8.1.1 = STRING: "CPU Planar"
SNMPv2-SMI::enterprises.674.10892.1.700.20.1.8.1.2 = STRING: "Ambient"
SNMPv2-SMI::enterprises.674.10892.1.700.20.1.8.1.3 = STRING: "BP Bottom Temp"
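If you want to script this sanity check rather than eyeball it, something like the following might do. It is only a sketch: count_probes is a name I made up, and it simply counts the STRING lines in whatever snmpwalk prints, so a result of 0 suggests the walk failed or OpenManage is not answering.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios or OpenManage): read snmpwalk
# output on stdin and print how many "= STRING:" lines it contains.
count_probes() {
    # grep -c prints the count; "|| true" keeps the exit status at 0
    # even when the count is 0 (grep exits 1 when nothing matches).
    grep -c '= STRING:' || true
}

# Example use against a live host. COMMUNITYNAME, HOSTNAME and OID are
# placeholders for your own values:
#   snmpwalk -v 2c -c COMMUNITYNAME HOSTNAME OID | count_probes
```

With the three lines of sample output above piped in, it prints 3.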
If that is successful, run this command
./check_omsa_snmp.pl -H HOSTNAME -C COMMUNITYNAME
and this will return any critical errors on your server, for example
Power Supply 2 is critical
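Nagios decides the state of a check from the plugin's exit code, not from the text it prints. The sketch below maps an exit code to the state name under the standard Nagios plugin convention (0, 1, 2, 3). I am assuming check_omsa_snmp.pl follows that convention, and nagios_state is just a name I picked for illustration.

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit code into the state name
# Nagios assigns to it under the standard plugin convention.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3 is UNKNOWN; anything else is treated the same here
    esac
}

# Example use (HOSTNAME and COMMUNITYNAME are placeholders):
#   ./check_omsa_snmp.pl -H HOSTNAME -C COMMUNITYNAME; nagios_state $?
```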
You can then set this up as a Nagios command with the SNMP community already filled in, and use that command in a service check.
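As a rough idea of what that could look like, the script below prints a command and a service definition that you could paste into your Nagios object config. Treat the names as assumptions: check_omsa, "Dell Hardware" and the generic-service template are my own picks, HOSTNAME and COMMUNITYNAME are placeholders, and $USER1$ is assumed to point at /usr/local/nagios/libexec as it does in a default install.

```shell
#!/bin/sh
# Print example Nagios object definitions for the OMSA check.
# The heredoc delimiter is quoted so the shell leaves Nagios macros
# such as $USER1$ and $HOSTADDRESS$ untouched.
print_omsa_config() {
    cat <<'EOF'
define command {
    command_name    check_omsa
    command_line    $USER1$/check_omsa_snmp.pl -H $HOSTADDRESS$ -C COMMUNITYNAME
}

define service {
    use                     generic-service
    host_name               HOSTNAME
    service_description     Dell Hardware
    check_command           check_omsa
}
EOF
}

print_omsa_config
```

If your servers use different communities, you could swap COMMUNITYNAME in command_line for $ARG1$ and pass the community from the service as check_omsa!COMMUNITYNAME instead.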