FIDO: process watchdog

On the campus FIDO instance, the non-blocking call that fido_snmp makes to Net::SNMP::snmp_dispatcher() has been observed, on multiple occasions, never to return. In 2013, I introduced the non-standard module AnyEvent::SNMP, which replaces Net::SNMP's event loop. Many months went by without incident, but the problem eventually recurred.

The watchdog runs from cron every two minutes, as configured in /etc/cron.d/ns-2m-FIDO-watchdog:
*/2 * * * *     root    /usr/local/fido/bin/fido_watchdog.pl 2>&1 | /usr/bin/logger -p daemon.warning -t fido_watchdog.pl.logger

The fido_watchdog script opens the latest FIDO status file and looks for tests that have gone STALE and that have restart instructions. If those conditions are met, the watchdog attempts to restart the stalled processes.
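The selection logic described above can be sketched roughly as follows. This is an illustration only, not the actual fido_watchdog.pl (which is a Perl script whose source is not shown here). The status file format is an assumption: one test per line as "name|state|restart command". The function names and the test name fido_ping are likewise hypothetical.

```python
import shlex
import subprocess
from pathlib import Path


def latest_status_file(status_dir):
    """Return the most recently modified file in status_dir, or None if empty."""
    files = [p for p in Path(status_dir).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def stale_with_restart(status_text):
    """Return (test_name, restart_command) pairs for STALE tests that have one.

    Assumed (hypothetical) line format: name|state|restart command
    """
    results = []
    for line in status_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3:
            continue  # line carries no restart instructions
        name, state, command = fields[0], fields[1], fields[2]
        if state == "STALE" and command:
            results.append((name, command))
    return results


def restart_stalled(status_text, run=subprocess.run):
    """Run the restart command for each stalled test; return the names attempted."""
    attempted = []
    for name, command in stale_with_restart(status_text):
        # check=False: a failed restart is only an attempt, not a fatal error
        run(shlex.split(command), check=False)
        attempted.append(name)
    return attempted


if __name__ == "__main__":
    sample = "fido_snmp|STALE|/bin/true\nfido_ping|OK|/bin/true"
    print(stale_with_restart(sample))
```

Passing the command runner in as a parameter keeps the selection step separate from the restart step, so the selection can be checked against a sample status file without restarting anything.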

As of 2017/08, restart instructions are in place for several tests.



Keywords: FIDO: process watchdog
Doc ID: 38261
Owner: Michael H.
Group: Network Services
Created: 2014-03-08 16:53:40
Updated: 2024-10-01 05:27:29
Sites: Network Services, Systems & Network Control Center