FIDO: process watchdog

On the campus FIDO instance, the non-blocking call from fido_snmp to Net::SNMP::snmp_dispatcher() has been observed on multiple occasions to never return.  In 2013, I introduced the non-standard module AnyEvent::SNMP, which replaces Net::SNMP's event loop.  Many months went by, but the problem eventually recurred.

The watchdog runs from cron every two minutes, via /etc/cron.d/ns-2m-FIDO-watchdog:
*/2 * * * *     root    /usr/local/fido/bin/fido_watchdog.pl 2>&1 | /usr/bin/logger -p daemon.warning -t fido_watchdog.pl.logger

The fido_watchdog opens the latest FIDO status file and looks for tests that have gone STALE and that have restart instructions.  If the conditions are met, the watchdog attempts to restart the stalled processes.

As of 2017/08, restart instructions are in place for several tests.



Keywords: FIDO: process watchdog
Doc ID: 38261
Owned by: Michael H. in Network Services
Created: 2014-03-08
Updated: 2024-10-01
Sites: Network Services, Systems & Network Control Center