FIDO: 'not found' alarms from the fido_snmp test
When interacting with a device via SNMP, the SNMP client receives OID and instance information. An OID can be thought of as an entry point into information of a certain class. For example, .1.3.6.1.2.1.2.2 is the OID for the ifTable, which contains information about interfaces. An instance is an arbitrary number that is mapped to human-readable data such as an interface name.
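To make the OID/instance split concrete, here is a minimal Python sketch. It assumes the ifDescr column of the ifTable (.1.3.6.1.2.1.2.2.1.2); the walk results and interface names are made up for illustration, and the helper name split_instance is hypothetical.

```python
# Column OID for ifDescr inside ifTable (IF-MIB): ifTable.ifEntry.ifDescr
IF_DESCR = ".1.3.6.1.2.1.2.2.1.2"


def split_instance(oid: str, column_oid: str) -> int:
    """Return the trailing instance number of `oid` under `column_oid`."""
    prefix = column_oid.rstrip(".") + "."
    if not oid.startswith(prefix):
        raise ValueError(f"{oid} is not under {column_oid}")
    return int(oid[len(prefix):])


# Made-up walk results (illustration only): full OID -> value from the device.
sample_walk = {
    ".1.3.6.1.2.1.2.2.1.2.1": "lo0",
    ".1.3.6.1.2.1.2.2.1.2.504": "ge-0/0/0",
    ".1.3.6.1.2.1.2.2.1.2.517": "ae0",
}

# The instance number is the only link between the OID and the interface name.
instance_to_name = {
    split_instance(oid, IF_DESCR): name for oid, name in sample_walk.items()
}
print(instance_to_name)
```

The numbers 1, 504 and 517 carry no meaning of their own; the device chooses them, which is why they can differ after a change.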
These instance-to-name bindings are not necessarily persistent. When the configuration changes (software or hardware), or when a device reboots, the bindings can change. The behavior varies from device to device. Until caching applications 'relearn' these bindings by repolling, there can be a disconnect between device reality and an application's SNMP instance cache.
Many applications that interact with devices via SNMP, including FIDO, cache SNMP instance information to improve performance. FIDO flags this disconnect by painting the row a yellow/green color and listing the state as 'not found' instead of 'down'.
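The following Python sketch illustrates how a stale instance cache can produce 'not found' rather than 'down'. It is not FIDO's code; the function name, the dictionaries, and the instance numbers are all hypothetical and only model the idea described above.

```python
def poll_oper_status(cache, device_status, name):
    """Look up an interface by its cached instance number.

    cache:         interface name -> instance number learned at the last repoll
    device_status: instance number -> status the device reports right now
    """
    instance = cache[name]
    status = device_status.get(instance)
    # If the device no longer has that instance, we cannot say 'down';
    # we only know the cached binding did not resolve.
    return "not found" if status is None else status


# Cache learned before a reboot (hypothetical values).
cache = {"ge-0/0/0": 504}

before_reboot = {504: "up"}
after_reboot = {512: "up"}  # same interface, renumbered by the device

print(poll_oper_status(cache, before_reboot, "ge-0/0/0"))
print(poll_oper_status(cache, after_reboot, "ge-0/0/0"))

# A repoll relearns the binding, and the lookup resolves again.
cache["ge-0/0/0"] = 512
print(poll_oper_status(cache, after_reboot, "ge-0/0/0"))
```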
To help clear out these less useful alarms quickly, the fido_snmp daemon schedules a repoll of a device if it notices that the sysUpTime counter has reset.
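A reset check of this kind can be sketched in Python as below. This is an assumption about the general technique, not the fido_snmp daemon's actual logic: it treats any decrease in sysUpTime between two polls as a reset. The function names and sample values are hypothetical. Note that sysUpTime is a 32-bit TimeTicks value, so a counter wrap after roughly 497 days would look the same as a reboot to this simple check.

```python
def uptime_has_reset(previous_ticks, current_ticks):
    """True if sysUpTime went backwards since the previous poll."""
    if previous_ticks is None:  # first time we see this device
        return False
    return current_ticks < previous_ticks


def devices_to_repoll(last_seen, samples):
    """Update `last_seen` in place and return devices whose uptime reset.

    last_seen: device name -> sysUpTime seen at the previous poll
    samples:   iterable of (device name, current sysUpTime) pairs
    """
    repoll = []
    for device, ticks in samples:
        if uptime_has_reset(last_seen.get(device), ticks):
            repoll.append(device)
        last_seen[device] = ticks
    return repoll


# Hypothetical polls: r1 reboots between the first and second poll.
seen = {}
print(devices_to_repoll(seen, [("r1", 500000), ("r2", 900)]))
print(devices_to_repoll(seen, [("r1", 1200), ("r2", 1900)]))
```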
Setting the impact to 3 and adding the note "interface has no description" is a toggleable feature. This default was introduced quite some time ago (more than a year) as a way to help the AS3128 NOC distinguish alarms that needed escalation from those that did not.
Reporting admin down or "not found" as alarms is also a toggleable feature, although that feature's behavior has basically changed over the last 10+ years.
If AS59 wants to remove either feature, let me know. The first feature can be disabled by removing (or commenting out) the rule below. For the second feature, I would want follow-up conversations about the desired behavior.
From /home/net/ns-ansible-uwmadison/ansible/files/fido/sitelocal_config, lines 492-511, rule 100000:
492 ########################
493 # decent defaults#
494 ########################
495
496 ################## setting impact
497
498 # if this interface is so unimportant that it doesn't have a description, its an impact 3 with 15 minute holddown
499 # this rule is essentially required for Juniper LACP bundles
500 100000:
501   fido_impact:
502     reason: interface has no description
503     value: '3'
504   matches:
505     '10':
506       key_match: test
507       equal: ifOperStatus
508     '20':
509       key_match: ___infohash___Descr
510       undefined: ''
511
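One plausible reading of rule 100000 is that both match clauses must hold before the impact is applied. The Python sketch below models that reading only. The evaluation semantics are assumptions, not FIDO's documented behavior: `equal` is taken as an exact comparison, `undefined` as "key absent or empty", and the clauses as combined with AND. The alarm dictionaries and function names are hypothetical.

```python
# Rule 100000 from the excerpt above, as a Python dict.
RULE = {
    "fido_impact": {"reason": "interface has no description", "value": "3"},
    "matches": {
        "10": {"key_match": "test", "equal": "ifOperStatus"},
        "20": {"key_match": "___infohash___Descr", "undefined": ""},
    },
}


def clause_matches(clause, alarm):
    """Evaluate one match clause (assumed semantics, see lead-in)."""
    value = alarm.get(clause["key_match"])
    if "equal" in clause:
        return value == clause["equal"]
    if "undefined" in clause:
        return value is None or value == ""
    return False


def rule_matches(rule, alarm):
    """Assume every clause must match for the rule to apply."""
    return all(clause_matches(c, alarm) for c in rule["matches"].values())


# Hypothetical alarms: one interface without a description, one with.
no_descr = {"test": "ifOperStatus"}
with_descr = {"test": "ifOperStatus", "___infohash___Descr": "uplink to core"}

for alarm in (no_descr, with_descr):
    if rule_matches(RULE, alarm):
        print("impact", RULE["fido_impact"]["value"], RULE["fido_impact"]["reason"])
    else:
        print("rule 100000 does not apply")
```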