FIDO: Recurring items and comment stickiness

When an alarm is added to the FIDO database, FIDO keeps track of when the item was first and last seen.  If the alarm clears but recurs before enough time passes, the alarm is considered to be recurring.  The FIDO daemon occasionally purges non-recurring alarms from its database, based on the rule matches defined in fido_attributes.yaml.
In the current implementation, when an alarm is =removed= from the active FIDO database, the recurrence is calculated from the YAML rules and the result is stored in the "comments" area of the FIDO database.  This means that you cannot see the actual recurrence length for a given alarm until after the alarm has been removed from the database.  This makes the data difficult to view with GUI tools, but it is present in /var/local/fido/reports/archive/, which can be examined with jq or json_dumper.  An example:

[@fido-cssc archive]$ json_dumper.pl 2023-01-26/fido.report.bin-2023-01-26-09-22-22.gz  | less
  10.130.49.3-icmpv4:
    comment_history:
    - discovery: pending
      first_seen: 1674742587
      occurrences: 1
      related_event_id: '20230126.2011'
    discovery: '1591967697'
    first_seen: 1674742587
    last_seen: 1674746362
    length: 401
    occurrences: 2
    recurring:
      matches:
        '10':
          defined: '1'
          key_match: test
      reason: default
      time: '60'
      until: 1674749962
    recurring_event_id: '20230126.3625'
    related_event_id: '20230126.2011'
You =can= take the alarm name and plug it into fido_alarm_info.cgi to see this in the GUI.
Example: https://fido-cssc.net.wisc.edu/cgi-bin/fido_alarm_info.cgi?item=10.130.49.3-icmpv4&test=icmpv4
Despite this caveat, this is more informative than the old method, in which the value was calculated internally by fido.pl and was never visible to the user.  We could change fido.pl to calculate and update the recurring value on every loop while the alarm is present, but that would cost CPU cycles and hurt scalability.
Example from fido_attributes.yaml:
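In the sample record above, until equals last_seen plus the rule's time value converted from minutes to seconds (1674746362 + 60 * 60 = 1674749962).  The following is a minimal sketch of that arithmetic, assuming the relationship holds for every rule; the AlarmRecord class and both function names are hypothetical and are not taken from fido.pl.

```python
from dataclasses import dataclass


@dataclass
class AlarmRecord:
    """Subset of the fields shown in the archived FIDO record (hypothetical type)."""
    first_seen: int          # epoch seconds
    last_seen: int           # epoch seconds
    recurrence_minutes: int  # the 'time' value of the matching rule


def recurring_until(record: AlarmRecord) -> int:
    """Epoch second at which the recurrence window closes (assumed formula)."""
    return record.last_seen + record.recurrence_minutes * 60


def is_recurring(record: AlarmRecord, reappeared_at: int) -> bool:
    """True if the alarm came back before the recurrence window closed."""
    return reappeared_at <= recurring_until(record)


sample = AlarmRecord(first_seen=1674742587, last_seen=1674746362, recurrence_minutes=60)
print(recurring_until(sample))  # 1674749962, the 'until' value in the sample record
```

An alarm that reappears after that timestamp would be treated as a new occurrence rather than a recurrence under this assumption.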
---
attributes:
  130000:
    fido_recurrence:
      reason: daily
      time: 1440    # 1 day
    matches:
      '10':
        key_match: test
        equal:
          PoEAllocatedPower: ''
          fido_campus_errors: ''
          fido_dhcp_usage: ''
          upsAdvBatteryTemperature: ''
  130010:
    fido_recurrence:
      reason: daily
      time: 1440  # 1 day
    matches:
      '10':
        key_match: test
        equal: AlphaalarmState
      '20':
        key_match: descr
        match: Battery Temperature (High|Low)
        match_re: 'true'

  # matching default
  139999:
    fido_recurrence:
      reason: default
      time: 60    # 1 hour

    matches:
      '10':
        key_match: test
        defined: '1'
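The rule selection implied by this file can be sketched as follows.  This is an illustration only, and it rests on assumptions the document does not confirm: rules are tried in ascending numeric order, every numbered entry under matches must hold, and the first rule that matches wins (which is what would make 139999 act as the default).  The function names and the in-memory RULES dictionary are hypothetical.

```python
import re

# Hypothetical in-memory form of two of the rules shown above.
RULES = {
    130010: {
        "fido_recurrence": {"reason": "daily", "time": 1440},
        "matches": {
            "10": {"key_match": "test", "equal": "AlphaalarmState"},
            "20": {"key_match": "descr",
                   "match": "Battery Temperature (High|Low)",
                   "match_re": "true"},
        },
    },
    139999: {
        "fido_recurrence": {"reason": "default", "time": 60},
        "matches": {"10": {"key_match": "test", "defined": "1"}},
    },
}


def condition_holds(cond, alarm):
    """Evaluate one numbered match entry against an alarm's attributes."""
    value = alarm.get(cond["key_match"])
    if "defined" in cond:
        return value is not None
    if value is None:
        return False
    if "equal" in cond:
        wanted = cond["equal"]
        # 'equal' appears both as a single string and as a mapping of names.
        return value in wanted if isinstance(wanted, dict) else value == wanted
    if "match" in cond:
        if cond.get("match_re") == "true":
            return re.search(cond["match"], value) is not None
        return cond["match"] in value
    return False


def recurrence_for(alarm, rules=RULES):
    """Return the fido_recurrence block of the first matching rule, or None."""
    for rule_id in sorted(rules):  # assumed: lowest rule number is tried first
        rule = rules[rule_id]
        if all(condition_holds(c, alarm) for c in rule["matches"].values()):
            return rule["fido_recurrence"]
    return None
```

Under these assumptions, an alarm with test set to icmpv4 falls through to rule 139999 and gets the 60-minute default, which agrees with the reason and time values in the archived record shown earlier.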


Legacy [as of 2023/01/26] Implementation

When an alarm is added to the FIDO database, FIDO keeps track of when the item was first and last seen.  If the alarm clears but recurs before enough time passes, the alarm is considered to be recurring.  The FIDO daemon occasionally purges non-recurring alarms from its database based on a regular expression match against the alarm name.  

These values are configured in fido.yaml:

comments_cleanse:
  '3600':
    .*: ''
  '86400':
    PoEAllocatedPower: ''
    fido_campus_errors: ''
    fido_dhcp_usage: ''
    upsAdvBatteryTemperature: ''

Currently this feature works on the "comments" area of the FIDO database.  The "comments" area has no access to alarm attributes after an alarm is cleared.  On each iteration of the saved FIDO database, for each comment whose alarm is no longer active, FIDO performs a calculation to decide whether the comment should be forgotten from a recurrence perspective.  This method allows us to change the formula for comments_cleanse [as needed] after an alarm has cleared.
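The legacy purge pass described above can be sketched as follows.  This is a rough illustration, not the fido.pl logic: it assumes the keys of comments_cleanse are retention times in seconds, that the patterns are searched within the alarm name, and that when several patterns match the longest retention applies (otherwise the .* entry would always win).  The function names, the comment layout, and the alarm names in the usage example are hypothetical.

```python
import re

# Hypothetical in-memory form of the comments_cleanse block from fido.yaml.
COMMENTS_CLEANSE = {
    "3600": {".*": ""},
    "86400": {
        "PoEAllocatedPower": "",
        "fido_campus_errors": "",
        "fido_dhcp_usage": "",
        "upsAdvBatteryTemperature": "",
    },
}


def cleanse_seconds(alarm_name, table=COMMENTS_CLEANSE):
    """Longest retention whose pattern matches the alarm name (assumed tie-break)."""
    matching = [
        int(seconds)
        for seconds, patterns in table.items()
        if any(re.search(pattern, alarm_name) for pattern in patterns)
    ]
    return max(matching) if matching else None


def purge_expired(comments, active, now):
    """Return the comments to keep: active alarms, plus cleared alarms still in retention."""
    kept = {}
    for name, comment in comments.items():
        limit = cleanse_seconds(name)
        expired = (
            name not in active
            and limit is not None
            and now - comment["last_seen"] > limit
        )
        if not expired:
            kept[name] = comment
    return kept


comments = {
    "192.0.2.1-icmpv4": {"last_seen": 1000},
    "192.0.2.2-fido_dhcp_usage": {"last_seen": 1000},
}
# Two hours after last_seen: only the one-hour default retention has elapsed.
print(sorted(purge_expired(comments, active=set(), now=1000 + 7200)))
```

Because the retention is looked up from the configuration on every pass, a change to comments_cleanse takes effect for comments of alarms that have already cleared, which is the flexibility the paragraph above describes.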

Suppose that in the future we wanted to calculate the comments_cleanse time based on an alarm attribute in the FIDO database, or on a GNMIS attribute for an associated RRD [if applicable].  We would move the calculation of comment recurrence expiry to the point when the alarm clears from the FIDO database.  This would give us access to the full alarm attribute list.  The downside to this approach is that once the value is calculated it cannot be changed, since the calculation would be tied to when the alarm was removed.  This is probably not a big deal, but it is something to consider.


Keywords: FIDO: Recurring items and comment stickiness comments_cleanse
Doc ID: 38255
Owner: Michael H.
Group: Network Services
Created: 2014-03-08 16:34:33
Updated: 2023-01-26 10:29:31
Sites: Network Services, Systems & Network Control Center