FIDO: Correlation

FIDO performs three types of correlation, described below.

1) Topology based

* unreachable IPs or nodes based on layer 3 topology [traceroute]
* service IP alarms to ICMP unreachable alarms
* ICMP unreachable alarms to known device interfaces
* LLDP/CDP based correlation, node based or interface based
* port channel/aggregated ethernet members

Correlation can occur either in a module or in fido.pl itself.  The fido.pl correlation rules are described in utils/FidoCorrelation.pm.
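
As a rough illustration of the layer 3 (traceroute) case, the sketch below treats an unreachable node as a child of the first unreachable hop on its path from the root. It is written in Python for brevity and is not FIDO code; the function name, the path data, and the node names are all hypothetical, and the real rules live in FidoCorrelation.pm.

```python
def correlate_by_path(paths, unreachable):
    """Map each hidden unreachable node to its likely parent.

    paths: {node: [hop1, hop2, ..., node]}, ordered from the root outward
           (assumed to come from traceroute data).
    unreachable: set of nodes that currently have an unreachable alarm.
    Returns {child: parent}; nodes with no unreachable hop in front of
    them are treated as root causes and are not included.
    """
    parents = {}
    for node in unreachable:
        for hop in paths.get(node, []):
            if hop in unreachable:
                # The first unreachable hop explains everything behind it.
                if hop != node:
                    parents[node] = hop
                break
    return parents


# Hypothetical topology: r1 -> sw1 -> host, and r2 -> other.
paths = {
    "r1": ["r1"],
    "sw1": ["r1", "sw1"],
    "host": ["r1", "sw1", "host"],
    "other": ["r2", "other"],
}
result = correlate_by_path(paths, {"r1", "sw1", "host"})
```

Here sw1 and host would both be correlated under r1, while r1 stays a top-level alarm.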



2) Comment based [aka human intervention]

* items that are tagged with the same comment are grouped together
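
A minimal sketch of comment-based grouping, in Python rather than FIDO's own Perl. The "comment" field name and the alarm names are assumptions made for illustration; how FIDO actually stores comments is not shown in this document.

```python
from collections import defaultdict


def group_by_comment(alarms):
    """Return {comment: [alarm names]} for alarms that carry a comment.

    alarms: {alarm name: {"comment": str or None, ...}} (hypothetical shape).
    """
    groups = defaultdict(list)
    for name, alarm in alarms.items():
        comment = alarm.get("comment")
        if comment:
            # Alarms without a comment are left for other correlation types.
            groups[comment].append(name)
    return dict(groups)


alarms = {
    "sw1 ge-0/0/1-ifOperStatus": {"comment": "fiber cut, ticket open"},
    "sw2 ge-0/0/7-ifOperStatus": {"comment": "fiber cut, ticket open"},
    "rtr9 icmp": {"comment": None},
}
by_comment = group_by_comment(alarms)
```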



3) Alarm attribute based [fido_group_correlation.yaml]

* As of 2014/03/08, this applies only to alarms that have not already been correlated by topology or by comment.  Expired comments still count as comments.
* Alarm attributes are examined against each group's match rules; on a positive match, the alarm is added to the group being evaluated.  An alarm can be part of multiple groups; groups are then evaluated in priority order.

Example:

As seen on the report

[m7h@fido-cssc scripts]$ /usr/local/ns/bin/json_dumper.pl /var/local/fido/reports/fido.report.bin | less
...
...
group_correlated:
  fp-432nm-b3a-12-node-stby134s.net.wisc.edu ifOperStatus tunnel.3-ifOperStatus:
    group: group='info=UW Milwaukee extension IPSEC tunnel'
    members:
      fp-cssc-b380g12-12-node-pri134s.net.wisc.edu ifOperStatus tunnel.3-ifOperStatus: 1
  rx-animal-226-2-core bundle-ether4-fido_campus_errors:
    group: group='rx-animal-226-2-core_threshold violation_fido_campus_errors'
    members:
      rx-animal-226-2-core tengige0-9-0-6-fido_campus_errors: 1

As seen in the config

[user@fido-cssc scripts]$ cd /usr/local/fido/etc/local_config/
[user@fido-cssc local_config]$ cat fido_group_correlation.yaml  

attributes:
  '68':
    group_eval: return 'info=' . $$item_ptr{'infohash'}{'Descr'}
    matches:
      '10':
        defined: '1'
        key_match: ___infohash___Descr
 ...
   # below will group things together such as in/out bandwidth alarms on same node.
   # but it can snarf up other unwanted stuff, so it is low priority
  '220':
    group_eval: return $$item_ptr{'device'} . '_' . $$item_ptr{'state'} . '_' . $$item_ptr{'test'}
    matches:
      '10':
        defined: '1'
        key_match: state
      '20':
        defined: '1'
        key_match: test
      '5':
        defined: '1'
        key_match: device
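
The two stanzas above can be read as "if these keys are defined on the alarm, compute a group name from them". The Python sketch below mimics that reading; it is not FIDO's Perl implementation. Several points are assumptions: that key_match uses '___' as a path separator into nested hashes, that a lower stanza number means higher priority (suggested by the "low priority" comment on '220'), and that an alarm ends up in the highest-priority group that matches. The group_eval Perl snippets are replaced by Python callables, and the alarm names are shortened placeholders.

```python
from collections import defaultdict

# Hypothetical Python mirror of stanzas '68' and '220'.
RULES = {
    68: {
        "matches": {10: {"defined": True, "key_match": "___infohash___Descr"}},
        "group_eval": lambda item: "info=" + item["infohash"]["Descr"],
    },
    220: {
        "matches": {
            5: {"defined": True, "key_match": "device"},
            10: {"defined": True, "key_match": "state"},
            20: {"defined": True, "key_match": "test"},
        },
        "group_eval": lambda item: "_".join(
            [item["device"], item["state"], item["test"]]
        ),
    },
}


def lookup(item, key_match):
    """Follow a '___'-separated path; return None if any step is missing."""
    node = item
    for part in (p for p in key_match.split("___") if p):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def rule_matches(item, rule):
    """True when every match entry agrees with whether its key is defined."""
    for _, cond in sorted(rule["matches"].items()):
        is_defined = lookup(item, cond["key_match"]) is not None
        if is_defined != cond["defined"]:
            return False
    return True


def group_alarms(alarms, rules=RULES):
    """Place each alarm in the first matching group, lowest number first."""
    groups = defaultdict(list)
    for name, item in alarms.items():
        for priority in sorted(rules):
            if rule_matches(item, rules[priority]):
                groups[rules[priority]["group_eval"](item)].append(name)
                break
    return dict(groups)


alarms = {
    "stby tunnel.3": {"device": "stby", "state": "down", "test": "ifOperStatus",
                      "infohash": {"Descr": "UW Milwaukee extension IPSEC tunnel"}},
    "pri tunnel.3": {"device": "pri", "state": "down", "test": "ifOperStatus",
                     "infohash": {"Descr": "UW Milwaukee extension IPSEC tunnel"}},
    "core bundle-ether4": {"device": "rx-animal-226-2-core",
                           "state": "threshold violation",
                           "test": "fido_campus_errors"},
    "core tengige0-9-0-6": {"device": "rx-animal-226-2-core",
                            "state": "threshold violation",
                            "test": "fido_campus_errors"},
}
grouped = group_alarms(alarms)
```

With this input the two tunnel alarms share the 'info=...' group from stanza '68', and the two error alarms fall through to the device_state_test group from stanza '220', matching the group names seen in the report.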






In a 2017/09 wan-routing meeting, while discussing FIDO, we came across the <device_topology_priority> stanza in fido_correlation.config. It currently has two uses:
1) [./bin/update/update_icmp_ips.pl]: When selecting a PTR for an IP that has more than one, the device with the highest priority wins.
2) [./lib/FidoCorrelation.pm]: When doing topology correlation, I try to rely on L2 (CDP/LLDP) or L3 (traceroute) data, based on distance from the root node.  If there is a break in topology continuity (for example, there is no CDP path from the root to a device, but there is one between two devices), the topology importance dictates which alarm should be the parent and which the child from a correlation perspective.
This value is calculated by $FIDO/bin/update/update_topology_info.pl.
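
Both uses reduce to "compare two or more devices by priority". A small Python sketch follows; it is illustrative only and not taken from the FIDO scripts. It assumes a larger number means higher priority and that unknown devices default to 0; the device names and values are made up.

```python
def pick_ptr(candidates, priority):
    """Use 1: among several PTR candidates for one IP, keep the device
    with the highest priority (assumed: larger value wins)."""
    return max(candidates, key=lambda device: priority.get(device, 0))


def pick_parent(device_a, device_b, priority):
    """Use 2: when no continuous path from the root exists, return
    (parent, child), with the higher-priority device as the parent."""
    if priority.get(device_b, 0) > priority.get(device_a, 0):
        return device_b, device_a
    return device_a, device_b


# Hypothetical priorities, standing in for the calculated values.
PRIORITY = {"core-rtr": 100, "dist-sw": 50, "access-sw": 10}

ptr = pick_ptr(["access-sw", "core-rtr"], PRIORITY)
parent, child = pick_parent("access-sw", "dist-sw", PRIORITY)
```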



Keywords: FIDO: Correlation
Doc ID: 38262
Owner: Michael H.
Group: Network Services
Created: 2014-03-08 17:35 CST
Updated: 2019-12-03 12:23 CST
Sites:Network Services, Systems & Network Control Center, University of Wisconsin System Network