FIDO: File Formats

Description of the Status Files written by FIDO tests

Want to write your own test? Here is the format of the status files.
Every test, including FIDO proper, writes a status file that contains information about the test itself and the objects related to the test.
<item data> 
   <failed-object [object name]> # appended with the test type to avoid collisions
        correlation 
               item = [string]
                    # this object is related to an upstream object
               reason = [string]
                    # text description of why this correlation was made
        device = [string]
                # used by the correlation engine
        help = [string]
                # will be displayed in the gui.  If you don't supply one, the test
                # filename will be used
        impact = [integer]
                # assumed 2 if not present
        info = [string]
                # gui mouseover or 'wideinfo' shows this information
        interface = [string]
                # used by the correlation engine
        ip = [ip address] 
                # used by the correlation engine and also used for DNS lookups
        iso = [integer]   
                # used for correlation
        keepme = [boolean]
                # keep this in the FIDO report file even if it isn't in the current status file
                # this is to support objects that aren't polled every cycle
        needed_failures = [integer]
                # number of consecutive poll failures before the object shows in the gui
        state = [string]
                # state of the tested object
        suppressed = [integer]
                # when suppression is being used, how many items are not shown because of this parent item
        test = [string]
                # the type of failure, displayed in the 'type' column, often used for correlation
        time = [integer, ticks]
                # time in ticks that the object failed during the current test period
        url = [string]
                # gui click hyperlink 

< /item data > 
< status >
        alldown = [integer]
                # false unless everything is down; otherwise the number of down items
        items = [string] 
                # total monitored items, not a count of what's down
        last_cycle_time = [integer]
        polled = [array of polled items]
                # a list of 'test' types that had been updated during this most recent cycle
                # for tests that may not be testing every test type all of the time
        report = [integer]
                # increases by one with every report, so that a missed report can be detected
        stale = [integer, seconds]
                # num of seconds before this data is stale
        start = [integer, ticks]
        threads = [integer]
                # how many threads the test is configured to use
        url = [string]
                # points to a file that contains a list of items; displayed on the FIDO status page
< /status >
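
The two defaults documented for failed objects above (impact is assumed to be 2, and help falls back to the test filename) can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary representation, and the sample values are assumptions, not part of FIDO.

```python
def normalize_failed_object(fields, test_filename):
    """Return a copy of a failed-object record with the documented defaults applied.

    `fields` is a hypothetical dict holding the key/value pairs a test would
    write for one failed object; FIDO's own in-memory form may differ.
    """
    obj = dict(fields)
    # "assumed 2 if not present"
    obj.setdefault("impact", 2)
    # "If you don't supply, the test filename will be used"
    obj.setdefault("help", test_filename)
    obj["impact"] = int(obj["impact"])
    return obj


# Example with made-up values (192.0.2.10 is a documentation address).
record = normalize_failed_object(
    {"test": "icmpv4", "ip": "192.0.2.10", "state": "down"},
    "ping_test",
)
```

Fields that the test does supply are left untouched, so an explicit impact or help value always wins over the default.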

Description of the Report Files written by FIDO

Here is a description of the report files that FIDO itself writes out.
../reports/fido.status.dat
 'layer2' => { # test name.  Entry for each test being watched.
    'status' => [string]
             # 'OK' if the test has reported recently.
             # 'MISSING' if the test status file is missing.
             # 'STALE' if the last update is older than the stale time
    'last_update' => [integer, ticks]
    'stale' => [integer, seconds]
    'last_cycle_time' => [string, seconds]
    'threads' => [integer]
             # number of threads used by the test, if applicable
    'report' => [integer]
             # most recent report number for this test
    'url' => [string]
             # link to list of items the test is watching.
    'items' => [string]
             # brief textual description of the items being watched
    'start' => [integer, ticks]
             # what time did this test start up?
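
The three 'status' values above can be derived from 'last_update' and 'stale' roughly as follows. This is a minimal sketch, assuming that ticks are Unix epoch seconds and that a test becomes stale once its last update is more than 'stale' seconds old; the function name and the exact comparison are assumptions, not FIDO's actual code.

```python
import time


def classify_test(file_exists, last_update, stale, now=None):
    """Return 'OK', 'MISSING' or 'STALE' for one watched test.

    file_exists -- whether the test's status file was found
    last_update -- time of the last update (assumed: epoch seconds)
    stale       -- number of seconds before the data is considered stale
    now         -- current time; defaults to the system clock
    """
    if not file_exists:
        return "MISSING"
    if now is None:
        now = int(time.time())
    if now - last_update > stale:
        return "STALE"
    return "OK"
```

Passing `now` explicitly makes the function easy to check against recorded status files.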

Description of fido.report.dat

The main report file that is read by the CGI

   'event_id' => [string]
       #next event will be assigned this ID

   'last_cycle_time' => [string]
       # how long it took to create the last fido report
   'fido_start_time' => [integer, ticks]

   'report' => [integer]
       # fido report number, increases by one every time a new report is written.

   'comment_correlation' =>
        'comment' => # a comment
             'best' => # the best item for this comment, as determined by FIDO.
                       # best is defined by being the item with the most things
                       # correlated to it.  Other tiebreakers are also used.
             'items' =>  # a list of items with the above comment
                 'item with the comment mentioned above'
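
The 'best' selection described above (the item with the most things correlated to it) might look like the following sketch. The additional tiebreakers FIDO uses are not documented here, so the alphabetical tiebreak below is only a stand-in, and the function and argument names are hypothetical.

```python
def pick_best(items, correlated):
    """Pick the 'best' item among the items that share a comment.

    items      -- list of item names carrying the same comment
    correlated -- dict mapping an item name to the list of things
                  correlated to it (items without an entry count as zero)
    """
    # Most correlated things first; alphabetical order is a placeholder
    # for FIDO's undocumented tiebreakers.
    return min(items, key=lambda item: (-len(correlated.get(item, ())), item))
```

For example, `pick_best(["sw2", "sw1", "rtr1"], {"rtr1": ["sw1", "sw2"], "sw1": ["host1"]})` selects `rtr1`, because two things are correlated to it.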

   'comments' =>
        # where the comments are kept
        # they are kept separate from the items themselves
        # as we keep comment information for items that get deleted.
        '$item' =>
             'comment' => comment for this item
             'date' => [integer, ticks]
                 # when this comment was last updated
             'expired' => {
                 'comment' => 'Lab Admin Power down - No Chng Rec.',  <--- previous comment
                 'date' => '1422453633',                              <--- previous comment date
                 'expired_date' => '1422453635',                      <--- timestamp comment was expired
                 'user' => 'wfoster1'                                 <--- previous comment username
             },
             'first_seen' => [integer, ticks]
                 # time the alarm was added
             'last_seen' => [integer, ticks]
                 # time the alarm was removed
             'length' => [integer, ticks]
                 # length in seconds that this alarm has been historically alarming
             'occurrences' => [integer]
                 # number of times the alarm has been seen
             'related_event_id' => [string]
                 # can be used to link to an event in the sql log DB
             'recurrent_event_id' => [string]
                 # can be used to link to an event in the sql log DB
             'user' => [string]
                 # the user that entered the comment

   'correlated' =>
        # a list of items and what things are correlated to those items

   'group_correlated' => {
        '$item' => {
             'group' => evaluated name of the group
             'members' {
                 '$item 1' = 1
                 '$item 2' = 1
             }
        }
   }

 'item_data' =>
       # where item information is actually kept
       'correlation' OR 'test_supplied_correlation' =>
           # correlation information either supplied by FIDO or ported
           # from the 'test_supplied_correlation' fields.  The correlation
           # came from FIDO intratest correlation if 'test_supplied_correlation'
           # is not defined.
           'item' => [string]
               # parent object that this item is correlated to
           'reason' => [string]
               # the reason the object was correlated
       'descr' => [string]
           # a textual description of the item derived from
           # dns or from device/interface/ip information as appropriate
       'device' => [string]
           # the related device as determined by correlation or supplied by the test
       'event_id' => [string]
           # event_id of this event.
       'failures' => [integer]
           # how many consecutive failures this item has suffered
       'file' => [string]
           # filename the object came from
       'help' => {
             '$helpfile_name' {
                 '$helpfile_name_source' # ip, rule number
              }
        }
        'holddown' => {
               matches = [hash]
                   # describes why the item is on holddown
               reason = [string] 
               time = [integer]
                   # time in minutes to hold down alarm based on alarm start time
               absolute_time = [string]
                   # time in text to hold down alarm.
        }
       'impact' =>
           # holdback to using Clarify on campus
           'reason' = [string]
               # optional
           'value' = [integer]
               # 1 = top priority,
               # 2 = normal priority,
               # 3 = low priority,
               # 4 = informational
       'incomplete' => {
             # for node alarms, which subtests failed?
             # example:
             #     'entSensorValue' => '1',
             #     'ifAdminStatus' => '1',
             #     'ifOperStatus' => '1'
        },
       'info' => [string]
           # more info about the source of an alarm.  For example, a value of
           # 'source: snmp domain  ' implies that the IP was discovered via SNMP
           # and via DNS (some IPs get included in monitoring via AXFR + parsing).
       'interface' => [string]
           # the related interface as determined by correlation or supplied by the test
       'ip' => [string]
           # the related IP as determined by correlation or supplied by the test
       'iso' => [integer]
           # used by the correlation engine.  The lower the value, the more important to the correlation engine.  
           # example, ip=3, service=4, so service is correlated to IP, not vice versa.       
       'needed_failures' = [integer]
               # number of consecutive poll failures before the object shows in the gui

       'rtt' = [integer]
               # supplied by some tests to describe how long it took to verify this particular alarm. 
       'start' => [integer, ticks] # the time this object was first noticed.
       'start_text' => [integer, ticks]
           # the time this object was first noticed, in text.  This is so that
           # the GUI can search for this data.
       'state' = [string]
                # state of the tested object
       'status' => [string]
           # see fido.status.dat; represents the status of the reporting daemon, not the alarm itself.
       'subnet' => '144.92.26.130/25',
       'suppressed' => [integer]
              # number of suppressed objects.  See: FIDO: Object Suppression 
       'test' = [string]
                # the type of failure, displayed in the 'type' column, often used for correlation
       'time' => [integer, ticks]
           # last time this item was updated
       'time_of_day' => {
               matches = [hash]
                   # describes why the item is on holddown
               actual_match = [string]
               reason = [string] 
               time = [string]
                   # time in text to hold down alarm.
        }
       'ts_device' => [string]
           # like 'device', but supplied by the test instead of calculated by FIDO.
       'updated' => [boolean]
           # whether or not the object has been updated during the most recent
           # polling period.  This is used to determine if an object should be
           # removed from the report file.  The item is removed if the item
           # itself isn't updated but the file it came from has been. 

       'via_connected' => [boolean]
           # describes whether the 'device' key/value pair for this alarm was derived from upstream network topology correlation
           # i.e., an icmpv4 failure of host 144.92.9.2/24, where 144.92.9.2 is -not- associated with a device, would list the gateway device under 'device'
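
Three of the per-item rules described above can be sketched in Python: when an item shows in the gui ('failures' versus 'needed_failures'), when an item is dropped from the report ('updated' versus the file it came from), and which of two items becomes the parent ('iso'). The function names are hypothetical, the interaction with the status-file 'keepme' flag is an assumption, and the behaviour for equal 'iso' values is not documented, so the sketch simply declines to pick a parent in that case.

```python
def is_visible(failures, needed_failures):
    """An object shows in the gui after enough consecutive poll failures."""
    return failures >= needed_failures


def should_remove(item_updated, file_updated, keepme=False):
    """An item is removed if it was not updated but its source file was.

    keepme mirrors the status-file field of the same name; treating it as
    an override here is an assumption.
    """
    return file_updated and not item_updated and not keepme


def correlation_parent(item_a, item_b):
    """Return (child, parent) based on 'iso'; the lower value is the parent.

    Each item is a dict with at least an 'iso' key.  Returns None when the
    values are equal, because that case is not described in this document.
    """
    if item_a["iso"] == item_b["iso"]:
        return None
    if item_a["iso"] < item_b["iso"]:
        return (item_b, item_a)
    return (item_a, item_b)
```

With the example from the 'iso' description (ip=3, service=4), `correlation_parent` returns the service item as the child and the IP item as the parent.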





Keywords: FIDO: File Formats   Doc ID: 9171
Owner: Michael H.   Group: Network Services
Created: 2009-02-17 19:00 CDT   Updated: 2015-02-02 11:48 CDT
Sites:Network Services, Systems & Network Control Center, University of Wisconsin System Network, WiscNet