DoIT Operational Framework – Section 6.0 - Event Management

6.1 Background

Purpose – Event Management is the procedural framework by which event monitoring is organized and executed within DoIT.

Objectives:

  1. Establish and document the monitoring framework we use for DoIT and the campus subscribers who use DoIT services.
  2. Describe our event management process: its methodology, the collection and use of monitoring data, and the integration of event monitoring with other ITIL processes.

6.2 Roles and Responsibilities

Technical & Application management

Technicians (such as Systems Engineering (SE)) and developers (such as Academic Navigator or Data Resource Management Technology (DRMT)) may identify monitoring requirements. They may also generate events from their own analysis applications (such as Nagios or Oracle Enterprise Manager), which in turn are used to create Event Monitoring events.

IT Operations Management

The Duty Manager role is filled by managers who are on call on a rotating basis. When a situation exceeds the scope of documented Event Management procedures, or has significant enough impact to warrant higher management attention, the Duty Manager is called to provide guidance.

Systems Network Control Center (SNCC)

The SNCC is the 24x7 staff that takes action on problems requiring elevation. They create problem entries in WiscIT for the automated events they receive on their Consolidated Console and their FIDO console, and for problems transferred to them from the Help Desk.

Event Management and Monitoring Team

The developers and administrators of the enterprise Event Management applications.

6.3 Event Management Framework

As defined in Section 2 of the Operational Framework, an event is a change of state that has significance for the management of an IT service or other configuration item (CI). At the lowest level, events provide information to help manage the day-to-day operation of IT services. This section does not discuss the role events play at that level, as it is an operations management issue. The event processes identified in this section focus on events that have a higher likelihood of indicating an incident and/or problem. Handling of these events may range from 1) simply sending an automated e-mail to increase awareness of an event, to 2) adding direct contact notifications for an event, to 3) invoking the Incident/Problem process procedures of Section 4.0 of the Operational Framework.

6.3.1 Event Sources 

Events can originate from a variety of sources:

  • Dedicated event monitoring applications, such as Micro Focus Virtual User Generator (VuGen) or the agent software of the event management server
  • Operations management systems, such as Nagios XI or Oracle Enterprise Manager
  • On-demand cloud computing platforms via REST APIs

6.3.2 Event Format for WiscIT   

All monitoring events should have a reference Configuration Item (CI) in the Configuration Management Database (CMDB). The CI entry documents the support and notification information for the monitoring event and is the official reference source the SNCC uses to handle events that are elevated to their Consolidated Console view. Events elevated to WiscIT will include such details as the addressees for e-mail notifications, the level of notification required (that is, e-mail only, direct contact during working hours only, or direct contact 24x7), and event handling instructions for SNCC operators. The CMDB CI also contains a tab listing all the events received for that CI, for archive reference.
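As an illustration only, the information a reference CI carries for a monitoring event could be modeled as below. This is a minimal sketch, not the WiscIT CMDB schema: the class name, field names, and the sample CI ID and address are all hypothetical; only the three notification levels come from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum


class NotificationLevel(Enum):
    """The three notification levels named in 6.3.2."""
    EMAIL_ONLY = "e-mail only"
    DIRECT_CONTACT_WORKING_HOURS = "direct contact during working hours only"
    DIRECT_CONTACT_24X7 = "direct contact 24x7"


@dataclass
class ConfigurationItem:
    """Hypothetical shape of a CMDB CI record (field names are assumptions)."""
    ci_id: str
    email_addressees: list[str]
    notification_level: NotificationLevel
    handling_instructions: str  # instructions for SNCC operators
    event_history: list[str] = field(default_factory=list)  # stands in for the CI's event tab

    def record_event(self, summary: str) -> None:
        """Archive an event summary against this CI."""
        self.event_history.append(summary)


ci = ConfigurationItem(
    ci_id="CI-0001",  # hypothetical ID
    email_addressees=["oncall@example.edu"],  # hypothetical address
    notification_level=NotificationLevel.DIRECT_CONTACT_24X7,
    handling_instructions="Call the primary administrator.",
)
ci.record_event("disk usage above threshold")
```

Keeping the notification level as an enumeration rather than free text means an event can only ever request one of the documented levels.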

6.3.3 Event Preprocessing before WiscIT   

WiscIT is the main ITSM application that processes elevated events, but events must first be preprocessed by the event management server. This server does the following:

  • Collects events from event sources via its own agent, REST API, or e-mail.
  • Ensures the event has a valid CI record ID from the WiscIT CMDB.
  • Buffers events if the WiscIT application is down.
  • Filters out event updates that may be procedurally relevant for preprocessing but are not operationally significant enough for elevation.
  • Suppresses event storms so that they do not impact WiscIT.
  • Simplifies event correlation for WiscIT (for example, by automating the process so that a dashboard displays only the most recent change of state of an event where possible).

Figure: Flow diagram for event handling
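The preprocessing steps listed in 6.3.3 can be sketched roughly as follows. This is an illustrative stand-in, not the event management server's actual logic: the `Event` and `Preprocessor` names, the "drop repeated states" filter rule, and the storm threshold of 5 events per 60 seconds are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ci_id: str        # CI record ID the event refers to
    state: str        # e.g. "OK", "WARNING", "CRITICAL"
    timestamp: float  # seconds


class Preprocessor:
    """Illustrative stand-in for the event management server."""

    def __init__(self, valid_ci_ids, storm_limit=5, storm_window=60.0):
        self.valid_ci_ids = set(valid_ci_ids)
        self.storm_limit = storm_limit    # assumed: max accepted events per CI per window
        self.storm_window = storm_window  # assumed: window length in seconds
        self.last_state = {}              # ci_id -> most recent state seen
        self.recent = {}                  # ci_id -> timestamps of accepted events
        self.buffer = deque()             # events held while WiscIT is down

    def accept(self, event):
        """Return True if the event should go on to WiscIT."""
        if event.ci_id not in self.valid_ci_ids:
            return False                  # no valid CI record ID
        if self.last_state.get(event.ci_id) == event.state:
            return False                  # no change of state: filtered out
        self.last_state[event.ci_id] = event.state
        times = self.recent.setdefault(event.ci_id, deque())
        while times and event.timestamp - times[0] > self.storm_window:
            times.popleft()               # forget events outside the window
        if len(times) >= self.storm_limit:
            return False                  # event storm: suppressed
        times.append(event.timestamp)
        return True

    def forward(self, event, wiscit_up, send):
        """Queue accepted events and flush the queue when WiscIT is reachable."""
        if self.accept(event):
            self.buffer.append(event)
        if wiscit_up:
            while self.buffer:
                send(self.buffer.popleft())


sent = []
pre = Preprocessor(valid_ci_ids={"CI-0001"})
pre.forward(Event("CI-0001", "CRITICAL", 0.0), wiscit_up=False, send=sent.append)
pre.forward(Event("CI-0001", "CRITICAL", 1.0), wiscit_up=False, send=sent.append)  # repeated state
pre.forward(Event("CI-9999", "CRITICAL", 2.0), wiscit_up=False, send=sent.append)  # unknown CI
pre.forward(Event("CI-0001", "OK", 3.0), wiscit_up=True, send=sent.append)
```

In this run the first event is buffered while WiscIT is down, the repeated state and the unknown CI are dropped, and the final state change flushes both buffered events in order.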

6.3.4 Event Handling in WiscIT

The WiscIT application receives events from the event management server via a REST API interface. Each event is placed in an event table in WiscIT and handled according to the information in its reference CI. Regardless of any other handling, the event is viewable thereafter from the event tab of its assigned CI. For nearly all events, an e-mail is also generated upon arrival and sent to the Primary and Secondary administrators identified in the CI record, plus any stakeholders in the CI stakeholders field identified to receive “Changes and Monitoring.” If the event meets the criteria in Section 6.4, it may be elevated to the SNCC Consolidated Console for action by the SNCC staff per Incident/Problem management guidelines and any additional instructions specified in the Support tab of the event’s CMDB CI.
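The handling sequence described in 6.3.4 might look like the sketch below. It is a simplified illustration under stated assumptions, not WiscIT's implementation: `CIRecord`, `handle_event`, and the `elevate_to_sncc` flag (standing in for the elevation criteria) are hypothetical, and the sketch e-mails on every event even though the text says only "nearly all" events generate one.

```python
from dataclasses import dataclass, field


@dataclass
class CIRecord:
    """Hypothetical CI record; field names are assumptions."""
    ci_id: str
    primary_admin: str
    secondary_admin: str
    # stakeholder address -> notification topics that stakeholder receives
    stakeholders: dict[str, set[str]] = field(default_factory=dict)
    elevate_to_sncc: bool = False  # assumed flag standing in for the elevation criteria
    events: list[str] = field(default_factory=list)  # stands in for the CI's event tab


def handle_event(ci, summary, event_table, sncc_console):
    """Store the event, choose e-mail recipients, and elevate if the CI calls for it."""
    event_table.append((ci.ci_id, summary))  # event table entry
    ci.events.append(summary)                # always visible on the CI's event tab
    recipients = [ci.primary_admin, ci.secondary_admin]
    recipients += sorted(
        addr for addr, topics in ci.stakeholders.items()
        if "Changes and Monitoring" in topics
    )
    if ci.elevate_to_sncc:
        sncc_console.append((ci.ci_id, summary))  # shown on the Consolidated Console
    return recipients


ci_record = CIRecord(
    ci_id="CI-0001",  # hypothetical values throughout
    primary_admin="a@example.edu",
    secondary_admin="b@example.edu",
    stakeholders={
        "c@example.edu": {"Changes and Monitoring"},
        "d@example.edu": {"Outages"},
    },
    elevate_to_sncc=True,
)
event_table, sncc_console = [], []
recipients = handle_event(ci_record, "service down", event_table, sncc_console)
```

Here only the stakeholder subscribed to "Changes and Monitoring" joins the two administrators on the recipient list, and the event appears on the console because the CI is flagged for elevation.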

6.4 Event Management Requirements Process

Requirements for monitoring are officially submitted via Service Support Initiation or by submitting a WiscIT Monitoring Change Request (see Event Management and Monitoring Request for DoIT Supported Services (Procedure)). Other Event Management inquiries and requests may be sent directly by e-mail to the Event Management and Monitoring Team (monitoring@doit.wisc.edu).



Keywords: operational framework event management monitoring
Doc ID: 43867
Owned by: Sarah M. in ITSM
Created: 2014-10-07
Updated: 2023-06-09
Sites: DCTeam, DoIT Continuity of Operations Plan (COOP), DoIT Help Desk, DoIT IT Service Management, DoIT Staff, Event Management and Monitoring, Network Services, Systems & Network Control Center, Systems Engineering, Systems Engineering and Operations