UW Service Center/DoIT - Mainframe Outage Communications Plan
In the event of a mainframe outage, the communications pathways in the linked diagram represent the desired state for handling communications about the outage.
This plan is about "closing the communications loop" with respect to Impact One incidents. That means establishing a clearly defined process to meet the three most important elements of successful incident management: knowing how and to whom to report incidents once they’re discovered; notifying/mobilizing the resources required to resolve the incident as quickly and efficiently as possible; and reporting the status of the incident to the full complement of stakeholders.
The diagram is divided into three layers: an End User or "outward facing" layer, a Technical or "inward facing" layer, and an Incident Management or "Communications Hub" layer.
The End User layer
As a group, UW-Madison, the UW Service Center, and UW System are essentially the only entities in direct communication with the Incident Management or Communications Hub layer. This direct communication will primarily take the form of the DoIT Help Desk (via phone, web page, or an email to the Impact One mail list), the DoIT Service Outage web page, a broadcast news item, or the Infra system. Note that the Service Center itself may act as a Communications Hub for UW-Madison, UW System, and even users external to UW.
How this layer reports incidents: Notify UW Help Desk via phone, email, or web site.
How this layer learns status of incidents: UW-Madison, the Service Center, and UW System would use the DoIT Help Desk, the DoIT Service Outage Page, or a broadcast news item, while external entities would rely on the Service Center and the DoIT Help Desk (Outage Page).
The Incident Management layer
The DoIT Help Desk and SNCC (Systems & Network Control Center) work very closely together and each will be informed of the existence and nature of an Impact One incident.
Ideally, the Help Desk will be notified of all Impact One incidents discovered by the End User layer, and all "official" DoIT communications will be delivered via the Help Desk (through the Outage Page, a broadcast announcement, or the Impact One mail list).
How this layer reports incidents: SNCC will notify the Help Desk and the Technologists layer via phone or chat. The Help Desk will notify SNCC via phone or chat, and will notify the End User layer via the Outage Page, a broadcast news item, the Impact One mail list, and Infra.
How this layer learns status of incidents: The Help Desk receives status from SNCC. SNCC receives status from the Technologists layer.
The DoIT Technologists layer
This layer contains the various DoIT groups responsible for maintaining services from a technical standpoint (programming, infrastructure, etc.). Ideally, Impact One incidents discovered by the other layers will reach this layer only via SNCC.
How this layer reports incidents: Techs should report incidents to each other AND to the Incident Management layer (SNCC, minimally).
How this layer learns status of incidents: From other entities within the same layer, or from SNCC. Again, information (nature of the issue, expected uptime, etc.) MUST always be shared with Incident Management.
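The status pathways described for the three layers can be read as a simple chain: Technologists inform SNCC, SNCC informs the Help Desk, and the Help Desk informs the End User layer. The following is a minimal illustrative sketch of that chain, not part of the plan itself. The names STATUS_SOURCE and status_path are hypothetical, and the sketch simplifies by treating the Technologists layer as the origin of status information and by ignoring the Service Center's relay role for external entities.

```python
# Illustrative sketch only: maps each party to the party it
# receives incident status from, as described in the layer sections.
# Names here are hypothetical and not defined by the plan.
STATUS_SOURCE = {
    "End User": "Help Desk",   # via Outage Page, broadcast news item, etc.
    "Help Desk": "SNCC",
    "SNCC": "Technologists",
}


def status_path(recipient):
    """Return the chain of parties a status update passes through,
    from its origin to the given recipient."""
    path = [recipient]
    # Walk backwards until we reach a party with no upstream source.
    while path[-1] in STATUS_SOURCE:
        path.append(STATUS_SOURCE[path[-1]])
    return list(reversed(path))


if __name__ == "__main__":
    print(" -> ".join(status_path("End User")))
```

Under these assumptions, a status update destined for the End User layer always passes through both SNCC and the Help Desk, which is why information must always be shared with Incident Management.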
- The top layer can be considered "outward facing" constituents. Generally speaking, Help Desk is responsible for "outward facing" communications.
- The bottom layer can be considered "inward facing" constituents. Generally speaking, SNCC is responsible for "inward facing" communications.
- The middle layer is the "Communications Hub". Ideally, this layer should be made aware of ALL Impact One incidents as soon as they are discovered. Also, all “official” outward facing DoIT communications should originate from the middle layer.
- Any layer can discover an Impact One incident. Regardless, once discovered, the middle layer (DoIT Help Desk and SNCC) should be notified first.
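The routing rule summarized in the points above can be sketched in code. This is only an illustration under stated assumptions: the function name notification_order and the party labels are hypothetical, and the exact ordering of steps after the middle layer has been notified is an assumption, since the plan only requires that the middle layer be notified first.

```python
# Illustrative sketch only; names and step ordering after the hub
# is notified are assumptions, not defined by the plan.
HUB = ("Help Desk", "SNCC")  # the middle "Communications Hub" layer


def notification_order(discovered_by):
    """Return ordered (sender, recipient) pairs for an Impact One incident."""
    steps = []
    if discovered_by == "End User":
        # End users report to the Help Desk.
        steps.append(("End User", "Help Desk"))
        hub_entry = "Help Desk"
    elif discovered_by == "Technologists":
        # Technologists report to SNCC, minimally.
        steps.append(("Technologists", "SNCC"))
        hub_entry = "SNCC"
    elif discovered_by in HUB:
        hub_entry = discovered_by
    else:
        raise ValueError(f"unknown discoverer: {discovered_by!r}")

    # The two halves of the hub inform each other.
    other = "SNCC" if hub_entry == "Help Desk" else "Help Desk"
    steps.append((hub_entry, other))
    # SNCC handles inward facing communications.
    steps.append(("SNCC", "Technologists"))
    # Help Desk handles outward facing communications.
    steps.append(("Help Desk", "End User"))
    return steps


if __name__ == "__main__":
    for sender, recipient in notification_order("End User"):
        print(f"{sender} notifies {recipient}")
```

Whichever layer discovers the incident, the sketch routes the first notification to the middle layer and leaves outward facing communication to the Help Desk and inward facing communication to SNCC.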