DoIT Operational Framework - Section 8.0 - Continuity of Operations (COOP)
This Continuity of Operations Plan is for internal DoIT use and UW-Madison Police Department use only. Please contact the COOP Coordinator for a redacted version if needed for any other purpose.
The DoIT Continuity of Operations Plan (COOP) (the Plan) has been developed to protect the well-being of staff, students, and visitors and to mitigate risk to the continued operations of essential IT services and IT functions under disruptive emergency conditions for both campus and DoIT.
In conjunction with expertise from the University of Wisconsin - Madison Police Department (UWPD), the Plan provides DoIT with a framework for business continuity and establishes guidance for essential IT service execution in the face of a natural or human-caused threat to or major disruption of IT Service operations.
I. Introduction
The DoIT Continuity of Operations Plan (COOP) (the Plan) has been developed to protect the well-being of staff, students and visitors, as well as to mitigate risk to the continued operations of essential IT services and IT functions under disruptive emergency conditions for both campus and DoIT.
In conjunction with expertise from the University of Wisconsin–Madison Police Department (UWPD), the Plan provides DoIT with a framework and guidance for business continuity in the face of a natural or human-caused threat or major disruption of IT service operations for any reason.
A. Executive Overview
The Plan includes policies, practices, and procedures to ensure the continuity of DoIT operations before and after disruptive emergency conditions. Provisions of the Plan include:
» Emergency procedures to follow in the event of a major disruption:
- Event detection
- Human resource safety and support
- Property protection and security
- Communications and overall direction of recovery procedures
» Organization of teams to manage response and recover IT Service operations
» Prioritization criteria for the recovery of IT services: Service Tiers (also known as COOP Tiers)
» Framework for support and recovery of all systems, applications, data and network services after a disruptive emergency
The Plan applies the designated prioritization criteria of Service Tiers (also known as COOP Tiers) to establish DoIT service priorities. These service priorities form a basis for identifying service needs and resolution goals (as defined in the DoIT Operational Framework – Section 4 – Incident Management, Section 4.2 - Incident Response Guidelines), capabilities for meeting these goals, and any associated gaps. Remediation of gaps, including prioritization of projects and associated funding, is a responsibility of DoIT leadership in consultation with DoIT governance groups and partners/customers.
Throughout this documentation, the terms COOP, Business Continuity Plan, and the Plan are used interchangeably.
B. Audience
The primary audiences for the Plan include DoIT management, technologists, and business administrators responsible for ongoing plan implementation, maintenance, testing, training and activation. Secondary audiences include customers of DoIT products and services, University administration, University Police responsible for Continuity of Operations Planning (COOP) and internal auditors.
C. Purpose
DoIT provides a wide range of technology services that are critical to the successful operation of the University. It is essential that DoIT ensures the availability and reliability of critical IT services within its purview. The goals of the COOP are to:
» Provide for the safety of DoIT employees, students, and visitors in the event of an emergency
» Establish infrastructure for communications, command center, and alternate site work for DoIT, where needed, to support timely IT service recovery
» Focus on the Plan procedures necessary to maintain or resume IT services within reason and priority, thereby minimizing impact to campus operations and end users
» Prepare and provide advanced information and education to DoIT employees regarding their roles and responsibilities following an emergency disruption of DoIT operations
» Inform DoIT leadership for emergency IT service management and coordination
» Protect hosted data should an emergency disruption of operations occur
» Protect and minimize the potential loss of property, assets and resources during an emergency disruption
During an emergency, it is critical that DoIT prioritizes its activities to focus on the safety and welfare of its personnel while maintaining continuity of essential DoIT IT services and protecting high-risk data. The prioritization of essential IT functions may require delaying the restoration of non-essential IT functions. Services deemed essential IT functions may vary according to the nature of the disruptive emergency and campus needs.
The Plan delineates roles, responsibilities, and processes to be followed immediately after a major IT service disruption. The Plan is designed to be as threat-independent as possible, while at the same time allowing for flexibility when threat-specific response is needed.
All DoIT service management data pertinent to COOP is exported quarterly to an offsite secure cloud location. Critical information, such as the most current prioritized list of DoIT services, key vendor information, and staff contact information, is stored in three locations:
» In the DoIT Cherwell IT Service Management application, branded as WiscIT
» In a cloud storage location, on BOX
» On USB flash drives, distributed to key roles
D. Applicability and Scope
This overarching COOP documents a broad recovery framework for DoIT services and associated infrastructure. The scope of this Plan extends to any major event that threatens or disrupts normal DoIT service operations, DoIT data centers, network supernodes, and other critical IT infrastructure, including critical IT services provided by third parties, regardless of whether the disruption is human-caused, a technological infrastructure failure, or a natural disaster.
While the COOP framework may be used as a model for identifying disaster recovery requirements for each specific DoIT service, individual DoIT IT services may have business continuity plans specific to their areas. Recovery plans for each specific IT service are the responsibility of IT service owners, project managers and technical team leads. These should be referenced for more detailed information about respective operations.
Not all disruptions are managed through this Plan. Problems with a likely IT root cause impacting DoIT services are managed by the DoIT Operational Framework Problem Management procedures. Major disasters that have a broad impact on the campus at large, including health, safety, or order, are covered by UW-Madison Emergency Management procedures.
IT service problems that potentially jeopardize staff safety, physical access to or the structural integrity of the data center facility, or the physical integrity of equipment, as well as major power outages of extended duration, may warrant activation of the DoIT Continuity of Operations Plan.
E. Assumptions
The DoIT Plan is based on a realistic approach to problems likely to be encountered during a major disruption or emergency. The assumptions listed below should be used as general guidelines in such an event.
- TIMING - An emergency or a disaster requiring activation of the COOP may occur at any time of the day or night, on a weekday, weekend, or holiday, with little or no warning.
- VARIABILITY - It is assumed that every emergency is different, and a document such as the Plan cannot always be followed to the letter in each COOP activation. The Plan is meant as a guide for the UW-Madison Division of Information Technology.
- NOT PREDICTABLE - The succession of events in an emergency or disaster is not predictable; therefore, published operational plans, such as this plan, should serve only as a guide and a checklist.
- DEVELOPING CONDITIONS - An emergency or a disaster may be declared if information indicates that such conditions are developing and probable.
- COMMUNITY - Disasters may be community-wide. Therefore, it is necessary for DoIT to plan for and carry out disaster response and short-term recovery operations in conjunction with other campus and local resources.
- COMMUNICATION TOOLS - Communication and collaboration tools are available.
- ELECTRICITY - There are still some power resources available for IT infrastructure on the campus.
- SERVICE LOSS - There is a significant or total loss of one or more of the essential services or infrastructure of DoIT.
- ALTERNATE SITE - One of the DoIT COOP alternate operation facilities is functional.
- STAFFING - There is a significant, not total, loss of DoIT staff. DoIT staffing levels remain high enough to continue core essential IT services.
- NETWORK - There is a significant, not total, loss of the DoIT administrative or information technology networks.
- ASSISTANCE TO CAMPUS - The major role of DoIT in a COOP activation would be to assist other campus groups, and potentially community groups, in the event of an emergency. DoIT manages IT resources that can assist in campus disruptions.
- FACILITY AVAILABILITY - Most of the facilities on campus have not been affected by the disaster.
II. COOP by Phase
COOP - Process Overview
Response activities to an emergency disruption can be classified in four phases.
Phase 1.0 - COOP Activation
The Activation phase includes procedures for disruption impact assessment, COOP activation, notification, leadership and initial actions. The following provides information for determining when and how to activate the Plan and provide immediate guidance to relevant parties.
DoIT COOP activation scalability
COOP activation is scalable. The extent of Plan activation depends on the scope of the disruption. In some cases, only specific continuity personnel may be activated to perform the tasks associated with impacted IT services, or all personnel that are telework-capable may be activated, while work that necessitates being onsite continues. In other cases, the disruption may warrant a full activation, where all personnel must conduct their activities offsite. Each COOP activation will be disruption-specific according to the impact or potential impact.
Levels of COOP activation are provided in Table 1.
Table 1: Levels of COOP Activation
Level of Activation | Action | Examples |
---|---|---|
Partial Activation | Partial activation may focus on: » Assessing, servicing, restoring, or supporting impacts to building infrastructure or equipment, such as generators or back-up power » Activation of Lines of Succession in the event that Continuity Personnel are on vacation, sick, or otherwise unavailable » Impacts to a portion of a DoIT data center that do not impact the health or safety of building occupants » Availability of IT resources or other IT-related activities, such as retrieving backup files from other sites, or relying on manual documentation | » Impending winter weather event » Loss of campus internet or network access » Continuity Personnel are unavailable due to business travel |
Full Activation | » Complete evacuation of a DoIT office building or facilities closure to non-continuity personnel or the implementation of lines of succession » Events requiring containment, decontamination, and risk communications » Long-term impacts to DoIT building infrastructure » Significant events on or off campus resulting in the unavailability of large numbers of staff | » Large-scale influenza epidemic » A hazardous materials event » Building fire resulting in significant damage |
DoIT COOP coordination in multi-tenant buildings
DoIT will coordinate with other occupants of DoIT buildings regarding activation of their respective COOP. In situations where there is no impact to health and safety of buildings that DoIT occupies, DoIT may continue normal operations, or conduct a partial activation.
Notification
Initial notification procedures are critical to personnel safety and orderly response to a major problem. The Senior Systems and Network Control Center (SNCC) staff on duty, in consultation with the SEO Duty Manager and the Systems Engineering & Operations (SEO) Director, are responsible for following the data center notification procedures described in step 1.1.1 of Table 3 in this DoIT COOP document.
Order of Notification is detailed in Appendix C: Order of Notifications in this document.
If an immediate danger exists within a DoIT building, a member of the Incident Management team will immediately notify the Building Manager that an incident has occurred and that action is being taken to respond to it. Building occupants should be directed according to the hazard-specific procedures outlined in the UW-Madison Police Emergency Procedures Guide. Note that certain incidents (e.g. a fire or active shooter event) may require immediate response/direction.
Incident leadership and command center
When a physical location for managing incidents is required, the Executive Management Team (EMT) designates a gathering space, or “command center”, where the team can coordinate activities - see Table 2: Command center locations. The command center should have available all necessary supplies, maps, key contact information, telephone lines and computers. The command center should also contain stand-alone copies of institute emergency policies and procedures.
Table 2: Command center locations
Command Center Location | List of Stored Supplies |
---|---|
Primary Location: DoIT location, communicated to responding staff, if secured from major disruption | » Facility Maps » EOP or Evacuation Policies and Procedures » Laptops and telephone access » Staff contact information |
Secondary Location: Office suite in recovery site building | » Facility Maps » EOP or Evacuation Policies and Procedures » Laptops and telephone access » Staff contact information |
Orders of Succession
During COOP operations, if members of the Executive Management Team or Incident Response team are unable or unavailable to fulfill essential duties, successors have been identified to ensure there is no lapse in essential decision-making authority. For more information see Appendix A: Orders of Succession.
Delegation of Authority
Members of the Executive Management Team or Incident Response team leaders have the authority to make critical decisions regarding response or continuity operations. Executive Management Team or Incident Response team leaders can designate individuals or positions, and their successors, to make these decisions in their stead if necessary. For more information see Appendix B: Delegation of Authority.
Communications
Communications and initial notification procedures are critical to personnel safety and orderly response to a disruptive emergency.
For all detailed procedures, see
- Appendix C: Order of Notifications
- Checklists: Executive Management team/Communications Consultant checklist in this plan.
Delivering messaging during a COOP activation
During a COOP activation, continuity personnel providing messaging must:
- Ensure that all external communication occurs through the DoIT Communications Director or delegate
- Coordinate communication activities at facilities with UW Police, campus, DoIT Operations, and Cybersecurity
- Ensure internal and external communications are accurate, timely, and informative
- Provide frequent updates to staff and building occupants to mitigate concerns and manage expectations
- Share only known/confirmed information (i.e., do not speculate)
- Use one unified voice to avoid confusion or misinformation
The DoIT Communications team should provide guidance to staff on where to direct incoming inquiries from personnel, media, vendors, etc. Incoming inquiries should be tracked so that the people who made them can be contacted for follow-up.
Notifications regarding a possible physical relocation will be sent to all continuity personnel by the Communications Consultant. Supervisors and managers will provide information to non-continuity personnel regarding relocation or possible long-term work shortages at the guidance of the Executive Management Team and Human Resources team.
Public relations
The DoIT Communications Director, in consultation with the Executive Management Team, is responsible for public relations and all communication to the public during an active disruption. The DoIT COOP Communications Director/Consultant works in collaboration with the UW Emergency Management Unit and UW Communications. DoIT Communications can use the DoIT Outage Communications checklist, located offsite on BOX, for general guidance.
Pre-Scripted Messaging
DoIT Communications maintains templates of pre-scripted messages to be tailored and used as appropriate when the DoIT COOP is activated. Communications media may include, but are not limited to: UW–Madison emergency web pages, the UW-Madison landing page, the Outages pages, email, Facebook, Twitter, and emergency text messaging through the RAVE system.
Internal Forms of Communication
Communications internal to DoIT may vary depending on technologies available and best suited to the emergency disruption. These may include but are not limited to: MS Teams’ DoIT Operations channel, Google, Out-of-Band chat rooms, phone bridges, telephone and cell phone communications, message relays via telephone trees, conference calls, texting, email, and staff messengers (“runners”) when physical presence does not risk staff life/health/safety.
COOP activation procedure detail
Table 3: COOP Phase 1.0 - Activation Procedures
COOP PHASE 1.0 - Activation
1.1 Identify disruption as a potential COOP activation
1.1.1 Follows standard procedures in responding to a major disruption:
- Detects major disruption(s) directly or via monitoring devices or alerts.
- Assesses extent of major disruption. If emergency services are required:
  - Calls (9) 911 for emergency services and physical security.
  - Calls 3-3333 to arrange for Physical Plant facility services.
- Evacuates self from the facility if the situation threatens physical safety. At no time should the physical safety of employees be jeopardized.
- Building occupants should be directed to the hazard-specific procedures outlined in the UW-Madison Police Emergency Procedures Guide.
- Manages emergency shutdown of services if possible and necessary.
- Contacts SEO Duty Manager with a preliminary situation report.
Responsibility: Senior Systems and Network Control Center (SNCC) Staff Member on duty
1.1.2 Assesses personnel safety and resource needs:
- Initiates accounting for all personnel who were at any affected DoIT sites during any major disruption jeopardizing human safety and notifies Human Resources.
- Establishes contact with DoIT leadership and/or staff at other DoIT sites regarding activation of their building’s respective COOP.
- Establishes alternate communication channel if needed:
  - If the MS Teams DoIT Operations channel is disrupted, initiates chat communication via the OOB Google Chat Room at http://oob.wisc.edu
  - If NetID authentication is disrupted, logs into MS Teams at https://teams.microsoft.com with the non-NetID account used in OOB Chat drills.
- Follows SEO Vertical Escalation & Situation Manager procedure and engages responsible Situation Manager
- Informs UW Police if UWPD is not already involved
- Triggers the emergency SMS message to Core Services Directors
- SNCC includes “Emergency: meet now” in CS Director Teams Channel
- Note: Directors should program the SNCC number into their phone with Bypass Do Not Disturb mode so that it makes noise even if the phone is on silent.
- Engages SEO On-Call Technologist(s) per the Office365 calendar for DoIT SEO On-Call
Responsibility: SEO Duty Manager
1.1.3 Consults with SEO Director, if possible.
Responsibility: SEO Duty Manager and SEO Duty Technologist
1.1.4 Notifies the Building Manager that a major disruption has occurred.
Responsibility: SEO Duty Manager
1.2 Form Executive Management Team
1.2.1 Calls for an assembly of Executive Management Team (EMT) members at the DoIT Emergency Operations Center (EOC).
- Core Services directors and the Deputy CIO immediately convene a call, discuss the situation, and determine who is best suited to assume the DoIT COOP Incident Commander (CIC) role in the given situation.
- DoIT CIC will notify the CIO Exec team of the situation and let them know who is filling the DoIT CIC role.
- For situations involving physical locations, and when safety permits, DoIT CIC will travel to the impacted location.
- In the unlikely event that neither the Deputy CIO nor any CS Director is available, the DoIT CIC role will transfer to the ITSM Associate Director, who will need to communicate directly with the CIO Exec Mgmt team. The ITSM Associate Director will hand the DoIT CIC role back when a CS Director or the Deputy CIO becomes available.
- DoIT EOC location:
  - May be virtual, via Secure MS Teams or Google room EOC Channel per the Out-of-Band (OOB) chat procedures, or via another unanimously agreed medium.
  - Will be located at a specific DoIT location, if secure.
  - Alternate location: office suite at the recovery site.
Responsibility: SEO Director, any Core Director, or designate
1.2.2 Contacts the Building Manager at the recovery site to arrange appropriate access to the building and to report that action is being taken to respond to the major disruption.
Responsibility: SEO Director with SNCC Manager
1.3 Gather Information and Analyze Conditions
1.3.1 Activates damage assessment procedures
Responsibility: SNCC Manager or SEO Director
1.3.2 Leads team to assess impact to IT services
Responsibility: SNCC Manager or SEO Director
1.4 Establish COOP command
1.4.1 Notifies Deputy CIO of the possibility that the DoIT COOP Incident Commander (CIC) role may be needed.
Responsibility: SEO Director or designate
1.4.2 Establishes a physical or virtual command center
Responsibility: DoIT COOP Incident Commander (CIC)
1.4.3 Summons the Executive Management Team
Responsibility: DoIT COOP Incident Commander (CIC)
1.4.4 Ensures the engagement of the Communication Consultant.
Responsibility: Executive Management Team (EMT)
1.4.5 Activates Human Resources team to support staff and families.
Responsibility: DoIT COOP Incident Commander (CIC)
1.4.6 Notifies Insurance team lead and campus Risk Management.
Responsibility: DoIT CFO or Financial Services
1.5 Determine COOP activation level
1.5.1 May take recommendations to DoIT CIC on partial or full COOP activation
Responsibility: SNCC Manager or SEO Director
1.5.2 Activates COOP and determines level of COOP activation. See criteria for COOP activation level in Table 1 above. Levels of COOP activation include:
» Partial Activation: a scaled COOP activation that aligns with the scope of the continuity event and the impacts to DoIT essential IT services. Partial COOP activation should include activation of specific continuity personnel to complete their continuity-related responsibilities, though the full continuity team should receive notification of the partial activation. The DoIT Executive Management Team will determine if additional notifications to DoIT management or staff are necessary.
» Full Activation: includes notification and activation of all continuity personnel to complete responsibilities identified in operation of essential IT services, as well as notification of the DoIT Executive Management Team and communication with all non-continuity personnel of current actions and priorities.
1.5.2.1 Full COOP activation for Class 1. Class 1 includes:
» Facility damage or facility access
» Extensive or potential physical damage and/or danger, extensive or medium duration
» Examples: fire, flooding, explosion, terrorist threats, severe weather, train derailment, pandemic, hazardous materials event, building fire resulting in significant damage
1.5.2.2 Partial COOP activation for Class 2. Class 2 includes:
» Power outage, minimal physical damage or danger, widespread hardware or software attack from computer virus or hacker, potentially extensive loss of multiple IT services, medium duration
1.5.2.3 No COOP activation for Class 3 – use DoIT Problem Management process instead. Class 3 includes:
» Localized hardware or software attack from computer virus or hacker, potential loss of IT services, medium or short duration
Responsibility: DoIT COOP Incident Commander (CIC)
1.5.3 Decides the situationally appropriate medium for DoIT internal communications
Responsibility: DoIT COOP Incident Commander (CIC)
1.5.4 Coordinates with other occupants of DoIT buildings regarding activation of their respective COOP.
Responsibility: SNCC Manager or SEO Director
1.6 Form COOP teams
1.6.1 Activates the Incident Response teams appropriate for the disruption and ensures that teams are properly staffed, with consideration given to preparing “next shift” staff and to staff nutrition
Responsibility: Executive Management Team (EMT)
1.6.2 Based on IT service impact of the disruption and COOP activation level, DoIT COOP Incident Commander (CIC) forms:
» IT Service Recovery team
» Administrative team
1.6.3 The Incident Management team will initiate response actions that are appropriate to the severity of the incident.
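The class-based criteria in steps 1.5.2.1 through 1.5.2.3 amount to a simple lookup from disruption class to activation level. The following is a minimal sketch of that lookup, not part of the Plan itself; the names `ActivationLevel`, `ACTIVATION_BY_CLASS`, and `activation_level` are hypothetical and chosen only for illustration. The final determination remains with the DoIT COOP Incident Commander.

```python
from enum import Enum


class ActivationLevel(Enum):
    """Hypothetical labels for the three outcomes named in step 1.5.2."""
    NONE = "No COOP activation (use DoIT Problem Management process)"
    PARTIAL = "Partial COOP activation"
    FULL = "Full COOP activation"


# Mapping taken from steps 1.5.2.1 (Class 1), 1.5.2.2 (Class 2), 1.5.2.3 (Class 3).
ACTIVATION_BY_CLASS = {
    1: ActivationLevel.FULL,
    2: ActivationLevel.PARTIAL,
    3: ActivationLevel.NONE,
}


def activation_level(disruption_class: int) -> ActivationLevel:
    """Return the activation level the Plan associates with a disruption class."""
    try:
        return ACTIVATION_BY_CLASS[disruption_class]
    except KeyError:
        # The Plan defines only Classes 1 to 3; anything else is rejected.
        raise ValueError(f"Unknown disruption class: {disruption_class}") from None
```

For example, `activation_level(2)` returns `ActivationLevel.PARTIAL`, matching the partial activation described for Class 2 events such as a power outage of medium duration.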
Phase 2.0 - COOP operation
This section discusses the necessary actions, parameters and considerations following the COOP activation phase.
Duration
Continuity operations cover the period from 12 hours after the incident up to 30 days, while essential IT functions are being restored and conducted.
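As a worked illustration of the duration above, the sketch below computes the continuity window from an incident timestamp. It assumes the window opens 12 hours after the incident and closes 30 days after the incident; that reading of the end point is an interpretation, and the names `continuity_window`, `CONTINUITY_START_OFFSET`, and `CONTINUITY_END_OFFSET` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: both offsets are measured from the time of the incident.
CONTINUITY_START_OFFSET = timedelta(hours=12)
CONTINUITY_END_OFFSET = timedelta(days=30)


def continuity_window(incident_time: datetime) -> tuple[datetime, datetime]:
    """Return the (start, end) of the continuity operations period."""
    start = incident_time + CONTINUITY_START_OFFSET
    end = incident_time + CONTINUITY_END_OFFSET
    return start, end


# Example: an incident at 08:00 on 1 January 2024.
start, end = continuity_window(datetime(2024, 1, 1, 8, 0))
print(start.isoformat(), end.isoformat())
```

Under these assumptions, an incident at 08:00 on 1 January gives a window from 20:00 the same day through 08:00 on 31 January.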
Execution of essential IT functions
Once the Plan has been activated and all personnel have been notified of their roles and responsibilities, assigned staff will commence continuity operations to deliver essential functions. Essential IT services must be maintained with little to no interruption. IT services which have been considered essential in previous COOP activations include services listed in Table 4.
Name | Definition |
---|---|
Help Desk | Help Desk services should expect increased volumes during COOP activation to support online learning tools, VPN, etc. |
SNCC | SNCC should expect to be considered essential staff during COOP activation. SNCC provides communication between UWPD, technical staff, DoIT Management, and the general campus via the Outages page. |
Critical Infrastructure/Life Safety | Critical Infrastructure/Life Safety services are paramount during COOP activation. |
Canvas and other LMS tools | Canvas and other LMS tools are essential during COOP activation to support the overall mission of the University. |
Cybersecurity | Cybersecurity is mission critical to maintain the safety, integrity and availability of University data during a COOP activation. |
NetID Login authentication | NetID authentication enables protection of UW–Madison IT infrastructure and data. |
Email | Email is considered an essential IT service due to heightened communication needs during COOP activation. |
Email Lists | Email Lists are considered an essential IT service due to heightened communication needs during COOP activation. |
DoIT OnCall Rotations | Staff from all DoIT on-call rotations (in SE, ITSM, A&SE, Data Center, AIS, NS, US) could be required during a COOP activation. |
UW-Madison Police Department (UWPD) | UWPD notification should be considered essential for any DoIT COOP activation. |
For reference, a spreadsheet of DoIT Essential Employees was developed in March 2020.
Alternate individuals have been identified as support to assist with maintaining essential IT services if the primary responsible party is unable to fulfill their duties. This support information is stored in WiscIT.
Execution of other IT services
Each DoIT group will continue operations to the extent possible at its designated continuity facility. DoIT groups that can perform their work remotely should implement appropriate telework procedures.
Communications
The DoIT Communications Director and Communications team will continue to manage the activities outlined in Phase 1.0 - COOP activation.
COOP Operation Procedure Detail
COOP PHASE 2.0 - Operations | |
---|---|
2.1 Commence continuity operations by assigned staff to deliver essential functions | Responsibility |
2.1.1 Contact team members & log contact » Contact team members and maintain COOP Log of contacts made, in Appendix H. » Set response operation periods and objectives (typically 12-24 hours but potentially less depending on the scale of the event). Set objectives at the outset of each operational period. | Incident Response team leaders |
2.2 Prepare teams | Responsibility |
2.2.1 Prepare teams: » Contact team members and maintain log of contacts made. » Assemble teams at suitable locations. » Report situation. » Review team responsibilities and functions. » Prioritize and direct next actions. | Incident Response team leaders |
2.3 Communicate with key stakeholders | Responsibility |
2.3.1 Communicate with stakeholders | Incident Response team leaders |
2.4 Establish operations from continuity facility | Responsibility |
2.4.1 Each DoIT group will continue operations to the extent possible at its designated continuity facility. DoIT groups that can perform their work remotely should implement appropriate telework procedures. If continuity requires relocation to an identified continuity facility (see the DoIT Offsite Alternate Work Location Info) or other facility, the following systems and documents may need to be available to ensure continuity personnel can maintain communications and access essential records and information: » Access management » A local area network » Internal and external email and email archives » Both electronic and hard copy versions of essential records (stored off site). Supervisors and managers will provide relocation information or dismissal instructions to non-continuity personnel at the guidance of the Incident Response team. | DoIT groups |
2.4.2 Notify staff of alternate facility » Notifications regarding the relocation to an alternate facility will be sent to all continuity personnel by the Communications Consultant » Supervisors and managers will provide information to non-continuity personnel regarding relocation or possible long-term work shortages at the guidance of the Executive Management Team and Human Resources team | |
2.4.3 Identify staff rotations » Alternate individuals have been identified as support to assist with maintaining essential IT services if the primary responsible party is unable to fulfill their duties. This information can be found in WiscIT | |
2.5 Continue to monitor major disruption impact on essential IT services | Responsibility |
2.5.1 When disruption or threat has been eliminated, initiate RECONSTITUTION phase | DoIT CIC |
Phase 3.0 - COOP Reconstitution
Reconstitution is the process of terminating Plan operations and resuming all essential functions and other activities carried out by DoIT. Reconstitution operations may include:
- Deactivating the continuity or alternate facility.
- Returning equipment, records, and personnel to either the original or a replacement primary site.
- Returning to normal operations.
Planning for reconstitution should begin as soon as continuity operations are activated. Reconstitution will commence as soon as the emergency incident concludes. Note that in certain cases the facility may sustain serious physical impacts, and there may be a significant delay for repairs. In this event, DoIT may reopen in a phased manner consistent with the ability to establish essential functions. The following section serves as a guide to prepare for reconstitution operations.
COOP reconstitution procedure detail
COOP PHASE 3.0 - Reconstitution |
|
|
---|---|---|
3.1 Commence reconstitution phase |
Responsibility |
|
|
3.1.1 Coordinate teams to terminate COOP operations and begin reconstitution by sending notification to continuity personnel. |
Incident Executive Leadership team and Incident Mgmt team |
|
3.1.2 Appoint a reconstitution team |
Incident Management team |
3.2 Validate facility safety for staff return |
Responsibility |
|
|
3.2.1 Prior to re-entering any DoIT-occupied Building, the Executive Management Team will work collaboratively with the Facilities team and UW-Madison FPM to ensure that actions are completed appropriately, and that personnel safety is not at risk. |
Incident Response team leaders |
|
3.2.2 In the event the facility sustains severe physical impact, DoIT may reopen in a phased manner consistent with the ability to establish essential and prioritized IT services |
|
3.3 Develop reconstitution plan |
Responsibility |
|
|
3.3.1 Reconstitution team develops a detailed move plan to ensure the orderly return to the normal operating facility or move to another operating facility. |
Incident Response team leaders |
|
3.3.2 Assign a team to handle final preparations at the site. This team should develop a checklist of areas to be inspected and verified before the move (e.g., space configurations, proper functioning of equipment and PCs, heating/cooling, electricity, and telephone and computer connectivity). |
|
|
3.3.3 Conduct an assessment to determine if any validation tests are necessary. |
|
|
3.3.4 Notify Personnel of Reconstitution |
|
|
3.3.5 Follow procedures to ensure a timely and efficient transition of communications, direction and control and transfer of vital records and databases to primary facility, adhering to data handling and security protocols per UW System procedure 1031.B. for High Risk Data. |
|
|
3.3.6 Arrange to have any necessary supplies and equipment moved from alternate site |
|
3.4 Resume operations at permanent site. |
Responsibility |
|
|
3.4.1 Each DoIT group will continue operations. |
DoIT groups |
|
3.4.2 Develop a plan for returning the alternate space being used to its normal occupants.
|
DoIT groups |
3.5 After Action Report (AAR) |
Responsibility |
|
3.5.1 Assemble key participants to review COOP process and/or After Action Report (AAR) for process improvement |
DoIT COOP Coordinator |
|
Close COOP |
|
After all services are restored, DoIT COOP Coordinator completes and reviews the following checklists for post-problem analysis and Continuous Improvement review:
- Reconstitution process log
- Deactivation checklist
- After-Action worksheet
Phase 4.0 - COOP Readiness and Preparedness
PHASE 4.0 - Readiness and preparedness
DoIT will maintain a state of readiness through regular preparedness activities. These include:
- maintaining operational documentation such as the Plan,
- reviewing and updating the Plan on an annual basis and after major disruptions,
- socializing continuity procedures among personnel,
- training and exercising the Plan regularly with personnel via Tabletop Exercises (TTX),
- ensuring that IT service status and IT service component changes/inventories are maintained in the WiscIT Configuration Management Database (CMDB) as part of normal DoIT operations, and
- having Directors program the SNCC number into their phones with Bypass Do Not Disturb mode so that the phone makes noise even when it is on silent.
Testing, Training and Exercises
DoIT staff’s understanding of and familiarity with COOP content and procedures create more efficient preparedness, response, and execution for maintaining essential IT service operations. For that reason, the Plan is exercised annually either
- during an unplanned interruption to IT services, or
- via a planned TTX with a specific scenario which involves a major disruption to IT services.
In each TTX, the DoIT COOP is validated and evaluated for possible improvements in COOP activation, responsible party COOP preparation, and COOP operation. Improvements are reported in the After Action Report (AAR) and, once approved by DoIT Management, are tracked in WiscIT via the WiscIT Continuous Improvement Registry (CIR).
In conjunction with UWPD Emergency Management, the DoIT COOP Coordinator determines the annual TTX scheduling and strives for a time when DoIT leaders and staff can participate or delegate.
TTX scenarios are chosen and developed based on DoIT COOP, Appendix F - Threat, Risk, and Vulnerability Analysis.
Emergency management training
DoIT staff who have a designated role during COOP activation may consider formal training to familiarize themselves with key emergency disruption concepts and principles, such as the Incident Command System (ICS). FEMA offers several free online training courses.
III. COOP Roles and Responsibilities
Understanding COOP roles and responsibilities
The DoIT COOP Incident Commander (CIC) and continuity operations are guided, informed, and executed by teams. Details on roles and teams follow.
A. DoIT COOP Incident Commander (DoIT CIC) role
DoIT COOP Incident Commander (DoIT CIC) role
DoIT executives with formal Order of Succession responsibilities for the DoIT COOP Incident Commander (DoIT CIC) role are:
-
- Primary: Deputy Chief Information Officer
- Secondary: Core Services Director as determined during activation by the EMT
- Tertiary: ITSM Associate Director
In the unlikely event that neither the Deputy CIO nor any Core Services Director is available, the DoIT CIC role will transfer to the ITSM Associate Director, who will need to communicate directly with the CIO Exec Mgmt Team. The ITSM Associate Director will hand the DoIT CIC role back when a Core Services Director or the Deputy CIO becomes available.
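The order of succession above amounts to a first-available lookup. The following is a minimal Python sketch of that logic, not part of the Plan or of WiscIT: the function name `resolve_cic` and the `is_available` mapping are hypothetical, and availability would in practice be determined by the EMT during activation.

```python
from typing import Mapping, Optional, Sequence

# Order of succession for the DoIT CIC role, as listed above.
CIC_SUCCESSION: Sequence[str] = (
    "Deputy Chief Information Officer",  # Primary
    "Core Services Director",            # Secondary (chosen by the EMT)
    "ITSM Associate Director",           # Tertiary
)


def resolve_cic(is_available: Mapping[str, bool],
                succession: Sequence[str] = CIC_SUCCESSION) -> Optional[str]:
    """Return the first position in the succession that is available.

    `is_available` is a hypothetical mapping of position title -> bool.
    Positions missing from the mapping are treated as unavailable.
    Returns None if nobody in the line of succession can serve.
    """
    for position in succession:
        if is_available.get(position, False):
            return position
    return None
```

Calling `resolve_cic` again whenever availability changes models the hand-back: as soon as a higher position becomes available, it is returned in preference to the ITSM Associate Director.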
DoIT CIC specific responsibilities include:
» Internal policy level decisions
» Coordination of communications with UW-Madison campus officials and other executive authorities
» Coordination of public information and media contacts
» Communication with Human Resources director to ensure employees are notified and provided with any necessary resources/assistance
-
-
- DoIT facilities closures or relocation to alternate site(s)
- Actions that may result in the loss of intellectual or proprietary capital
- Fiscal authorizations
-
Delegation of Authority and Order of Succession for DoIT COOP Incident Commander
Handing off DoIT CIC
If the DoIT CIC role needs to be transferred to another person (e.g., because of fatigue or a conflict):
- The new DoIT CIC is determined by the EMT, using the same process as above.
- The former DoIT CIC informs the CIO Exec Mgmt Team and the new DoIT CIC of the status of the situation and the next steps, and makes explicit the hand-off of the DoIT CIC role and the expected duration of the transfer.
The emergency Delegations of Authority and formal Order of Succession are effective during disruptive emergency conditions when the DoIT Continuity of Operations Plan is actioned, and other emergency situations which disrupt normal IT service operations. Delegations of Authority and Order of Succession take effect when normal channels of direction are disrupted and terminate when these channels have resumed. To the extent circumstances permit, officials must document the beginning and end dates of their authority under this activation.
Reference DoIT COOP Incident Commander Responsibilities - Detail
B. DoIT Executive Management Team (DoIT EMT)
The Executive Management Team (EMT)
The Executive Management Team (EMT) and the DoIT COOP Incident Commander work closely to assess the situation, activate the plan, and implement operational continuity.
Executive Management Team (EMT)
EMT members: The following DoIT executives comprise the Executive Management Team:
» Communications Director
» Human Resources Director
» Chief Financial Officer (CFO)
» Chief Information Security Officer (CISO)
» IT Service Recovery Team Director
» Chief Technical Officer (CTO) as Campus Liaison
» Administrative Team Director
» Core Services Directors
EMT responsibilities: A principal responsibility of the EMT is to keep managers focused on the right set of priorities in emergency disruption conditions. EMT responsibilities include:
» Manage initial response
» Gather information and analyze conditions related to DoIT and throughout the University
» Allocate and direct distribution of resources to accomplish the purposes of DoIT's COOP Plan
» Request needed resources from available outside sources if internal resources are not available
» Direct incident response teams
» Prioritize and resolve issues
» Assure communications with stakeholders
» Approve final plan and final policy decisions
Incident Response Functions of the EMT include:
» Coordinate response with campus emergency management procedures
» Initiate measures for the safety of lives and property
» Establish a physical or virtual emergency operations center
» Initiate damage assessment
» Determine extent that the DoIT Plan will be used and declare a full or partial COOP activation if warranted
» Address issues escalated through Incident Response team leads
» Report status to University administration
Reference: Executive Management Team details
C. Incident Response team
Incident Response team
Incident Response teams are responsible for the execution of the COOP Plan during an emergency disruption. Incident Response team members are expected to gather the required resources needed to implement IT service continuation and recovery efforts. Incident Response teams are activated to respond to any emergency situation at a level based on the type and nature of the disruption.
The Incident Response team comprises three teams bridging different functional areas:
1. Executive Management Team (EMT)
2. IT Service Recovery team
IT Service Recovery Team Director role is filled by
- Primary: SEO Director
- Secondary: DoIT Core Services Director as determined by situation
IT Service Recovery team members include
- IT Service Recovery Team Director
- Application Services team
- Facility team
- Hardware team
- Help Desk team
- Infrastructure team
- Network team
- Operations team
- Cybersecurity team
- Physical Security team
IT Service Recovery team resources
DoIT maintains several lists, extracts from WiscIT, and other resources for IT Service Recovery efforts offsite on BOX:
- Lists of Data Center Devices by Service (COOP) Tiers
- WiscIT Disaster Recovery Data Export information
- Agreements and Vendors data export list
- DoIT Emergency Contact List
- Data Center Supernode Cis - All
- Network Services COOP Plan
3. Administrative team
- Administrative Team Director
- Finance team
- Insurance team
- Logistics team
- Procurement team
D. Communications team
Communications team
-
Communications team director is
- Primary: DoIT Communications Director
- Secondary: Asst Communications Director
-
Responsibilities include
- Provide primary communications link between DoIT and customers
- Liaise with UW Communications
- Provide media relations
- Prepare messages to be disseminated by Help Desk team
-
Additional Information: Communications checklist
E. Human Resources team
Human Resources team
- Human Resources team director is
- Primary: DoIT HR Director
- Secondary: UW HR Director
-
Responsibilities include
- Support employees and families
- Coordinate with University Legal Services and University Benefits Services
- Expedite hiring of temporary staff as needed
-
Additional Information: Human Resources checklist
F. Administrative team
Administrative team
The Administrative team directs the administrative, logistical, and financial aspects of IT service recovery. It is directed by the Administrative Team Director and comprises four teams: Finance, Insurance, Logistics, and Procurement.
- The Administrative Team Director role is
- Primary: DoIT Chief Financial Officer
- Secondary: DoIT Accounting Manager
-
Responsibilities include
- Support employees and families
- Coordinate with University Legal Services and University Benefits Services
- Expedite hiring of temporary staff as needed
-
Additional Information: Administrative Team Director checklist
- Administrative team members
-
- Administrative Team Director
- Finance team
- Insurance team
- Logistics team
- Procurement team
-
G. Assigned and Unassigned Personnel Responsibilities
COOP-assigned personnel's responsibilities
Any staff with assigned COOP responsibilities who are on site during a COOP activation event should first be concerned for their own safety. If they are on site and safe, COOP-assigned staff are to perform responsibilities as delegated by primary managers once COOP has been activated. Following COOP activation, assigned staff members should report to the emergency operations location as determined by the Executive Management Team.
COOP-assigned staff will assist as requested in activities such as the handling of all essential services, the notification of all staff members regarding the situation, and the contact of any unassigned staff who are requested to provide assistance. They will be directly responsible for utilizing the COOP roles and responsibilities documentation, including DoIT personnel and unit mappings to the COOP structure, as well as the supporting DoIT organizational chart and telephone list. They are also directly responsible for providing assistance as directed by higher-ranking staff members or alternates as designated by the Plan.
Unassigned personnel's responsibilities
Unassigned personnel should be prepared to support the assigned staff, if required. During non-duty hours they should remain at home and check for information or instructions every morning by contacting the DoIT Human Resources team or through other designated communication channels. If they are called in to work, staff should report to their designated location and perform any assigned duties that are appropriate for their skills and training.
IV. Appendices
These Appendices contain COOP information requiring more frequent review and updates.
A. Orders of succession
Orders of Succession
Orders of succession are formal, sequential listings of positions (rather than specific names of individuals) that identify who is authorized to assume a particular leadership or management role when the incumbent dies, resigns, is unavailable, debilitated, or is otherwise unable to perform the functions and duties of his or her position. Orders of succession provide for the orderly and predefined assumption of offices during an emergency situation requiring COOP activation. They allow for the continued operation of DoIT and its essential services and enable a rapid response.
In the absence or unavailability of the Deputy CIO, the following order of succession will determine fulfillment of the DoIT COOP Incident Commander (CIC) role:
- Secondary: a DoIT Core Services director as determined by EMT
- Tertiary: DoIT ITSM Associate Director
The DoIT COOP Incident Commander (CIC) or alternative will contact the other positions in the order listed above until he/she reaches a person that is available to serve as the Deputy CIO.
The DoIT CIC has the authority to re-delegate the associated functions and activities of the role to the next in the line of succession if the successor is better equipped to serve as the DoIT CIC based on the major disruption’s nature. If the Deputy CIO is unable to serve as the DoIT CIC, the successor has the full authority that the Deputy CIO would have, which includes carrying out the functions of DoIT and the ability to allocate the entire Division's fiscal, personnel, and equipment resources.
The Deputy CIO reserves the right to place limitations on the successor's authority. They are as follows:
- The Deputy CIO places no limitations.
All previously delegated authorities terminate once the DoIT CIC appoints another successor, the Deputy CIO is able to return to his/her position, or the CIO and Vice Provost for Information Technology assigns another successor. Table 7 outlines other DoIT orders of succession.
Area | Position Title | Name | Successor(s) |
---|---|---|---|
Administration | Chief Information Officer | Lois Brooks | David Pagenkopf |
Administration | Chief Financial Officer | Sara Hart McGuinnis | Colleen Reilly |
Administration | Deputy Chief Executive Officer | David Pagenkopf | |
Advancement | Communications Director | Mary Evansen | Kyle Henderson |
Human Resources | Human Resources Director | Adam Fermanich | Holly Weber |
B. Delegation of authority
Delegation of authority
Delegations of authority ensure the orderly and predetermined transition of responsibilities. They are related to but distinct from orders of succession. A written delegation of authority provides recipients with legal authorization to act on behalf of organizational officials and to execute specific duties within the organization. Delegations of authority are triggered when the holder of the position with authority is not readily accessible because of travel, communications outages, or sickness, or is otherwise unable to fulfill their responsibilities. In some cases, limitations such as financial restrictions may be applied.
This document provides the legal authority for officials to make key policy decisions during a COOP activation.
In the event of a major emergency disruption of IT services, the primary and alternate Executive Management Team members and the primary and alternate Incident Response team members listed are delegated the authority necessary to carry out their essential services. This delegation of authority ensures:
- Continued operations of the Division and its critical services
- Rapid response to any emergency situation requiring COOP implementation
These predetermined delegations of authority will take effect when normal channels of direction are disrupted, and will terminate when normal channels are resumed.
Delegation of authority - Deputy CIO
The successor to the Deputy CIO (as determined by the Orders of Succession listed above) has the full authority that the Deputy CIO would have, which includes carrying out the functions of DoIT and the ability to allocate the entire Division's fiscal, personnel and equipment resources.
The Deputy CIO reserves the right to place limitations on the successor relating to Division expenditures.
In the event that the Deputy CIO or other key personnel are unavailable to serve as the Deputy CIO, the order of succession specified in the section above will be adhered to until a higher successor becomes available. At this point, all of the authorities previously delegated will be terminated.
If the successor is expected to become unavailable or someone else in the line of succession is better equipped to serve as the Deputy CIO based on the nature of the major problem, the successor has the authority to re-delegate the functions and activities associated with being the Deputy CIO to that person. Table 8 summarizes key authorities within DoIT
Authority |
Position Holding Authority |
Acting Agent |
Limitations |
---|---|---|---|
|
Human Resources Director |
Adam Fermanich |
Not specified |
|
Chief Financial Officer |
Sarah Hart McGuinnis |
Purchasing department rules, including use, amount limitation, etc. |
|
Chief Financial Officer |
Sarah Hart McGuinnis |
Standard operational contracting procedures |
|
UW-Madison Office of Legal Affairs, Vice Chancellor for Legal Affairs |
Nancy K. Lynch |
Not specified |
|
DoIT Purchasing & Financial Services |
Kirsten Mastalir |
Established travel restrictions and costs |
|
DoIT Directors |
Standard contractual limitations |
|
|
Director of Communications |
Mary Evansen |
Not specified |
C. Orders of Notification
Orders of notification
Initial notification procedures are critical to personnel safety and orderly response to a major disruption.
The Senior Systems and Network Control Center (SNCC) staff on duty, in consultation with the Systems Engineering & Operations (SEO) Duty Manager and the SEO Director, are responsible for following data center notification procedures as described above in Section 1.1.1 of this document.
Escalation for COOP activation
- SNCC staff notify
  - appropriate technologists of disruption
  - Situation Manager and leads of affected IT services
  - SEO Duty Manager of disruption
- SEO Duty Manager
  - Engages UWPD if not already informed
  - Engages relevant SEO Oncall staff
  - Engages the SEO Director
  - Notifies the Building Manager, if appropriate
- SEO Director
  - Notifies Executive Management Team (EMT) of possible need for COOP activation
- Executive Management Team (EMT)
  - Determines the COOP activation level: whether DoIT COOP activation is warranted and, if so, the extent of the COOP to be used, by declaring a full or partial COOP activation
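The escalation sequence above can be captured as ordered data so that a checklist or notification script walks it in the same order every time. The sketch below is illustrative only: `EscalationStep`, `ESCALATION_CHAIN`, and `checklist` are hypothetical names, and the action strings paraphrase the list above rather than quoting an official procedure.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class EscalationStep:
    """One role in the escalation chain and the actions it performs."""
    role: str
    actions: Tuple[str, ...]


# Hypothetical encoding of the escalation list for COOP activation.
ESCALATION_CHAIN: Tuple[EscalationStep, ...] = (
    EscalationStep("SNCC staff", (
        "Notify appropriate technologists of disruption",
        "Notify Situation Manager and leads of affected IT services",
        "Notify SEO Duty Manager of disruption",
    )),
    EscalationStep("SEO Duty Manager", (
        "Engage UWPD if not already informed",
        "Engage relevant SEO Oncall staff",
        "Engage the SEO Director",
        "Notify the Building Manager, if appropriate",
    )),
    EscalationStep("SEO Director", (
        "Notify EMT of possible need for COOP activation",
    )),
    EscalationStep("Executive Management Team (EMT)", (
        "Determine COOP activation level (none, partial, or full)",
    )),
)


def checklist(chain: Sequence[EscalationStep] = ESCALATION_CHAIN) -> List[str]:
    """Flatten the chain into an ordered, numbered checklist."""
    lines = []
    for step_no, step in enumerate(chain, start=1):
        for action_no, action in enumerate(step.actions, start=1):
            lines.append(f"{step_no}.{action_no} {step.role}: {action}")
    return lines
```

Keeping the chain as data rather than as branching code makes it straightforward to review against the Plan when the Plan is updated.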
If an immediate danger exists within a DoIT facility, a member of the Incident Management team will immediately notify the Building Manager that an incident has occurred and action is being taken to respond to the incident. Building occupants should be directed according to hazard specific procedures outlined in the UW-Madison Police Emergency Procedures Guide. Note that certain incidents (e.g. a fire or active shooter event) may require immediate response/direction.
Staff notification
During COOP activation, DoIT personnel who can assist in resolving a major disruption should be notified via normal Escalation procedures, as per the DoIT Operational Framework Section 4 – Incident Management, section 4.4: Guidance for Escalating Incidents to Management. See extracts below for escalation guidelines.
SNCC operators will post notifications to the Outage pages, if available and applicable to the disruption, and involve appropriate technologists, the Situation Manager and SEO Duty Manager.
UW executive notifications
During COOP activation, in conjunction with DoIT Communications, the DoIT CIC approves all communication with UW-Madison campus officials and other executive authorities
Public communication
DoIT Communications coordinates with UW-Madison Communications for all general public communication via websites, email, and social media about a major disruption.
Extracts from DoIT operational framework, section 4.
Elapsed time |
Priority One incident |
---|---|
10 minutes |
|
2 hours |
|
… |
…(see KB 11040 for details)… |
4.5.1. Priority One case escalation
-
- If a network component is not involved in the service outage, SNCC staff should contact the appropriate service support technologist by using contact information defined in the Configuration Management Database (CMDB)
- If a network component is involved in the service outage and SNCC staff cannot contact Op Engineering, contact appropriate network engineer and the Op Engineering manager
- Priority 1 case escalation always requires direct interactive contact (voice, chat, email with reply) with the area to which the case is being escalated. Do not assume that others can work on a case until you confirm it interactively
D. Service tiers (COOP tiers)
Service tiers (COOP tiers)
The DoIT Service Tier system, often called the DoIT COOP Tier system, is a method to categorize services managed by or hosted in DoIT data centers which informs service recovery priorities. DoIT leadership and governance groups are responsible for identifying and evaluating DoIT service recovery priorities.
Individual DoIT service providers are responsible for identifying needed capability improvements in consultation with their customers, and for requesting funding to support those improvements. DoIT leadership and governance groups are responsible for evaluating proposed improvements and for establishing appropriate prioritization, planning, and funding mechanisms to implement approved improvements.
Service tier definitions
The following service tier definitions are based on best practices established by the National Infrastructure Protection Plan (NIPP) and other guidelines, and are supported by the State of Wisconsin, University of Wisconsin System, and University of Wisconsin-Madison.
Tier | Impact | Definition |
---|---|---|
0 | » Foundational » Critical Infrastructure | » Infrastructure required for the operation of essential and other infrastructure services |
1 | » Health » Safety » Law and Order | » Services whose loss endangers health, safety, or orderly response to campus operations. » Includes essential, customer-facing services (1A) whose loss for >8 business hours represents a significant adverse impact. |
2 | » Enterprise Operations » Severe Impact | » University-wide services whose loss severely impacts campus operations. » Departmental services whose loss prevents a specific department from operating. |
3 | » Enterprise Operations » Departmental Operations » Moderate Impact | » University-wide services whose loss affects campus operations. » Departmental services whose loss severely affects a specific department's operations. |
4 | » Departmental Operations » Minimal Impact | » University-wide services whose loss has minimal impact on campus operations. » Departmental services whose loss affects a specific department's operations |
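Because the tier number informs recovery priority, a recovery queue can be sketched as a sort on tier. This is a minimal illustration that assumes lower tier numbers are recovered first, which the tier definitions suggest but the Plan does not state as a strict rule. The function name `recovery_order` is hypothetical; real tier assignments live in the WiscIT CMDB.

```python
from typing import Iterable, List, Tuple


def recovery_order(services: Iterable[Tuple[str, int]]) -> List[str]:
    """Order services for recovery, lowest tier number first.

    Each service is a (name, tier) pair with tier in 0..4. Python's sort
    is stable, so services in the same tier keep their input order.
    """
    services = list(services)
    for name, tier in services:
        if tier not in range(5):
            raise ValueError(f"{name}: tier must be 0-4, got {tier}")
    return [name for name, _ in sorted(services, key=lambda s: s[1])]
```

For example, given hypothetical services `[("wiki", 4), ("network", 0), ("alerts", 1)]`, the function returns the Tier 0 service first and the Tier 4 service last.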
E. Campus emergency contacts
Campus emergency contacts
- Report all injuries and damage to the University Police (UWPD) by calling 911.
-
Be prepared to give the following:
- Your name
- Building Name
- Type of injury or damage
- The location of injured person(s) or building damage
- Room number you are calling from
-
- When reporting an emergency
- Stay on the line with the dispatcher
- Provide the address, location and a description of the emergency
- Provide the phone number at your location
- Provide a thorough description of the incident to assure appropriate resources are dispatched
- Building issues - first call UWPD
- For building related problems, you must first call the UW Police Department at 264-COPS (2677) and they will contact the University of Wisconsin Physical Plant.
- DoIT staff contact
-
The Emergency Contact Information for DoIT personnel is kept current, both in the IT Service Management tool WiscIT and in the cloud on BOX in the file 8-1-2022 DoIT - Crisis Response Contact Information (3).xlsx .
-
- Campus and community contact telephones - Table 10
-
Table 10: Emergency Campus and Community Telephone List
Entity/Organization | Telephone Number |
---|---|
Fire/Police/Ambulance | 9-1-1 |
University of Wisconsin Police Department Non-Emergency | 608-264-COPS (2677) |
Dane County Public Health Department | 608-266-4225 / 608-255-2345 |
University of Wisconsin Physical Plant Customer Service – Tradesmen | 608-263-3333 |
University of Wisconsin Safety Department | 608-265-5000 |
Madison Gas & Electric | 608-251-8300 |
University of Wisconsin Environment, Health and Safety Department for follow-up after biological, chemical, or radioactive hazards | 608-265-5600 |
University of Wisconsin Health Services (UHS) | 608-265-5600 |
Poison Control | 800-222-1222 |
National Weather Service Milwaukee/Sullivan Office | 262-965-2074 |
Dane County 24-hour Mental Health Crisis Line | 608-280-2600 |
Physical Plant Customer Service | 608-263-3333 |
DoIT Building Facility Manager | 608-262-6149 / c 608-517-6732 |
-
F. Threat, risk, and vulnerability analysis
Threat, risk, and vulnerability analysis
This section has 3 parts.
Parts 2 and 3 are provided by Ed Lawson of UWPD in UW-Madison Hazard/Consequence Analysis, Aug 3, 2021.
- 1. Disruption and response
-
Many IT service disruptions are managed during the ordinary course of business by the DoIT Problem Management framework and accompanying procedures. Typical outages use DoIT's Help Desk procedures for creating incident and problem tickets/reports, generating notification procedures, and escalating severity levels as necessary.
IT service problems that potentially jeopardize staff safety, physical access to or the structural integrity of the data center facility, or the physical integrity of equipment, as well as major power outages of extended duration, may warrant activating the Plan.
DoIT management may escalate or de-escalate the response to any disruption depending on evolving circumstances.
Table 11: Example Emergency Disruptions and COOP Activation Levels
Problem Type | Examples | Severity (danger/damage/duration) | Response |
---|---|---|---|
Facility damage | Fire, Flooding, Explosion | Major Disruption, resulting in: » Extensive physical damage and/or danger » Extensive duration | » Full DoIT Continuity of Operations plan activation » Emergency operations center activation |
Facility access | Terrorist threat, Severe weather, Train derailment, Pandemic | Major Disruption, resulting in: » Potentially extensive damage and/or danger » Medium duration | » Campus Emergency Management activation » Full DoIT Continuity of Operations plan alert » Emergency communications activation |
Power outage | Utility outage | Moderate Disruption, resulting in: » Minimal physical damage or danger » Medium duration | » DoIT Continuity of Operations plan standby alert » Possible Full or Partial Activation of DoIT Continuity of Operations plan » Emergency communications activation |
Hardware or software attack | Computer virus, Computer hacker | Minor Disruption, resulting in: » Potentially extensive loss of services » Medium duration | » DoIT Problem Management procedures » Possible Partial Activation of DoIT Continuity of Operations plan |
Hardware or software error | Human error, Software malfunction, Hardware malfunction | Minor Disruption, resulting in: » Potentially extensive loss of services » Medium duration | » DoIT Problem Management procedures |
-
- 2. Hazard chart with risk scores
-
Below is a list of hazards of concern to UW-Madison campus departments known to cause emergencies and disasters on university campuses. The hazards have basic risk scores. These hazards are compiled based on:
- City of Madison Hazard Assessment
- UW-Madison Emergency Management campus concerns development and feedback for campus departments
- Definitions and information from Ready.gov website and other government sources
Other hazards, not listed below, may also cause emergency situations on campus. The EOP may be activated with or without an Emergency Operations Center (EOC) for any of the following situations.
The risk score is a function of likelihood and consequence. Likelihood is the chance that the hazard might occur; consequence is the impact of an occurrence. Because the risk of any hazard depends on both, Risk Score = Likelihood x Consequence.
-
Hazard |
Likelihood |
Consequence |
Risk Score |
---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Chart terms
Likelihood – The chance that the hazard might occur. Likelihood options for this chart are:
- Very high – Can be expected to occur, 75% or greater chance of occurring.
- High – Will probably occur, 50%–75% chance of occurring.
- Moderate – May happen, 25%–50% chance of occurring.
- Rare – May only occur in exceptional circumstances.
Reference: https://ehealthresearch.no/files/documents/Appendix-Definitions.pdf
-
-
Consequence – The impact of an occurrence. Consequence options for this chart are:
- Extreme – Death and life-threatening injuries, outside resources required, extreme operational disruption and widespread property damage.
- Major – Life threatening injuries, institutional resources required, major operational disruptions and severe property damage.
- Moderate – Moderate to life impacting injuries, additional resources required, significant delays in work performance and substantial property damage.
- Minor – Minor injuries, moderate impact on resources, modest delays and moderate property damage.
- Insignificant – No injuries, no impact on resources, no delays in work performance and minor property damage.
Reference: https://institute.acs.org/lab-safety/hazard-assessment/fundamentals/risk-assessment.html
Table 13: Risk score – likelihood x consequence
LIKELIHOOD \ CONSEQUENCE | Insignificant (1) | Minor (2) | Moderate (3) | Major (4) | Extreme (5) |
---|---|---|---|---|---|
Rare (1) | Low | Low | Low | Low | Low |
Moderate (2) | Low | Low | Low | Medium | Medium |
High (3) | Low | Low | Medium | Medium | Medium |
Very High (4) | Low | Medium | Medium | High | High |
Reference: www.healthcaregovernance.org.au/docs/risk-matrix.doc
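As a worked illustration of Risk Score = Likelihood x Consequence, the sketch below multiplies the numeric weights shown in Table 13 and maps the product to a band. The band cut-offs (6 or less is Low, 7 to 15 is Medium, 16 or more is High) are an assumption inferred from the cells of Table 13; the source does not state them explicitly. The function names are hypothetical.

```python
# Numeric weights as given in the Table 13 row and column headings.
LIKELIHOOD = {"Rare": 1, "Moderate": 2, "High": 3, "Very High": 4}
CONSEQUENCE = {"Insignificant": 1, "Minor": 2, "Moderate": 3,
               "Major": 4, "Extreme": 5}


def risk_score(likelihood: str, consequence: str) -> int:
    """Risk Score = Likelihood x Consequence."""
    return LIKELIHOOD[likelihood] * CONSEQUENCE[consequence]


def risk_band(score: int) -> str:
    """Map a numeric score to Low/Medium/High.

    Cut-offs are inferred from Table 13 (an assumption, not stated in
    the source): <= 6 is Low, 7-15 is Medium, >= 16 is High.
    """
    if score <= 6:
        return "Low"
    if score <= 15:
        return "Medium"
    return "High"
```

For instance, a hazard rated High likelihood (3) with Major consequence (4) scores 12, which falls in the Medium band, matching the corresponding cell of Table 13.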
-
- 3. UWPD definitions
-
Active threat/attacks in public places - Active shooter: Individuals using firearms to cause mass casualties. Individuals using a vehicle to cause mass casualties. Individuals using homemade bombs to cause mass casualties. Other methods of mass attacks may include knives, fires, drones or other weapons, civil disturbances, bomb threats, building takeovers, hostage situations, workplace violence, acts of terrorism, physical violence against self or others.
Civil disorder/mass protest - Civil disorder, also known as civil disturbance or civil unrest, is an activity arising from a mass act of civil disobedience (such as a demonstration, riot, or strike) in which the participants become hostile toward authority, and authorities incur difficulties in maintaining public safety and order, over the disorderly crowd.
Explosions - Explosive devices can be carried by cars and people and are easily detonated from remote locations or by suicide bombers. Explosions can also occur by accident, gas line leaks, boiler explosions, unmonitored research projects, etc.
Extreme cold - Exposures to extreme cold can cause frostbite and/or hypothermia and can become life threatening. When the effects of extreme cold are combined with wind, a person’s body can lose heat quickly. The wind chill temperature index is based on the rate of heat loss from exposed skin by combined effects of wind and cold. As the wind increases, heat is carried away from the body at an accelerated rate, driving down the body temperature. When the wind chill is lower than -20°, frostbite can occur in approximately 30 minutes. As the wind chill decreases, frostbite can set in more quickly. About 50% of extreme cold injuries happen to people over 60 years old. More than 75% of injuries happen to males.
Extreme heat - Extreme heat is a period of high heat and humidity with heat indices above 90°. Extreme heat causes your body to work extra hard to maintain a normal temperature, which can first lead to heat exhaustion, then to heat stroke, and eventually can lead to death without intervention. In fact, extreme heat is responsible for the highest number of annual deaths among all weather-related hazards. Heat indices over 105° are dangerous and over 125° are extremely dangerous.
Facility failure - Elevator malfunction, utility service interruption (gas, water, electricity, sewer, heat). Building collapse, bleacher collapse, window breakage, damage to buildings.
Fire - Fire is fast and can become a major fire in less than 30 seconds. It only takes minutes for thick black smoke to fill a house or for it to be engulfed in flames. The heat is more threatening than flames. Room temperatures in a fire can be 100 degrees at floor level and rise to 600 degrees at eye level. Inhaling this super-hot air will scorch your lungs and melt clothes to your skin. Fire starts bright, but quickly produces black smoke and complete darkness. Smoke and toxic gases kill more people than flames do. Fire produces poisonous gases that make you disoriented and drowsy.
Flood/rain event - Flooding is a temporary overflow of water onto normally dry land. Floods are the most common natural disaster in the United States. Failing to evacuate flooded areas or entering flood waters can lead to injury or death. Six inches of flood water will reach the bottoms of most cars, causing the loss of control of the vehicle or possibly stalling. A foot of water will float a car. Two feet of water can carry away most vehicles. Flooding may result from rain, snow, coastal storms, storm surges and overflows of dams and other water systems. Floods can develop slowly or quickly. Flash floods can come with no warning, cause outages, disrupt transportation, damage buildings and create landslides.
Hazardous material incidents - Hazardous materials can include explosives, flammable and combustible substances, poisons and radioactive materials. Emergencies can happen during research, production, storage, transportation, use or disposal. You are at risk when chemicals are used unsafely or released in harmful amounts where you live, work or play.
Health emergency/epidemic/pandemic - Some health emergencies are limited to individual countries; these are identified as epidemics. A pandemic is a disease outbreak that spans several countries and affects a large number of people. Pandemics are most often caused by viruses that can easily spread from person to person. It is hard to predict when or where the next new pandemic will emerge.
IT outage – Damage or denial of service to communication, radio, television, computer or other University or affiliated technologies; cyber invasion.
Mass casualty - Refers to any large number of casualties produced in a relatively short period of time, usually as the result of a single incident such as an aircraft accident, hurricane, flood, earthquake, or a violent attack that exceeds local logistic support capabilities.
Power outages - A power outage is when the electrical power goes out unexpectedly. A power outage may disrupt communications, water and transportation; close retail businesses, grocery stores, gas stations, ATMs, banks and other services; cause food spoilage and water contamination; and prevent use of medical devices. Extended power outages may impact the whole community and the economy.
Severe winter snow/ice storm – Severe winter storms produce excessive amounts of winter precipitation (i.e., snow, freezing rain, sleet) that cause dangerous impacts. Winter storms create a higher risk of car accidents, hypothermia, frostbite, carbon monoxide poisoning, and heart attacks from overexertion. Winter storms including blizzards can bring extreme cold, freezing rain, snow, ice and high winds.
Thunderstorms/tornadoes/lightning - Severe thunderstorms are dangerous storms that include lightning and can produce extreme winds, tornadoes, and flash flooding. A severe thunderstorm occurs when wind speeds from the thunderstorm exceed 58 mph. In Wisconsin, this is the most common type of severe weather from a thunderstorm. Thunderstorms can also produce tornadoes. Tornadoes are violently rotating columns of air that extend from a thunderstorm to the ground. Tornadoes can destroy buildings, flip cars, and create deadly flying debris. A tornado can happen anytime and anywhere and can bring intense winds (over 200 miles per hour). Severe lightning can also accompany thunderstorms. Lightning is a leading cause of injury and death from weather-related hazards. Although most lightning victims survive, people struck by lightning often report a variety of long-term, debilitating symptoms.
G. DoIT locations
DoIT locations include office spaces and data center spaces.
Office spaces
» Computer Sciences building, 1210 West Dayton Street
» Rust-Schreiner building, 111 N. Orchard St. and 115 N. Orchard St.
» Middleton Building, 1305 Linden St
» 634 W Main St for ESB staff
» 660 West Washington Ave
» 700 Regent St leased office space
» 2109 S. Stoughton Road, Digital Printing and Publishing Services (DPPS) facility
» WARF offices, formerly AIMS
Data Center and Equipment spaces
» One Neck IT Solutions, 5515 Nobel Dr, Fitchburg, WI 53711 leased space
» Medical Foundation Centennial Building (MFCB), 1685 Highland Dr
» Wisconsin Alumni Research Foundation building, 610 Walnut St (end of use expected after Feb 2024)
» DOA-DET leased space, 5830 Femrite Dr, Madison 53718
» Materials Distribution Services (MDS), 1061 Thousand Oaks Trail, Verona
» Network nodes at 14 locations and Network supernodes at three locations
- 1210 West Dayton St, Computer Science & Statistics building
- 432 North Murray St - East Campus Mall
- 1675 Observatory Dr, Animal Sciences building
Building Maps
Maps and floor plans are located in each building’s Occupant Emergency Plan (OEP), which is stored in two locations:
» with the Building Manager, and
» with UW Police Department Emergency Management, in Smartsheet
H. Critical websites
This list of critical websites is also located in the Box folder entitled “wiscit”.
Critical and important websites - COVID
Last updated 3/18/2020 CONFIDENTIAL
NOTE: [per the DoIT Web Platform Services team] we strongly recommend all critical information be shared on the sites marked as "Critical Infrastructure - priority 1." These sites are architected to be resilient to high volumes of traffic. Other sites will receive second or third priority for staff focus and may have more risk of outages.
Name | Category | Category defined by | Who manages (site admin / owner) | Where hosted | Action Items | Recommendations/notes |
---|---|---|---|---|---|---|
alerts.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | |
Admin for alerts.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | |
Canvas | Critical Infrastructure - priority 1 | UMark/UComm | Academic Technology | | | |
covid19.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | |
instructionalcontinuity.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | Academic Technology | WiscWeb | WPS moving to HA | This is not on a high-availability server. Initial site launched on WiscWeb to be available for Chancellor announcement. Full WPS staff is available to mitigate any potential issues as needed. WPS, AT, and UMark now establishing plans to move to a high-availability server. |
kb.wisc.edu | Critical Infrastructure - priority 1 | DoIT Comm | Web Platforms/Services | KB high-availability servers | No Action Needed | |
uhs.wisc.edu/coronavirus-2019 | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | REDIRECT in place to covid19.wisc.edu | No Action Needed | UHS site content has been added to main COVID site and redirect in place. |
wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | |
doitnet.doit.wisc.edu | Non-critical; high visibility - priority 2 | | DoIT Comm | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations |
housing.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Housing | WiscWeb | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations |
it.wisc.edu | Non-critical; high visibility - priority 2 | | DoIT Comm | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations |
my.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | Web Platforms/Services | MyUW high-availability servers | No Action Needed | |
outages.doit.wisc.edu | Non-critical; high visibility - priority 2 | DoIT Comm | DoIT Service Management | Dedicated VM (but not High Availability) | Check with owner | This is on a dedicated VM, but is not on a high-availability server. Consult with application admins about service expectations |
parent.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Marketing / Campus & Visitor Relations | UMark single server: HENRY | No Action Needed | May need to contact UMark to find out if they were planning on making this HA |
registrar.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | Registrar's Office | WiscWeb | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations |
uhs.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Marketing / UHS | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations |
uwpd.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Marketing / UWPD | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations |
admissions.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - spring admissions | | WiscWeb | Monitor - check with owner if concerns emerge | |
advising.wisc.edu | Non-critical; high visibility - priority 3 | Sent in TechNews | | Web Hosting | Monitor - check with owner if concerns emerge | |
apps.admissions.wisc.edu/visitbuck | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | Web Hosting | Monitor - check with owner if concerns emerge | |
asm.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | | Monitor - check with owner if concerns emerge | |
businessservices.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | WiscWeb | Monitor - check with owner if concerns emerge | |
compliance.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | WiscWeb | Monitor - check with owner if concerns emerge | |
doso.students.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | WiscWeb | Monitor - check with owner if concerns emerge | |
ehs.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | WiscWeb | Monitor - check with owner if concerns emerge | |
hr.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | Web Hosting | Monitor - check with owner if concerns emerge | |
internationaltravel.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | WiscWeb | Monitor - check with owner if concerns emerge | |
iss.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | WiscWeb | Monitor - check with owner if concerns emerge | |
mynetid.wisc.edu/activate | Non-critical; high visibility - priority 3 | WPS proactive monitoring - spring admissions | DoIT IAM | DoIT - team unspecified | Monitor - check with owner if concerns emerge | Required step for applicants and accepted admits |
ohr.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | Web Hosting | Monitor - check with owner if concerns emerge | |
studyabroad.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | WiscWeb | Monitor - check with owner if concerns emerge | |
union.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | Web Hosting | Monitor - check with owner if concerns emerge | |
vote.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu | | DoIT - Linux team | Monitor - check with owner if concerns emerge | |
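If the site inventory above were kept in a machine-readable form, the follow-up order could be derived rather than read off by hand. The sketch below is illustrative only: the `Site` class and helper names are hypothetical, only a handful of rows are copied from the table, and treating any action other than "No Action Needed" as needing follow-up is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    priority: int  # 1, 2 or 3, taken from the Category column
    hosting: str   # "Where hosted" column
    action: str    # "Action Items" column


# A few rows copied from the table; the remaining rows are omitted for brevity.
SITES = [
    Site("alerts.wisc.edu", 1, "AWS UMark Account, high-availability", "No Action Needed"),
    Site("instructionalcontinuity.wisc.edu", 1, "WiscWeb", "WPS moving to HA"),
    Site("kb.wisc.edu", 1, "KB high-availability servers", "No Action Needed"),
    Site("outages.doit.wisc.edu", 2, "Dedicated VM (but not High Availability)", "Check with owner"),
    Site("hr.wisc.edu", 3, "Web Hosting", "Monitor - check with owner if concerns emerge"),
]


def needs_follow_up(site: Site) -> bool:
    """Assumption: anything other than 'No Action Needed' needs follow-up."""
    return site.action != "No Action Needed"


def follow_up_queue(sites: list[Site]) -> list[Site]:
    """Sites needing follow-up, highest priority (lowest number) first."""
    pending = (s for s in sites if needs_follow_up(s))
    return sorted(pending, key=lambda s: (s.priority, s.name))
```

For the sample rows, `follow_up_queue(SITES)` lists instructionalcontinuity.wisc.edu first, then outages.doit.wisc.edu, then hr.wisc.edu.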
V. Checklists and record templates
Checklists referenced in the DoIT Continuity of Operations (COOP) Plan:
Record templates
A. COOP After-Action worksheet
Use this worksheet as a guide for completing the After-Action Report. Modify or add steps as the situation warrants.
LEAD: ___________________________
Assigned | Tasks / Decisions | Completed |
---|---|---|
| Assemble group to review effectiveness of COOP plan and operations - What feedback did we get from people and agencies we worked with? | |
| Distribute AAR to Executive Management Team | |
| If needed, schedule training for staff | |
| Revise COOP Manual to incorporate changes | |
| Write After-Action Report (AAR) of COOP activation - Decisions and actions taken by CIC | |
B. COOP deactivation checklist
COOP deactivation process checklist
In coordination with the Emergency Management Team, the Incident Commander will determine when to deactivate the COOP plan.
Date/Time | Decisions/Tasks | Assigned to |
---|---|---|
| Develop a communications plan to inform all appropriate parties of the COOP deactivation. | |
| Inform the reconstitution manager that the department will be returning to normal operations. The reconstitution manager will work with the reconstitution team to facilitate the process. | |
| If necessary, assign a relocation team to begin the process of moving from the alternate site to the permanent facility. | |
C. COOP log of contacts made
Log of contacts made
Group = ____________
Name | Reason for call (notification, update work schedule, etc.) | Contact Method (phone #, email address, etc.) | Date/Time Contacted | Date/Time Call Returned |
---|---|---|---|---|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
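Teams that keep the contact log electronically could model each row as a small record. The sketch below is one possible shape, not a DoIT tool: the `ContactLogEntry` class, its field names, and the helper functions are hypothetical and simply mirror the columns of the log above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import datetime
from typing import Optional


@dataclass
class ContactLogEntry:
    group: str                               # "Group = ____" line of the log
    name: str                                # "Name" column
    reason: str                              # "Reason for call" column
    contact_method: str                      # "Contact Method" column
    contacted_at: datetime                   # "Date/Time Contacted" column
    returned_at: Optional[datetime] = None   # "Date/Time Call Returned" column


def awaiting_reply(entries: list[ContactLogEntry]) -> list[ContactLogEntry]:
    """Entries whose call has not been returned yet."""
    return [e for e in entries if e.returned_at is None]


def to_csv(entries: list[ContactLogEntry]) -> str:
    """Serialize the log as CSV text, one row per contact attempt."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ContactLogEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()
```

Leaving `returned_at` empty until a call is returned makes it easy to list who still needs a second attempt.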
D. Reconstitution process log
Use this log to track tasks performed during the reconstitution of DoIT Staff and facilities, after the initial response to a severe incident.
Reconstitution process log
Date/Time | Tasks | Recovery Time Objective (RTO) | Assigned to |
---|---|---|---|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
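When the reconstitution log is tracked electronically, each task can be checked against its RTO automatically. The sketch below is a minimal illustration with hypothetical names. It assumes the RTO clock starts at the logged Date/Time, which the Plan does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ReconstitutionTask:
    task: str                                # "Tasks" column
    assigned_to: str                         # "Assigned to" column
    logged_at: datetime                      # "Date/Time" column
    rto: timedelta                           # "Recovery Time Objective (RTO)" column
    completed_at: Optional[datetime] = None  # not a column in the log; added for tracking

    @property
    def due_at(self) -> datetime:
        # Assumption: the RTO is measured from the time the task was logged.
        return self.logged_at + self.rto

    def is_overdue(self, now: datetime) -> bool:
        """True if the task finished, or is still open, after its RTO deadline."""
        finished = self.completed_at if self.completed_at is not None else now
        return finished > self.due_at
```

A task logged at 08:00 with a four-hour RTO is due at 12:00; it is flagged only if it is still open, or was completed, after that time.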
E. COOP family readiness resources
Here are some useful resources available online, which can be provided to staff:
- Ready - a US government site
- Red Cross preparedness planning
- State Farm - Teaching Your Kids Emergency Preparedness
Executive Management Team
Details for the roles on the Executive Management Team.
F. DoIT COOP Incident Commander (CIC) detail
G. Executive Management Team checklist detail
Executive Management Team checklist detail - also see KB 97141
Executive Management Team checklist detail
Tasks | Point of contact | Date/Time completed |
---|---|---|
Phase 1: Initial notification and response | | |
Executive Management Team initiates recovery procedures | | |
Establishes the DoIT Emergency Operations Center (EOC) | | |
Declares an incident, if warranted | | |
Determines where recovery will take place. | | |
Initiates communications procedures. | | |
Activates Incident Response teams and ensures teams are properly staffed. See DoIT COOP, Section 3 - Roles and Responsibilities | | |
Phase 2: Assessment and activation | | |
Executive Management Team receives the following reports from IT service recovery teams, and provides guidance and decisions as needed. | | |
Initial damage assessments | | |
| | |
| | |
| | |
Detailed damage assessments | | |
| | |
| | |
Recovery estimates | | |
| | |
| | |
Additional comments: | | |
Phase 3: Recovery | | |
Executive Management Team manages ongoing prioritization of service recovery and resolution of issues. | | |
Additional comments: | | |
H. Communications consultant checklist detail
Communications consultant checklist detail - also see KB 97318
Communications consultant checklist detail
Tasks |
Point of contact |
Date/Time completed |
---|---|---|
Phase 1: Initial notification and response |
||
Communications consultant team manages communications with campus |
||
|
|
|
|
|
|
|
|
|
|
||
|
|
|
Communications consultant team and Help Desk team coordinate messages for in-bound and out-bound contacts. |
||
Phase 2: Assessment and activation |
||
Communications consultant team manages ongoing communications with campus Operations team, Help Desk team and Communications consultant team. |
||
|
|
|
|
|
|
Additional comments:
|
|
|
Phase 3: Recovery |
||
Communications consultant team manages ongoing communications with campus: |
||
|
||
|
||
|
||
Additional comments:
|
|
|
Ongoing |
||
Communications Consultant is part of the Executive Management Team. See Executive Management Team checklist for details. |
I. Human Resources checklist detail
Human Resources checklist detail - also see KB 97144
Human Resources checklist detail
Tasks |
Point of contact |
Date/Time completed |
---|---|---|
Phase 1: Initial notification and response |
||
Incident Response team leaders prepare teams |
||
|
|
|
|
|
|
|
|
|
|
|
|
|
||
Human Resources team coordinates communication with staff, families, campus and state HR offices, and employee representatives. |
||
|
||
Additional comments |
||
Phase 2: Assessment and activation |
||
Human Resources team coordinates safety and support for staff. |
||
|
|
|
|
|
|
|
||
|
||
|
||
|
||
|
||
|
||
|
||
Additional comments:
|
|
|
Phase 3: Recovery |
||
Continued: Human Resources team coordinates safety and support for staff. |
||
Additional comments:
|
|
|
Ongoing
|
||
Human Resources team lead is part of the Executive Management Team. See Executive Management checklist for details. |
IT Service Recovery team
Details for the roles on the IT Service Recovery team.
J. IT Service Recovery Director checklist detail
IT Service Recovery Director checklist detail - also see KB 97328
Tasks |
Point of contact |
Date/Time completed |
---|---|---|
Phase 1: Initial notification and response |
||
IT Service Recovery Director oversees the SNCC and SEO initial response to incident. |
||
SNCC follows standard procedures |
||
|
|
|
|
|
|
|
||
SEO follows procedures |
||
|
||
IT Service Recovery Director ensures orderly communications among IT service recovery teams throughout the recovery process. |
||
Facility team instructions |
||
|
||
Phase 2: Assessment and activation |
||
IT Service Recovery Director identifies possible security risks to be assessed by the Security Consultant. |
||
IT Service Recovery Director oversees Service Recovery teams during initial damage assessment process: |
||
Facility team instructions |
||
Secures the facility for assessment by emergency services.
|
|
|
Conducts preliminary damage assessment of primary data center structure and utility equipment.
|
||
Makes a preliminary estimate of replacement and repair time including ordering, shipping, installation, and testing. |
||
Avoids further damage to the site or equipment and ensures Risk Management has noted damage to equipment before it is moved. |
||
Reports initial damage assessment findings to Executive Management Team within two hours of the declared disaster, and estimates time to complete detailed assessment. |
||
Security consultant / team instructions |
||
Assesses facility access needs and restrictions in consultation with Facility team:
|
||
Responds to urgent security incidents, as necessary. |
||
Facilitates IT security risk assessments on issues identified by the IT Service Recovery Team Director. |
||
Hardware, Network and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment. |
||
Hardware/network/infrastructure instructions |
||
|
||
|
||
|
||
|
||
IT Service Recovery Director oversees service recovery teams during detailed damage assessment process: |
||
Facility team instructions |
||
Facility team conducts detailed assessment of primary data center structure and utility equipment:
|
||
Hardware team instructions |
||
Hardware team conducts detailed assessment of network and infrastructure equipment:
|
||
Security team instructions |
||
Security Consultant:
|
||
Network/infrastructure/application services teams instructions |
||
|
||
Operations/Help Desk instructions |
||
|
||
Additional comments:
|
|
|
Phase 3: Recovery |
||
IT Service Recovery Director oversees service recovery teams during the recovery process: |
||
Facility team instructions |
||
Facility team prepares the recovery site:
|
||
Security team instructions |
||
Security consultant:
|
||
IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site: |
||
Details
|
||
Infrastructure team: |
||
|
||
Application services team: |
||
|
||
Hardware team: |
||
|
||
Operations team: |
||
|
||
Help Desk team: |
||
|
||
Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation. |
||
Hardware team oversees installation of hardware: |
||
|
||
Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately. |
||
|
||
Operations team supports IT service recovery teams in restoring operating systems, applications and data. |
||
|
||
Help Desk team: |
||
|
||
Additional comments:
|
|
|
Ongoing |
||
IT Service Recovery Director reports to and is a part of the Executive Management Team. See Executive Management checklist for details. |
K. Application Services checklist detail
Application Services checklist detail - also see KB 97623
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial Notification and Response |
||
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
|
Network, Infrastructure and Application Services teams estimate service recovery time and report findings to Executive Management Team |
|
|
Additional comments:
|
|
Phase 3: Recovery |
||
IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site: |
||
Application Services team: |
||
|
Manages priorities for service recovery as directed by Executive Management Team. |
|
|
Recovers applications and data according to detailed recovery plans, which are maintained separately. Documentation may include:
|
|
|
Validates and supports active IT services and data hosted at the recovery site. |
|
|
Identifies any hardware missing from pre-established service components hosted at the recovery site.
|
|
|
Performs integration testing in cooperation with the Network and Infrastructure teams. |
|
|
Coordinates customer acceptance testing and troubleshooting for services and data. |
|
|
Tests printing services recovery. |
|
|
Tests recovered systems for security vulnerabilities in cooperation with Security Consultant. |
|
|
Coordinates other recovery activities with Network, Infrastructure and Hardware teams as needed. |
|
|
Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the Communications Consultant for appropriate dissemination. |
|
|
Identifies and immediately escalates issues to Executive Management Team for decisions. |
|
IT service recovery teams restore all other levels of service: |
||
Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation. |
||
|
Hardware team oversees installation of hardware:
|
|
Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately. Work with Operations team to return services to fully operational status. |
||
|
Operations team supports IT service recovery teams in restoring operating systems, applications and data.
|
|
|
Additional comments:
|
|
L. Facility checklist detail
Facility checklist detail - also see KB 97143
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial Notification and Response |
||
Incident response team leaders prepare teams |
||
|
||
Assemble teams at suitable locations. |
||
Report situation. |
||
Review team responsibilities and functions. |
||
Prioritize and direct next actions. |
||
Facility team manages communications with emergency services, Physical Plant and other facility recovery personnel. |
||
Additional comments:
|
||
Phase 2: Assessment and activation |
||
Facility team secures the facility for assessment by emergency services. |
||
Works with the following groups to ensure physical safety and building security, prior to further damage assessment by DoIT personnel:
|
||
Plans adjustments to facility access needs and restrictions during damage assessment with the Security Consultant.
|
||
Facility team conducts preliminary damage assessment of primary data center structure and utility equipment. |
||
Works with the following team leads and support personnel to review and coordinate damage assessment process:
|
||
Makes a preliminary estimate of replacement and repair time including ordering, shipping, installation, and testing. |
||
Avoids further damage to the site or equipment and ensures Risk Management has noted damage to equipment before it is moved |
||
Reports initial damage assessment findings to Executive Management Team within two hours of the declared disaster, and estimates time to complete detailed assessment. |
||
Facility team conducts detailed assessment of primary data center structure and utility equipment |
||
Uses facility damage checklists generated from the Configuration Management Data Base (CMDB) and the Facility Planning & Management facility database. |
||
Avoids further damage to the site(s) and utility equipment. Ensures Risk Management has noted damage to equipment before it is moved. |
||
Assesses structural damage to data centers. |
||
Conducts assessment of individual equipment components to determine which items will be used or salvaged. Consults with Procurement team as necessary. |
||
Estimates replacement and repair time including ordering, shipping, installing and testing equipment |
||
Tags all usable equipment. |
||
Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company |
||
Reports findings to Executive Management Team. |
||
Sends detailed damage assessment lists to Procurement team. |
||
Additional comments:
|
||
Phase 3: Recovery |
||
Facility team prepares the recovery site |
||
Ensures adequate protection for equipment at the site before beginning any structural modifications. |
||
Contacts Physical Plant to begin any necessary modifications. |
||
Informs primary data center and recovery site Building Managers when work will be started. |
||
Ensures that the recovery site is suitable for basic enterprise hosting prior to the delivery of equipment: |
||
Verifies baseline levels of power, cooling, security, and structural safety |
||
Coordinates ongoing facility access needs and restrictions during recovery with the Security Consultant. |
||
Additional comments: |
M. Hardware checklist detail
Hardware checklist detail - also see KB 97320
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial Notification and Response |
||
Incident Response Team Leaders prepare teams |
||
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
Phase 2: Assessment and activation |
||
Hardware, Network and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment. |
||
|
Coordinate damage assessment with facility Building Manager, University Risk Management and vendor personnel as needed.
|
|
|
Use hardware damage checklists generated from the Configuration Management Data Base (CMDB) and the Network Change Management System (CMS). |
|
|
Report initial damage assessment findings to Executive Management Team within two hours of the declared disaster |
|
|
Hardware team estimates time to complete detailed assessment |
|
Hardware team conducts detailed assessment of network and infrastructure equipment: |
||
|
Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary |
|
|
Estimates replacement and repair time including ordering, shipping, installing and testing equipment. |
|
|
Tags all usable equipment. |
|
|
Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company |
|
|
Reports findings to Executive Management Team |
|
|
Sends detailed damage assessment lists to Procurement team. |
|
|
Additional comments:
|
|
Phase 3: Recovery |
||
IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site: |
||
Hardware team |
||
|
Receives critical hardware requests from IT service recovery teams and expedites provisioning of replacement hardware. |
|
|
Works closely with Procurement team on all hardware purchases. |
|
|
Identifies and immediately escalates issues to Executive Management Team for decisions. |
|
IT service recovery teams restore all other levels of service: |
||
Hardware team |
||
|
Communicates hardware requirements to Procurement team. |
|
|
Contacts vendors for repair assistance. Oversees vendor diagnostics on equipment to ensure full reliability prior to installation at the recovery site. |
|
|
Coordinates installation of hardware equipment at recovery site. |
|
|
Works with Network, Infrastructure and Application Services teams to ensure all utility and access requirements are met and that cable connectors are correct. |
|
|
Additional comments:
|
|
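The Phase 2 steps in the checklist above (tag usable equipment, separate it from salvage only after Risk Management clearance, and report initial findings within two hours of the declared disaster) can be sketched as a small data model. This is an illustrative sketch only: the class, field and function names are hypothetical and do not reflect the actual CMDB or CMS schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Reporting window taken from the checklist: initial findings are due
# within two hours of the declared disaster.
INITIAL_REPORT_WINDOW = timedelta(hours=2)


@dataclass
class EquipmentItem:
    """One line of a hardware damage checklist (hypothetical fields)."""
    name: str
    usable: bool          # outcome of the detailed assessment
    tagged: bool = False  # set once the item has been tagged as usable


def initial_report_deadline(declared_at: datetime) -> datetime:
    """Latest time to report initial findings to the Executive Management Team."""
    return declared_at + INITIAL_REPORT_WINDOW


def separate_equipment(items, risk_management_cleared: bool):
    """Tag usable items, then split usable from salvage once clearance is given.

    Returns (usable, salvage). Raises PermissionError if Risk Management
    has not yet given clearance.
    """
    for item in items:
        if item.usable:
            item.tagged = True
    if not risk_management_cleared:
        raise PermissionError("Risk Management clearance is required before separation")
    usable = [i for i in items if i.usable]
    salvage = [i for i in items if not i.usable]
    return usable, salvage
```

Tagging is allowed before clearance in this sketch because the checklist lists it as a separate, earlier step; only the physical separation is gated.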
N. Infrastructure checklist detail
Infrastructure checklist detail - also see KB 97322
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial Notification and Response |
||
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
Hardware, Network and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment. |
||
|
Coordinate damage assessment with facility Building Manager, University Risk Management and vendor personnel as needed.
Note: Utility equipment is a Facility team responsibility; supplies and furniture are a Logistics team responsibility. |
|
|
Use hardware damage checklists generated from the Configuration Management Data Base (CMDB) and the Network Change Management System (CMS). |
|
|
Report initial damage assessment findings to Executive Management Team within two hours of the declared disaster. |
|
|
Hardware team estimates time to complete detailed assessment. |
|
Hardware team conducts detailed assessment of network and infrastructure equipment: |
||
|
Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary |
|
|
Estimates replacement and repair time including ordering, shipping, installing and testing equipment. |
|
|
Tags all usable equipment. |
|
|
Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company |
|
|
Reports findings to Executive Management Team |
|
|
Sends detailed damage assessment lists to Procurement team. |
|
Network, Infrastructure and Application Services teams estimate service recovery time and report findings to Executive Management Team |
||
|
Additional comments:
|
|
Phase 3: Recovery |
||
IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site: |
||
Infrastructure team: |
||
|
Manages priorities for service recovery as directed by Executive Management Team. |
|
|
Recovers applications and data according to detailed recovery plans, which are maintained separately. Documentation may include:
|
|
|
Validates and supports active IT services and data hosted at the recovery site. |
|
|
Identifies any hardware missing from pre-established service components hosted at the recovery site.
|
|
|
Recovers security controls in consultation with the security consultant
|
|
|
Performs integration testing with the Network team. |
|
|
Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the Communications Consultant for appropriate dissemination. |
|
|
Identifies and immediately escalates issues to Executive Management Team for decisions. |
|
IT service recovery teams restore all other levels of service: Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation. |
||
|
Hardware team oversees installation of hardware:
|
|
Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately. Work with Operations team to return services to fully operational status. |
||
|
Operations team supports IT service recovery teams in restoring operating systems, applications and data.
|
|
|
Additional comments:
|
|
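Phase 3 in the checklists restores services in two waves: first the critical Level 1 and Level 2 services that have pre-established components hosted at the recovery site, then all other levels of service. A minimal sketch of that split follows. The `Service` record and its fields are hypothetical, and ordering each wave by level and name is an assumption made here for illustration; actual priorities are set by the Executive Management Team.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Hypothetical record for one IT service."""
    name: str
    level: int                         # service level, 1 = most critical
    prestaged_at_recovery_site: bool   # components already hosted at the recovery site


def recovery_waves(services):
    """Split services into the two recovery waves described in Phase 3.

    Wave 1: Level 1 and Level 2 services with pre-established components
            hosted at the recovery site.
    Wave 2: every other service.
    """
    def in_first_wave(s):
        return s.level <= 2 and s.prestaged_at_recovery_site

    def order(s):
        # Assumed ordering for illustration: by level, then by name.
        return (s.level, s.name)

    first = sorted((s for s in services if in_first_wave(s)), key=order)
    rest = sorted((s for s in services if not in_first_wave(s)), key=order)
    return first, rest
```

Note that a Level 2 service without pre-staged components falls into the second wave, which matches the wording of the checklist.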
O. Network checklist detail
Network checklist detail - also see KB 97337
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial Notification and Response |
||
|
||
Assemble teams at suitable locations. |
||
Report situation. |
||
Review team responsibilities and functions. |
||
Prioritize and direct next actions. |
||
If necessary, reference the Network-specific COOP Plan in the DR folder “wiscit” on BOX |
||
Additional comments: |
||
Phase 2: Assessment and activation |
||
Hardware, Network, and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment. |
||
Coordinate damage assessment with facility Building Manager, University Risk Management and vendor personnel as needed.
Note: Utility equipment is a Facility team responsibility; supplies and furniture are a Logistics team responsibility. |
||
|
||
Report initial damage assessment findings to Executive Management Team within two hours of the declared disaster. |
||
Hardware team conducts detailed assessment of network and infrastructure equipment. |
||
Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary. |
||
Estimates replacement and repair time including ordering, shipping, installing and testing equipment. |
||
Tags all usable equipment. |
||
Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company. |
||
Reports findings to Executive Management Team |
||
Sends detailed damage assessment lists to Procurement team. |
||
Network, Infrastructure and Application Services teams estimate service recovery time and report findings to Executive Management Team. |
||
Additional comments:
|
||
Phase 3: Recovery |
||
IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site: |
||
Network team |
||
Manages priorities for network recovery as directed by the Executive Management Team. |
||
Manages recovery of network components (switches, routers, network and security configurations) according to detailed service recovery plans. This documentation is maintained separately, and may include:
|
||
Identifies any hardware missing from pre-established service components hosted at the recovery site.
|
||
Coordinates with vendors for recovery of circuits and remote site connectivity. |
||
Performs integration testing with Infrastructure and Application Services teams. |
||
Works with the security consultant to test recovered systems for security vulnerabilities. |
||
Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the communications consultant team for appropriate dissemination. |
||
Identifies and immediately escalates issues to Executive Management Team for decisions. |
||
IT service recovery teams restore all other levels of service: Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation. |
||
Hardware team oversees installation of hardware:
|
||
Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately. Work with Operations team to return services to fully operational status. |
||
Operations team supports IT service recovery teams in restoring operating systems, applications and data.
|
||
Additional comments:
|
P. Operations checklist detail
Operations checklist detail - also see KB 97363
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial Notification and Response |
||
SNCC follows Standard Procedures. Senior Systems and Network Control Center (SNCC) Staff Member on duty responds to the incident: |
||
|
Detects incident directly or via monitoring device. |
|
|
Assesses extent of incident. If emergency services are required:
|
|
|
Manages emergency shutdown of services if possible. |
|
|
Contacts SEO Duty Manager with a situation report. |
|
SEO Duty Manager: |
||
|
Accounts for all personnel who were on site during any incident jeopardizing human safety, and notifies Human Resources. |
|
|
Follows "SEO Vertical Escalation and Notification" procedure and engages responsible Situation Manager per KB #3632 (SEO Internal KnowledgeBase). |
|
|
Engages SEO On-Call Technologists per KB #80924 (SEO Internal KnowledgeBase). |
|
SEO Duty Manager and SEO Duty Technologist consult with SEO Director, if possible. |
||
|
Incident response team leaders prepare teams: |
|
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
Operations team, Help Desk team and communications consultant team: |
||
|
Assess options for handling high call volumes and coordinate solution. |
|
|
Coordinate ongoing communications to various stakeholders. |
|
|
Additional comments
|
|
Phase 3: Recovery |
||
IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:
Operations team: |
||
|
Operates equipment. |
|
|
Supports handling/redirection of incoming customer calls to the SNCC. |
|
|
Identifies and immediately escalates issues to Executive Management Team for decisions. |
|
IT service recovery teams restore all other levels of service: Operations team supports IT service recovery teams in restoring operating systems, applications and data. |
||
|
Starts and operates recovered hardware. |
|
|
Supports handling/redirection of incoming customer calls to the SNCC. |
|
|
Operations team leader oversees startup according to priorities set by Executive Management Team:
|
|
|
Additional comments
|
|
Q. Help Desk checklist detail
Help Desk checklist detail - also see KB 97321
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial notification and response |
||
Incident response team leaders prepare teams: |
||
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
|
If deemed necessary, reference the Network-specific COOP Plan in the DR folder “wiscit” on Box |
|
Communications consultant team and Help Desk team coordinate messages for in-bound and out-bound contacts. |
||
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
Operations team, Help Desk team and communications consultant team: |
||
|
Assess options for handling high call volumes and coordinate solution. |
|
|
Coordinate ongoing communications to various stakeholders. |
|
|
Additional comments:
|
|
Phase 3: Recovery |
||
IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site: |
||
Help Desk team: |
||
|
Provides official messages from communications consultant team to customers. |
|
|
Reports incident level and severity on Help Desk (DoIT) and UW-Madison IT Outage pages. |
|
IT service recovery teams restore all other levels of service: Help Desk team: |
||
|
Provides official messages from communications consultant to customers. |
|
|
Provides ongoing updates on incident level and severity on Help Desk (DoIT) and UW-Madison IT Outage pages |
|
|
Restores IT services for the DoIT Help Desk |
|
|
As needed, provides alternate Help Desk support according to Help Desk recovery plan, which is maintained separately.
|
|
|
Additional comments
|
|
R. Physical Security Consultant checklist detail
Physical Security Consultant checklist detail - also see KB 97376
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial Notification and Response |
||
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
Security Consultant: |
||
|
Assesses facility access needs and restrictions in consultation with Facility team:
|
|
|
Responds to urgent security incidents, as necessary. |
|
|
Facilitates IT security risk assessments on issues identified by the IT Service Recovery Team Director. |
|
Phase 3: Recovery |
||
Security Consultant: |
||
|
Oversees ongoing implementation of facility access requirements in consultation with Facility team:
|
|
|
Investigates IT security incidents that caused the disaster or are preventing the recovery of systems and services. |
|
|
Works with IT service recovery teams to recover centralized security tools and resources.
|
|
|
Recovers network security configurations (e.g. CSA). |
|
|
Provides authorization services for applications as needed. |
|
|
Additional comments
|
|
Ongoing: |
||
Security consultant is part of the Executive Management Team. See Executive Management checklist for details. |
||
|
Additional comments
|
|
S. Cybersecurity Incident Response Procedures
See Cybersecurity Incident Response Procedures - authorization required for access.
Administrative team
Details for the roles on the Administrative team.
T. Administrative Director checklist detail
Administrative Director checklist detail - also see KB 99675
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial notification and response |
||
During initial response, SNCC Manager or SEO Director contacts and assembles Team Leaders and Executive Management Team. SNCC NOC Manager or SEO Director activates crisis management procedures: |
||
|
Assembles Executive Management Team members at emergency operations center.
|
|
|
Contacts Building Manager at recovery site for appropriate access to the building.
Assesses extent of incident.
|
|
|
Activates damage assessment procedures (see under Phase 2 - Assessment and activation). |
|
|
Activates Human Resources team to support staff and families. |
|
|
Notifies Insurance team lead. |
|
Incident response teams communicate with stakeholders: |
||
|
Insurance team handles communications with campus Risk Management. |
|
|
Procurement team handles communication with vendors and University/State procurement offices. |
|
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
Administrative Director oversees Administrative teams during damage assessment process. Logistics team assesses damage to furniture and supplies: |
||
|
Works with Facility team to conduct damage assessments. |
|
|
Assesses damage to supplies including tapes, miscellaneous equipment, support media and materials. |
|
|
Sends detailed damage assessment lists to Procurement team to arrange for replacements. |
|
Procurement team consults with Facility, Hardware and Logistics teams during damage assessment: |
||
|
Facility team conducts detailed assessment of primary data center structure and utility equipment:
|
|
|
Hardware team conducts detailed assessment of network and infrastructure equipment:
|
|
|
Logistics team assesses damage to furniture and supplies:
|
|
|
Insurance team manages financial loss:
|
|
|
Finance team:
|
|
|
Additional comments
|
|
Phase 3: Recovery |
||
Administrative Director oversees Administrative teams during recovery. Procurement team purchases replacement hardware: |
||
|
Receives detailed damage assessment lists from Hardware, Facility and Logistics teams. |
|
|
Reviews State of Wisconsin, University System and UW-Madison contract records to determine current contract vendors. |
|
|
Coordinates team member activities to minimize duplication of vendor contacts. |
|
|
Reviews any quick-ship contracts DoIT may have established as a result of individual service component recovery plans.
|
|
|
Divides all equipment for which there is no quick-ship contract into three categories based on the following definitions:
|
|
|
Establishes priorities for acquiring equipment. |
|
|
Contacts vendors to determine availability of equipment, costs, and expected delivery dates.
|
|
|
Category 3 Equipment: Once a vendor has been identified, the Hardware team leader provides Procurement team with detailed list of equipment to be ordered, identifying it as a category 3 purchase. Procurement team then expedites the Governor's waiver process and proceeds with acquisitions after waivers are granted. |
|
|
For equipment purchased on the used market, Procurement team ensures that third-party vendors provide installation contracts and certify equipment to be fully operable and eligible for maintenance agreements. |
|
Logistics team arranges equipment and supplies transport: |
||
|
If equipment is to be moved off-site, Logistics team leader secures temperature-controlled moving vans for equipment transfer. |
|
|
Delivers offsite materials
|
|
|
Transports salvageable furniture and supplies to recovery site. |
|
|
Additional comments
|
|
Ongoing: |
||
Administrative Team Director reports to and is a part of the Executive Management Team. See Executive Management checklist for details |
U. Finance checklist detail
Finance checklist detail - also see KB 97319
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial notification and response |
||
Incident response team leaders prepare teams: |
||
|
||
Assemble teams at suitable locations. |
||
Report situation. |
||
Review team responsibilities and functions. |
||
Prioritize and direct next actions. |
||
Additional comments: |
||
Phase 2: Assessment and activation |
||
Finance team: |
||
Manages incident response expenditures. |
||
Accounts for incident response expenses. |
||
Additional comments
|
V. Insurance checklist detail
Insurance checklist detail - also see KB 97324
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial notification and response |
||
Incident response team leaders prepare teams: |
||
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
Incident response teams communicate with stakeholders: |
||
|
Insurance team handles communications with Risk Management. |
|
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
Insurance team manages financial loss detail: |
||
|
Coordinates damage assessment with Facility team. |
|
|
Works closely with Risk Management on any expenditures related to the claim to ensure that knowledgeable decisions are made.
|
|
|
Conducts a loss determination for building repairs and damaged equipment including depreciation, departmental labor, and supplies. |
|
|
Additional comments
|
|
W. Logistics checklist detail
Logistics checklist detail - also see KB 97335
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial notification and response |
||
Incident response team leaders prepare teams: |
||
|
|
|
|
Assemble teams at suitable locations. |
|
|
Report situation. |
|
|
Review team responsibilities and functions. |
|
|
Prioritize and direct next actions. |
|
|
Additional comments:
|
|
Phase 2: Assessment and activation |
||
Logistics team assesses damage to furniture and supplies: |
||
|
Works with Facility team to conduct damage assessments. |
|
|
Assesses damage to supplies including tapes, miscellaneous equipment, support media and materials. |
|
|
Sends detailed damage assessment lists to Procurement team to arrange for replacements. |
|
|
Additional comments
|
|
Phase 3: Recovery |
||
Logistics team arranges equipment and supplies transport: |
||
|
If equipment is to be moved off-site, Logistics team leader secures temperature-controlled moving vans for equipment transfer. |
|
|
Delivers offsite materials
|
|
|
Transports salvageable furniture and supplies to recovery site. |
|
|
Additional comments
|
|
X. Procurement checklist detail
Procurement checklist detail - also see KB 97374
Date/Time completed |
Tasks |
Point of contact |
---|---|---|
Phase 1: Initial notification and response |
||
Incident response team leaders prepare teams: |
||
|
||
Assemble teams at suitable locations. |
||
Report situation. |
||
Review team responsibilities and functions. |
||
Prioritize and direct next actions. |
||
Incident response teams communicate with stakeholders: |
||
Insurance team handles communications with Risk Management. |
||
Additional comments: |
||
Phase 2: Assessment and activation |
||
Procurement team consults with Facility, Hardware and Logistics teams during damage assessment: |
||
Facility team conducts detailed assessment of primary data center structure and utility equipment: |
||
Uses facility damage checklists generated from the Configuration Management Data Base (CMDB) and the Facility Planning & Management facility database. |
||
Avoids further damage to the site(s) and utility equipment. Ensures Risk Management has noted damage to equipment before it is moved. |
||
Assesses structural damage to data centers. |
||
Conducts assessment of individual equipment components to determine which items will be used or salvaged. Consults with Procurement team as necessary. |
||
Estimates replacement and repair time including ordering, shipping, installing and testing equipment. |
||
Tags all usable equipment. |
||
Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company. |
||
Reports findings to Executive Management Team. |
||
Sends detailed damage assessment lists to Procurement team. |
||
Hardware team conducts detailed assessment of network and infrastructure equipment: |
||
Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary. |
||
Estimates replacement and repair time including ordering, shipping, installing and testing equipment. |
||
Tags all usable equipment. |
||
Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company. |
||
Reports findings to Executive Management Team |
||
Sends detailed damage assessment lists to Procurement team |
||
Logistics team assesses damage to furniture and supplies |
||
Works with Facility team to conduct damage assessments. |
||
Assesses damage to supplies including tapes, miscellaneous equipment, support media and materials. |
||
Sends detailed damage assessment lists to Procurement team to arrange for replacements. |
||
Additional comments |
||
Phase 3: Recovery |
||
Procurement team purchases replacement hardware: |
||
Receives detailed damage assessment lists from Hardware, Facility and Logistics teams. |
||
Reviews State of Wisconsin, University System and UW-Madison contract records to determine current contract vendors. |
||
Coordinates team member activities to minimize duplication of vendor contacts. |
||
Reviews any quick-ship contracts DoIT may have established as a result of individual service component recovery plans.
|
||
Divides all equipment for which there is no quick-ship contract into three categories based on the following definitions:
Hardware team leader must seek concurrence of the IT Service Recovery Team Director and the Procurement team leader for the purchase of Category 3 equipment. The negotiation/acquisition process for such equipment also requires that a Governor's waiver be granted. |
||
Establishes priorities for acquiring equipment. |
||
Contacts vendors to determine availability of equipment, costs, and expected delivery dates.
|
||
Category 3 Equipment: Once a vendor has been identified, the Hardware team leader provides Procurement team with detailed list of equipment to be ordered, identifying it as a category 3 purchase. Procurement team then expedites the Governor's waiver process and proceeds with acquisitions after waivers are granted. |
||
For equipment purchased on the used market, Procurement team ensures that third-party vendors provide installation contracts and certify equipment to be fully operable and eligible for maintenance agreements. |
||
Additional comments |
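The Category 3 rule in the checklist above requires three things before acquisition proceeds: concurrence of the IT Service Recovery Team Director, concurrence of the Procurement team leader, and a granted Governor's waiver. A minimal sketch of that gate is shown below. The `PurchaseRequest` type and its fields are hypothetical, and because the definitions of Categories 1 and 2 are not given here, the sketch simply assumes they need no additional approval.

```python
from dataclasses import dataclass


@dataclass
class PurchaseRequest:
    """Hypothetical request for equipment with no quick-ship contract."""
    description: str
    category: int                            # 1, 2 or 3
    director_concurs: bool = False           # IT Service Recovery Team Director
    procurement_lead_concurs: bool = False   # Procurement team leader
    governors_waiver_granted: bool = False


def can_proceed(req: PurchaseRequest) -> bool:
    """Return True when the acquisition may go ahead.

    Only the Category 3 rule comes from the checklist; Categories 1 and 2
    are assumed here to need no extra approval.
    """
    if req.category != 3:
        return True
    return (
        req.director_concurs
        and req.procurement_lead_concurs
        and req.governors_waiver_granted
    )
```

For example, a Category 3 request with both concurrences but no waiver yet returns `False`, reflecting that acquisitions proceed only after waivers are granted.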
VI. COOP Record Information
The DoIT COOP relies on expertise in many areas for its content. This section outlines its sources of reference and authority, records when COOP updates have been made, and describes the plans for COOP maintenance.
A. Authorities and References
FEDERAL AND COLLEGE AUTHORITIES
The DoIT Continuity of Operations Plan is guided by the following federal and college authorities.
- National Infrastructure Protection Plan - NIPP
- FEMA Continuity Guidance Circular
- NFPA 1600 Standard on Disaster/Emergency Management and Business Continuity Programs
- UW-400 University Response Plan (URP) - Police Department Policy 15.1
- UW System Policy 1033, Information Security: Incident Response
RESPONSE OPERATIONS REFERENCES
Additional planning documents that guide response operations and complement the Plan include:
- UW-Madison University Response Plan
- UWPD Emergency Exercise information
- UW-Madison Emergency Procedures Guide 2016
- Occupant Emergency Plans (OEPs) for DoIT-occupied locations. See Appendix G in this document for DoIT locations.
- DoIT ITSM tool WiscIT Powered by Cherwell, available via desktop client or via web
- DoIT Offsite Cloud Storage documents on BOX, which individuals can set up for their own Out-of-Band access methods
- DoIT Outage Communications checklist
- DoIT Emergency Contact List
- DoIT Offsite Alternative Work Location info
- DoIT Network Services COOP Plan
- DoIT Agreements and Vendors list
- DoIT Data Center Restart Order list
- DoIT Procedures for Declaring an Outage based on 1 or more Help Desk incidents
B. Record of COOP changes
Revision Date | Description of Change | Implemented by |
---|---|---|
Aug 2024 | Updated | |
Nov 2023 | Updated | |
Feb 2023 | Updates to reflect DoIT Communications style guidelines and general updates | J. Sutherland |
Oct - Dec 2022 | Updates per DoIT Directors' requests | J. Sutherland |
Apr 2021 - Jun 2021 | Draft - Adaptation of existing DoIT COOP Plan to new format. | J. Sutherland |
July 2021 | Draft - Revision of format, images, document flow | J. Sutherland |
Aug 2021 | Draft - Modified to include Cybersecurity sections. TOC updated. | J. Sutherland |
Sep 15, 2021 | Draft - Incorporation of UW System procedure 1031.B. for High Risk Data | J. Sutherland |
Dec 10, 2021 | Draft - Conversion to DoIT KnowledgeBase format | J. Sutherland |
Dec 16, 2021 | Draft - sections consolidated/re-ordered for flow; expanded detail in Order of Notification. | J. Sutherland |
Jan - Apr, 2022 | Draft - Updated links, standardized formats, organization for optimized KB presentation. Incorporated updates from key COOP stakeholders. | J. Sutherland |
C. COOP / Business Continuity Plan Maintenance
Activity | Led By | Frequency |
---|---|---|
Review and update the COOP facility location data | COOP Coordinator | Annually |
Review and update DoIT staff contact information | HR and Admin Assistants with COOP Coordinator | At least 1x annually |
Review and update contact information for response partners, vendors, and continuity facilities. | COOP Coordinator with Configuration Manager | Quarterly |
Maintain electronic versions of COOP that DoIT staff can access both onsite/offsite. | COOP Coordinator | Annually |
Revise COOP to address any identified gaps following activation or exercise (TTX) | COOP Coordinator | Upon completion of an exercise or real-world disruption |
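The fixed review frequencies in the maintenance table above can be turned into next-due dates with a few lines of date arithmetic. The sketch below is illustrative only: the activity labels are shortened, hypothetical keys, and "At least 1x annually" is treated as a 12-month interval. The event-driven revision (after an exercise or real-world disruption) has no fixed interval and is left out.

```python
from datetime import date

# Review intervals in months, taken from the maintenance table.
# Keys are shortened, hypothetical labels for the table's activities.
REVIEW_INTERVAL_MONTHS = {
    "facility location data": 12,        # Annually
    "staff contact information": 12,     # At least 1x annually
    "partner and vendor contacts": 3,    # Quarterly
    "electronic COOP copies": 12,        # Annually
}


def next_due(last_done: date, months: int) -> date:
    """Return the date `months` after `last_done`.

    The day of month is clamped to 28 so the result is always a valid
    calendar date (a simplification chosen for this sketch).
    """
    total = last_done.month - 1 + months
    year = last_done.year + total // 12
    month = total % 12 + 1
    return date(year, month, min(last_done.day, 28))
```

For instance, `next_due(date(2024, 1, 15), REVIEW_INTERVAL_MONTHS["partner and vendor contacts"])` gives the date three months after a quarterly review completed on 15 January 2024.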
Usage Type | Date | Scenario or Disruption Type |
---|---|---|
Tabletop Exercise | 2023-11-16 | Campus Active Directory cybersecurity compromise affecting the network and all cloud-based IT tools that use single sign-on (e.g. Zoom, Canvas, Google, Microsoft email/Teams, Webex) |
Tabletop Exercise | 2023-03-07 | Ransomware event encrypting Research Drive data. |
Tabletop Exercise | 2022-04-19 | Data Center flooding from nearby City water main break |
Activation | 2021-09-10 | Wireless Outage from vendor code issues |
Activation | 2020-03-12 - 2020-03-23 | COVID-19 Pandemic and initial response transforming UW-Madison to remote learning and business |
Tabletop Exercise | 2019-03-07 | Train Derailment, Toxic Spill, and Resulting Building Evacuation |
Activation | 2019-01-30 | Campus Closure and EOC activation due to Extreme "Polar Vortex" Weather Conditions |
Tabletop Exercise | 2018-10-24 | Pandemic creating Staff Shortage |
Tabletop Exercise | 2018-05-15 | Participation in "Dark Skies" cyber-attack scenario - a multi-county tabletop exercise (Brown, Calumet, Dane, Fond du Lac, Milwaukee, Outagamie and Winnebago Counties) which tested the abilities of private utilities, law enforcement, first responders and the National Guard to respond to the scenario as well as its second and third order effects. |
Tabletop Exercise | 2018-01-05 | Long Term Power Outage due to Multi-County Cyber Attack |
Tabletop Exercise | 2016-09 | Building Fire Alarm triggers Sprinkler Flooding |
Activation | 2015-09-26 | Dual Power Feed Outages while UPS generator undergoing maintenance |
Activation | 2014-06-18 | Power Outage due to Lightning Strike disabling UPS and Power |
Documentation and artifacts from COOP activation PIRs are located on the DoIT wiki.
See Also
- DoIT Operational Framework - Section 1.0 - Overview
- DoIT Operational Framework - Section 2.0 - Glossary of Significant Terms
- DoIT Operational Framework - Section 3.0 - Change Management
- DoIT Operational Framework - Section 4.0 - Incident Management
- DoIT Operational Framework - Section 5.0 - Configuration Management
- DoIT Operational Framework - Section 6.0 - Event Management
- DoIT Operational Framework - Section 7.0 - Problem Management
- DoIT Operational Framework (All Sections)
- Working with the Operational Framework (Policy)
- DoIT IT Service Management (ITSM) Resources
- The DoIT Operational Framework, ITIL & Service Management Contacts at DoIT
- Service Tiers (COOP Tiers)