DoIT Operational Framework - Section 8.0 - Continuity of Operations (COOP)

Each school, college and division within the University of Wisconsin System is required to provide a Continuity of Operations Plan (COOP). The campus-wide effort to develop these plans is led by the UW Police Department, with delegated responsibility for completion of the respective COOPs. To comply with COOP requirements, DoIT updated and expanded its Business Continuity Plan, which has a broader scope than the COOP. Though the distinction between these terms is important from a project management perspective, for the purposes of the documentation presented here the terms COOP and Business Continuity Plan are used interchangeably throughout. This document incorporates DoIT’s COOP and DoIT’s Business Continuity Plan.

This Continuity of Operations Plan is for internal DoIT use and UW-Madison Police Department use only. Please contact the COOP Coordinator for a redacted version if needed for any other purpose.

I. Introduction

The DoIT Continuity of Operations Plan (COOP) (the Plan) has been developed to protect the well-being of staff, students and visitors, as well as to mitigate risk to the continued operations of essential IT services and IT functions under disruptive emergency conditions for both campus and DoIT.

In conjunction with expertise from the University of Wisconsin–Madison Police Department (UWPD), the Plan provides DoIT with a framework and guidance for business continuity in the face of a natural or human-caused threat or major disruption of IT service operations for any reason.

 

A. Executive Overview

The Plan includes policies, practices, and procedures to ensure the continuity of DoIT operations before and after disruptive emergency conditions. Provisions of the Plan include:

»  Emergency procedures to follow in the event of a major disruption:

    • Event detection
    • Human resource safety and support
    • Property protection and security
    • Communications and overall direction of recovery procedures

» Organization of teams to manage response and recover IT Service operations

» Prioritization criteria for the recovery of IT services: Service Tiers (also known as COOP Tiers)

» Framework for support and recovery of all systems, applications, data and network services after a disruptive emergency

The Plan applies the designated prioritization criteria of Service Tiers (also known as COOP Tiers) to establish DoIT service priorities. These service priorities form a basis for identifying service needs and resolution goals (as defined in the DoIT Operational Framework – Section 4 – Incident Management, Section 4.2 – Incident Response Guidelines), capabilities for meeting these goals, and any associated gaps. Remediation of gaps, including prioritization of projects and associated funding, is a responsibility of DoIT leadership in consultation with DoIT governance groups and partners/customers.


Throughout this documentation, the terms COOP, Business Continuity Plan, and the Plan are used interchangeably.

B. Audience

The primary audiences for the Plan include DoIT management, technologists, and business administrators responsible for ongoing plan implementation, maintenance, testing, training and activation. Secondary audiences include customers of DoIT products and services, University administration, University Police responsible for Continuity of Operations Planning (COOP) and internal auditors.

C. Purpose

DoIT provides a wide range of technology services which are critical to the successful operation of the University. It is essential that DoIT ensures the availability and reliability of critical IT services within its purview.

The goals of the COOP are to:

» Provide for the safety of DoIT employees, students, and visitors in the event of an emergency

» Establish infrastructure for communications, command center, and alternate site work for DoIT, where needed, to support timely IT service recovery

» Focus on the Plan procedures necessary to maintain or resume IT services within reason and priority, thereby minimizing impact to campus operations and end users

» Prepare and provide advanced information and education to DoIT employees regarding their roles and responsibilities following an emergency disruption of DoIT operations

» Inform DoIT leadership for emergency IT service management and coordination

» Protect hosted data should an emergency disruption of operations occur

» Protect and minimize the potential loss of property, assets and resources during an emergency disruption

During an emergency, it is critical that DoIT prioritizes its activities to focus on the safety and welfare of its personnel while maintaining continuity of essential DoIT IT services and protecting high-risk data. The prioritization of essential IT functions may require delaying the restoration of non-essential IT functions. Services deemed essential IT functions may vary according to the nature of the disruptive emergency and campus needs.

The Plan delineates roles, responsibilities, and processes to be followed immediately after a major IT service disruption. The Plan is designed to be as threat-independent as possible, while at the same time allowing for flexibility when threat-specific response is needed.

All DoIT service management data pertinent to COOP is exported quarterly to an offsite secure cloud location. Critical information, such as the most current prioritized list of DoIT services, key vendor information, and staff contact information, is stored in three locations:

» In the DoIT Cherwell IT Service Management application, branded as WiscIT

» In a cloud storage location, on BOX

» On USB flash drives, distributed to key roles

D. Applicability and Scope

This overarching COOP documents a broad recovery framework for DoIT services and associated infrastructure. The scope of this Plan extends to any major event that threatens or disrupts normal DoIT service operations, DoIT data centers, network supernodes, and other critical IT infrastructure, including critical IT services provided by third parties, regardless of whether the disruption is human-caused, a technological infrastructure failure, or a natural disaster.

While the COOP framework may be used as a model for identifying disaster recovery requirements for each specific DoIT service, individual DoIT IT services may have business continuity plans specific to their areas. Recovery plans for each specific IT service are the responsibility of IT service owners, project managers and technical team leads. These should be referenced for more detailed information about respective operations.

Not all disruptions are managed through this Plan. Problems with a likely IT root cause impacting DoIT services are managed by the DoIT Operational Framework Problem Management procedures. Major disasters that have a broad impact on the campus at large, including health, safety, or order, are covered by UW-Madison Emergency Management procedures.

IT Service Problems which potentially jeopardize staff safety, physical access to or structural integrity of the data center facility, physical integrity of equipment, or major power outages of extended duration may constitute a need to activate the DoIT Continuity of Operations Plan.

E. Assumptions

The DoIT Plan is based on a realistic approach to problems likely to be encountered during a major disruption or emergency. The assumptions listed below should be used as general guidelines in such an event.

  • TIMING - An emergency or a disaster requiring activation of the COOP may occur at any time of the day or night, on a weekday, weekend, or holiday, with little or no warning.
  • VARIABILITY - It is assumed that every emergency is different, and a document such as the COOP Plan may not be able to be followed to the letter for each and every COOP activation. The Plan is meant as a guide for the UW-Madison Division of Information Technology.
  • NOT PREDICTABLE - The succession of events in an emergency or disaster is not predictable; therefore, published operational plans, such as this plan, should serve only as a guide and a checklist.
  • DEVELOPING CONDITIONS - An emergency or a disaster may be declared if information indicates that such conditions are developing and probable.
  • COMMUNITY - Disasters may be community-wide. Therefore, it is necessary for DoIT to plan for and carry out disaster response and short-term recovery operations in conjunction with other campus and local resources.
  • COMMUNICATION TOOLS - Communication and collaboration tools are available.
  • ELECTRICITY - There are still some power resources available for IT infrastructure on the campus.
  • SERVICE LOSS - There is a significant or total loss of one or more of the essential services or infrastructure of DoIT.
  • ALTERNATE SITE - One of the DoIT COOP alternate operation facilities is functional.
  • STAFFING - There is a significant, not total, loss of DoIT staff. DoIT staffing levels remain high enough to continue core essential IT services.
  • NETWORK - There is a significant, not total, loss of the DoIT administrative or information technology networks.
  • ASSISTANCE TO CAMPUS - The major role of DoIT in a COOP activation would be to assist and aid other campus and potential community groups in the event of an emergency. DoIT manages IT resources that can assist in campus disruptions.
  • FACILITY AVAILABILITY - Most of the facilities on campus have not been affected by the disaster.

II. COOP by Phase

COOP - Process Overview

Response activities to an emergency disruption can be classified in four phases.

Figure: COOP high-level flow of phases

Phase 1.0 - COOP Activation

Figure: COOP Activation, Phase 1 flow

The Activation phase includes procedures for disruption impact assessment, COOP activation, notification, leadership and initial actions. The following provides information for determining when and how to activate the Plan and provide immediate guidance to relevant parties.

DoIT COOP activation scalability

COOP activation is scalable. The extent of Plan activation depends on the scope of the disruption. In some cases, only specific continuity personnel may be activated to perform the tasks associated with impacted IT services, or all personnel that are telework-capable may be activated, while work that necessitates being onsite continues. In other cases, the disruption may warrant a full activation, where all personnel must conduct their activities offsite. Each COOP activation will be disruption-specific according to the impact or potential impact.

Levels of COOP activation are provided in Table 1.

  • Table 1: Levels of COOP Activation

      Level of Activation

      Action

      Examples

      Partial Activation

      Partial activation may focus on

      »      Assessing, servicing, restoring, or supporting impacts to building infrastructure or equipment, such as generators or back-up power

      »      Activation of Lines of Succession in the event that Continuity Personnel are on vacation, sick, or otherwise unavailable

      »      Impacts to a portion of a DoIT data center that do not impact the health or safety of building occupants

      »      Availability of IT resources or other IT related activities, such as retrieving back up files from other sites, or relying on manual documentation

      »      Impending winter weather event

      »      Loss of campus internet or network access

      »      Continuity Personnel are unavailable due to business travel

      Full Activation

      »      Complete evacuation of a DoIT office building or  facilities closure to non-continuity personnel or the implementation of lines of succession

      »      Events requiring containment, decontamination, and risk communications

      »      Long-term impacts to DoIT building infrastructure

      »      Significant events on or off campus resulting in the unavailability of large numbers of staff

      »      Large-scale influenza epidemic

      »      A Hazardous Materials event

      »      Building fire resulting in significant damage

       

    DoIT COOP coordination in multi-tenant buildings

    DoIT will coordinate with other occupants of DoIT buildings regarding activation of their respective COOP. In situations where there is no impact to health and safety of buildings that DoIT occupies, DoIT may continue normal operations, or conduct a partial activation.

    Notification

    Initial notification procedures are critical to personnel safety and orderly response to a major problem. The Senior Systems and Network Control Center (SNCC) staff on duty, in consultation with the SEO Duty Manager and the Systems Engineering & Operations (SEO) Director, are responsible for following the data center notification procedures described in step 1.1.1 of Table 3 in this DoIT COOP document.

    Order of Notification is detailed in Appendix C: Order of Notifications in this document.

    If an immediate danger exists within a DoIT building, a member of the Incident Management team will immediately notify the Building Manager that an incident has occurred and that action is being taken to respond to it. Building occupants should be directed according to the hazard-specific procedures outlined in the UW-Madison Police Emergency Procedures Guide. Note that certain incidents (e.g. a fire or active shooter event) may require immediate response/direction.

    Incident leadership and command center

    When a physical location for managing incidents is required, the Executive Management Team (EMT) designates a gathering space, or “command center”, where the team can coordinate activities - see Table 2: Command center locations. The command center should have available all necessary supplies, maps, key contact information, telephone lines and computers. The command center should also contain stand-alone copies of institute emergency policies and procedures.

    • Table 2: Command center locations

      Command Center Location

      List of Stored Supplies

      Primary Location

      DoIT location, communicated to responding staff, if secured from major disruption

      »      Facility Maps

      »      EOP or Evacuation Policies and Procedures

      »      Laptops and telephone access

      »      Staff contact information

      Secondary Location

      Office suite in recovery site building

      »      Facility Maps

      »      EOP or Evacuation Policies and Procedures

      »      Laptops and telephone access

      »      Staff contact information

    Orders of Succession

    During COOP operations, if members of the Executive Management Team or Incident Response team are unable or unavailable to fulfill essential duties, successors have been identified to ensure there is no lapse in essential decision-making authority. For more information see Appendix A: Orders of Succession.

    Delegation of Authority

    Members of the Executive Management Team or Incident Response team leaders have the authority to make critical decisions regarding response or continuity operations. Executive Management Team or Incident Response team leaders can designate individuals or positions, and their successors, to make these decisions in their stead if necessary. For more information see Appendix B: Delegation of Authority.

    Communications

    Communications and initial notification procedures are critical to personnel safety and orderly response to a disruptive emergency.

    For all detailed procedures, see

     

    Delivering messaging during a COOP activation

    During a COOP activation, continuity personnel providing messaging must:

    • Ensure that all external communication occurs through the DoIT Communications Director or delegate
    • Coordinate communication activities at facilities with UW Police, campus, DoIT Operations, and Cybersecurity
    • Ensure internal and external communications are accurate, timely, and informative
    • Provide frequent updates to staff and building occupants to mitigate concerns and manage expectations
    • Share only known/confirmed information (i.e., do not speculate)
    • Use one unified voice to avoid confusion or misinformation

    The DoIT Communications team should provide guidance to staff on where to direct incoming inquiries from personnel, media, vendors, etc. Incoming inquiries should be tracked so that the inquirers can be contacted for follow-up.

     

    Notifications regarding a possible physical relocation will be sent to all continuity personnel by the Communications Consultant. Supervisors and managers will provide information to non-continuity personnel regarding relocation or possible long-term work shortages at the guidance of the Executive Management Team and Human Resources team.

    Public relations

    The DoIT Communications Director, in consultation with the Executive Management Team, is responsible for public relations and all communication to the public during an active disruption. The DoIT COOP Communications Director/Consultant works in collaboration with the UW Emergency Management Unit and UW Communications. DoIT Communications can use the DoIT Outage Communications checklist, located offsite on BOX, for general guidance.


    Pre-Scripted Messaging

    DoIT Communications maintains templates of pre-scripted messages to be tailored and used as appropriate when the DoIT COOP is activated. Communications media may include but are not limited to: UW–Madison emergency web pages, the UW-Madison landing page, the Outages pages, email, Facebook, Twitter, and emergency text messaging through the RAVE system.

    Internal Forms of Communication

    Communications internal to DoIT may vary depending on technologies available and best suited to the emergency disruption. These may include but are not limited to: MS Teams’ DoIT Operations channel, Google, Out-of-Band chat rooms, phone bridges, telephone and cell phone communications, message relays via telephone trees, conference calls, texting, email, and staff messengers (“runners”) when physical presence does not risk staff life/health/safety.
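
Step 1.1.2 of the activation procedures (Table 3) names specific fallbacks for chat when the usual channel or NetID authentication is disrupted. As an illustration only, that ordering might be sketched in Python as follows. The `ChannelStatus` type, its field names, and `pick_internal_chat` are hypothetical and not part of any DoIT tooling. The Plan does not say what to do when both the Teams channel and NetID authentication are disrupted, so the handling of that case here is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelStatus:
    """Hypothetical availability flags, as judged by the person assessing the disruption."""
    teams_ops_channel_up: bool  # MS Teams DoIT Operations channel reachable
    netid_auth_up: bool         # NetID authentication working


def pick_internal_chat(status: ChannelStatus) -> str:
    """Return the chat medium suggested by the fallbacks in step 1.1.2."""
    if not status.teams_ops_channel_up:
        # Teams channel disrupted: fall back to the out-of-band Google chat room.
        # Assumption: this branch also covers the case where NetID is down too.
        return "OOB Google Chat Room (http://oob.wisc.edu)"
    if not status.netid_auth_up:
        # Teams is reachable but NetID login is not: use the non-NetID account
        # that is exercised in OOB Chat drills.
        return "MS Teams with non-NetID account (https://teams.microsoft.com)"
    # Nothing disrupted: stay on the normal channel.
    return "MS Teams DoIT Operations channel"


if __name__ == "__main__":
    print(pick_internal_chat(ChannelStatus(teams_ops_channel_up=True, netid_auth_up=False)))
```

A sketch like this is a memory aid at most; the DoIT COOP Incident Commander still decides the situationally appropriate medium (step 1.5.3).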

     

    COOP activation procedure detail

    Table 3: COOP Phase 1.0 - Activation Procedures

    COOP PHASE 1.0 - Activation

     

    1.1  Identify disruption as a potential COOP activation

    Responsibility

           

    1.1.1       Follows standard procedures in responding to a major disruption:

    • Detects major disruption(s) directly or via monitoring devices or alerts.
    • Assesses extent of major disruption. If emergency services are required:
      • Calls (9) 911 for emergency services and physical security.
      • Calls 3-3333 to arrange for Physical Plant facility services.
      • Evacuates self from the facility if the situation threatens physical safety. At no time should the physical safety of employees be jeopardized.
      • Directs building occupants according to the hazard-specific procedures outlined in the UW-Madison Police Emergency Procedures Guide.
    • Manages emergency shutdown of services if possible and necessary.
    • Contacts SEO Duty Manager with a preliminary situation report.

    Senior Systems and Network Control Center (SNCC) Staff Member on duty

     

    1.1.2       Assesses personnel safety and resource needs

    • Initiates accounting for all personnel who were at any affected DoIT sites during any major disruption jeopardizing human safety and notifies Human Resources.
    • Establishes contact with DoIT leadership and/or staff at other DoIT sites regarding activation of their building’s respective COOP.
    • Establishes alternate communication channel if needed
      • If MS Teams DoIT Operations channel is disrupted, initiates chat communication via OOB Google Chat Room at http://oob.wisc.edu
      • If NetID authentication is disrupted, logs in to MS Teams at https://teams.microsoft.com with the non-NetID account used in OOB Chat drills.
    • Follows SEO Vertical Escalation & Situation Manager procedure and engages responsible Situation Manager
      • Informs UW Police if UWPD is not already involved
      • Triggers the emergency SMS message to Core Services Directors
      • SNCC includes “Emergency: meet now” in CS Director Teams Channel
      • Note: Directors should program the SNCC number into their phone with Bypass Do Not Disturb mode so that it makes noise even if the phone is on silent.
    • Engages SEO On-Call Technologist(s) (per the Office365 calendar for DoIT SEO On-Call)

    SEO Duty Manager

     

     

    1.1.3       Consults with SEO Director, if possible.

    SEO Duty Manager and SEO Duty Technologist

     

    1.1.4       Notifies the Building Manager that a major disruption has occurred.

    SEO Duty Manager

    1.2  Form Executive Management Team

    Responsibility

     

    1.2.1       Calls for an assembly of Executive Management Team (EMT) members at DoIT Emergency Operations Center (EOC).

    • Core Services directors and Deputy CIO immediately convene a call, discuss the situation, and determine who is best suited to perform/assume the DoIT CIC role in the given situation.

    • DoIT CIC will notify CIO Exec team of the situation and let them know who is filling the DoIT CIC role.

    • For situations involving physical locations, and when safety permits, DoIT CIC will travel to the impacted location.
    • In the unlikely event that neither the Deputy CIO nor any CS Director is available, the DoIT CIC role will transfer to the ITSM Associate Director, who will need to communicate directly with the CIO Exec Mgmt team. The ITSM Associate Director will hand the DoIT CIC role back when a CS Director or Deputy CIO becomes available.

    • DoIT EOC location
      • may be virtual, via Secure MS Teams or Google room EOC Channel per the Out-of-Band (OOB) chat procedures, or via other unanimously-agreeable medium.
      • will be located at specific DoIT location, if secure
      • Alternate location: office suite at the recovery site.

    SEO Director, any Core Director, or designate

     

    1.2.2       Contacts the Building Manager at the recovery site to arrange appropriate access to the building and to report that action is being taken to respond to the major disruption.

    SEO Director with SNCC Manager

    1.3  Gather Information and Analyze Conditions

    Responsibility

     

    1.3.1       Activates damage assessment procedures

    SNCC Manager or SEO Director

     

    1.3.2       Leads team to assess impact to IT services

    SNCC Manager or SEO Director

    1.4  Establish COOP command

    Responsibility

     

    1.4.1       Notifies the Deputy CIO that the DoIT COOP Incident Commander (CIC) role may be needed.

    SEO Director or designate

     

    1.4.2       Establishes a physical or virtual command center

    DoIT COOP Incident Commander (CIC)

     

    1.4.3       Summons the Executive Management Team

    DoIT COOP Incident Commander (CIC)

     

    1.4.4       Ensures the engagement of the Communication Consultant.

    Executive Management Team (EMT)

     

    1.4.5       Activates Human Resources team to support staff and families.

    DoIT COOP Incident Commander (CIC)

     

    1.4.6       Notifies Insurance team lead and campus Risk Management.

    DoIT CFO or Financial Services

    1.5  Determine COOP activation Level

    Responsibility

     

    1.5.1       May make recommendations to the DoIT CIC on partial or full COOP activation

    SNCC Manager or SEO Director

     

    1.5.2       Activates COOP and determines level of COOP activation. See criteria for COOP activation level in Table 1 above. Levels of COOP activation include:

    »    Partial Activation: a scaled COOP activation that aligns with the scope of the continuity event and the impacts to DoIT essential IT services. Partial COOP activation should include activation of specific continuity personnel to complete their continuity-related responsibilities, though the full continuity team should receive notification of the partial activation. The DoIT Executive Management Team will determine if additional notifications to DoIT management or staff are necessary.

    »    Full Activation: includes notification and activation of all continuity personnel to complete responsibilities identified in operation of essential IT services, as well as notification of the DoIT Executive Management Team and communication with all non-continuity personnel of current actions and priorities.

    1.5.2.1   Full COOP activation for Class 1. Class 1 includes:

    »   Facility damage or loss of facility access

    »   Extensive or potential physical damage and/or danger, of extensive or medium duration

    »   Examples: fire, flooding, explosion, terrorist threats, severe weather, train derailment, pandemic, hazardous materials event, building fire resulting in significant damage

    1.5.2.2   Partial COOP activation for Class 2. Class 2 includes:

    »   Power outage, minimal physical damage or danger, widespread hardware or software attack from a computer virus or hacker, potentially extensive loss of multiple IT services, medium duration

    1.5.2.3   No COOP activation for Class 3; use the DoIT Problem Management process instead. Class 3 includes:

    »   Localized hardware or software attack from a computer virus or hacker, potential loss of IT services, medium or short duration

    DoIT COOP Incident Commander (CIC)

     

    1.5.3       Decides the situationally-appropriate medium for DoIT internal communications

    DoIT COOP Incident Commander (CIC)

     

    1.5.4       Coordinates with other occupants of DoIT buildings regarding activation of their respective COOP.

    SNCC Manager or SEO Director

    1.6  Form COOP teams

    Responsibility

     

    1.6.1       Activates the Incident Response teams appropriate for the disruption and ensures that teams are properly staffed, with consideration given for preparing/pipe-lining “next shift” staff and for staff nutrition

    Executive Management Team (EMT)

    1.6.2       Based on IT service impact of the disruption and COOP activation level, DoIT COOP Incident Commander (CIC) forms

    »   IT Service Recovery team

    »   Administrative team

     

     

    1.6.3       The Incident Management team will initiate response actions that are appropriate to the severity of the incident.
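
The class-to-level guidance in steps 1.5.2.1 through 1.5.2.3 amounts to a simple lookup. A minimal Python sketch follows, assuming a person has already judged which class the disruption falls into. The names `ActivationLevel`, `CLASS_TO_LEVEL`, and `recommend_activation` are hypothetical, and the result is only a recommendation, since the DoIT CIC determines the actual activation level.

```python
from enum import Enum


class ActivationLevel(Enum):
    """Activation outcomes named in step 1.5.2 (identifiers are hypothetical)."""
    FULL = "Full COOP activation"
    PARTIAL = "Partial COOP activation"
    NONE = "No COOP activation; use the DoIT Problem Management process"


# Mapping taken from steps 1.5.2.1 - 1.5.2.3 of the activation procedures.
CLASS_TO_LEVEL = {
    1: ActivationLevel.FULL,     # facility damage or access loss, extensive danger
    2: ActivationLevel.PARTIAL,  # power outage, widespread attack, multiple services lost
    3: ActivationLevel.NONE,     # localized attack, limited loss of IT services
}


def recommend_activation(disruption_class: int) -> ActivationLevel:
    """Look up the activation level the Plan associates with a disruption class."""
    try:
        return CLASS_TO_LEVEL[disruption_class]
    except KeyError:
        # The Plan defines only Classes 1-3; anything else is a caller error.
        raise ValueError(f"Unknown disruption class: {disruption_class!r}") from None


if __name__ == "__main__":
    for cls in (1, 2, 3):
        print(cls, "->", recommend_activation(cls).value)
```

Keeping the mapping in one table makes it easy to compare against Table 1 whenever the class definitions are revised.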

     

Phase 2.0 - COOP operation

This section discusses the necessary actions, parameters and considerations following the COOP activation phase.  


Figure: COOP Operation, Phase 2 flow

Duration

Continuity of operations covers the period from 12 hours after the incident up to 30 days, while essential IT functions are being restored and conducted.

Execution of essential IT functions

Once the Plan has been activated and all personnel have been notified of their roles and responsibilities, assigned staff will commence continuity operations to deliver essential functions. Essential IT services must be maintained with little to no interruption. IT services which have been considered essential in previous COOP activations include services listed in Table 4.

Table 4: Essential Staff Positions supporting Essential IT Services

Name

Definition

Help Desk

Help Desk services should expect increased volumes during COOP activation to support online learning tools, VPN, etc.

SNCC

SNCC should expect to be considered essential staff during COOP activation. SNCC provides communication between UWPD, technical staff, DoIT Management, and the general campus via the Outages page.

Critical Infrastructure/Life Safety

Critical infrastructure/Life Safety services are paramount  during COOP activation

Canvas and other LMS tools

Canvas and other LMS tools are essential during COOP activation to support the overall mission of the University.

Cybersecurity

Cybersecurity is mission critical to maintain the safety,  integrity and availability of University data during a COOP activation.

NetID Login authentication

NetID authentication protects UW–Madison IT infrastructure and data.

Email

Email is considered an essential IT service due to heightened communication needs during COOP activation.

Email Lists

Email Lists are considered an essential IT service due to heightened communication needs during COOP activation.

DoIT OnCall Rotations

Staff from all DoIT on-call rotations (in SE, ITSM, A&SE, Data Center, AIS, NS, US) could be required during a COOP activation.

UW-Madison Police Department (UWPD)

UWPD notification should be considered essential for any DoIT COOP activation.

For reference, a spreadsheet of DoIT Essential Employees was developed in March 2020.

Alternate individuals have been identified as support to assist with maintaining essential IT services if the primary responsible party is unable to fulfill their duties. This support information is stored in WiscIT.

Execution of other IT services

Each DoIT group will continue operations to the extent possible at its designated continuity facility. DoIT groups that can perform their work remotely should implement appropriate telework procedures.

Communications

The DoIT Communications Director and Communications team will continue to manage the activities outlined in Phase 1.0 - COOP activation.

COOP Operation Procedure Detail

Table 5: COOP Phase 2.0 - operation procedures detail

COOP PHASE 2.0 - Operations

 

2.1  Commence continuity operations by assigned staff to deliver essential functions

Responsibility

 

2.1.1       Contact team members & log contact

»   Contact team members and maintain COOP Log of contacts made, in Appendix H.

»   Set response operation periods and objectives (typically 12-24 hours but potentially less depending on the scale of the event). Set objectives at the outset of each operational period.

Incident Response team leaders

2.2  Prepare teams

Responsibility

 

2.2.1       Prepare teams:

»   Contact team members and maintain a log of contacts made.

»   Assemble teams at suitable locations.

»   Report situation.

»   Review team responsibilities and functions.

»   Prioritize and direct next actions.

Incident Response team leaders

2.3  Communicate with key stakeholders

Responsibility

 

2.3.1       Communicate with stakeholders

Incident Response team leaders

2.4  Establish operations from continuity facility

Responsibility

 

2.4.1       Each DoIT group will continue operations to the extent possible at its designated continuity facility. DoIT groups that can perform their work remotely should implement appropriate telework procedures.

If continuity requires relocation to an identified continuity facility (see the DoIT Offsite Alternate Work Location Info) or other facility, the following systems and documents may need to be available to ensure continuity personnel can maintain communications and access essential records and information:

»   Access management

»   A local area network

»   Internal and external email and email archives

»   Both electronic and hard copy versions of essential records (stored off site)

Supervisors and managers will provide relocation information or dismissal instructions to non-continuity personnel at the guidance of the Incident Response team

DoIT groups

 

2.4.2       Notify staff of alternate facility

»   Notifications regarding the relocation to an alternate facility will be sent to all continuity personnel by the Communications Consultant.

»   Supervisors and managers will provide information to non-continuity personnel regarding relocation or possible long-term work shortages under the guidance of the Executive Management Team and Human Resources team.

 

 

2.4.3       Identify staff rotations

»   Alternate individuals have been identified to help maintain essential IT services if the primary responsible party is unable to fulfill their duties. This information can be found in WiscIT.

 

2.5  Continue to monitor major disruption impact on essential IT services

Responsibility

 

2.5.1       When disruption or threat has been eliminated, initiate RECONSTITUTION phase

DoIT CIC

 

Phase 3.0 - COOP Reconstitution

PHASE 3.0 - RECONSTITUTION

COOP Reconstitution - phase3 flow

Reconstitution is the process of terminating Plan operations and resuming all essential functions and other activities carried out by DoIT. Reconstitution operations may include:

  • Deactivating the continuity or alternate facility.
  • Returning equipment, records, and personnel to either the original or a replacement primary site.
  • Returning to normal operations.

Planning for reconstitution should begin as soon as continuity operations are activated. Reconstitution will commence as soon as the emergency incident concludes. Note that in certain cases the facility may sustain serious physical impacts, and there may be a significant delay for repairs. In this event, DoIT may reopen in a phased manner consistent with the ability to establish essential functions. The following section serves as a guide to prepare for reconstitution operations.

COOP reconstitution procedure detail

Table 6: COOP Phase 3.0 - Reconstitution Procedures

COOP PHASE 3.0 - Reconstitution

 

   3.1  Commence reconstitution phase

Responsibility

 

3.1.1       Coordinate teams to terminate COOP operations and begin reconstitution by sending notification to continuity personnel.

Incident Executive Leadership team and Incident Mgmt team

 

3.1.2       Appoint a reconstitution team

Incident Management team

3.2  Validate facility safety for staff return

Responsibility

 

3.2.1       Prior to re-entering any DoIT-occupied Building, the Executive Management Team will work collaboratively with the Facilities team and UW-Madison FPM to ensure that actions are completed appropriately, and that personnel safety is not at risk.

Incident Response team leaders

 

3.2.2       In the event the facility sustains severe physical impact, DoIT may reopen in a phased manner consistent with the ability to establish essential and prioritized IT services

 

3.3  Develop reconstitution plan

Responsibility

 

3.3.1       Reconstitution team develops a detailed move plan to ensure the orderly return to the normal operating facility or move to another operating facility.

Incident Response team leaders

 

3.3.2       Assign a team to handle final preparations at the site. This team should develop a checklist of areas to be inspected and verified before the move (e.g., space configurations, proper functioning of equipment and PCs, heating/cooling, electricity, and telephone and computer connectivity).

 

 

3.3.3       Conduct an assessment to determine if any validation tests are necessary.

 

 

3.3.4       Notify Personnel of Reconstitution

 

 

3.3.5       Follow procedures to ensure a timely and efficient transition of communications, direction and control, and the transfer of vital records and databases to the primary facility, adhering to data handling and security protocols per UW System procedure 1031.B for High Risk Data.

 

 

3.3.6       Arrange to have any necessary supplies and equipment moved from alternate site

 

3.4  Resume operations at permanent site.

Responsibility

 

3.4.1       Each DoIT group will continue operations.

DoIT groups

 

3.4.2       Develop a plan for returning the alternate space being used to its normal occupants.

  • Identify any work backlogs that may have developed. Develop a plan on how to address the backlogs

DoIT groups

3.5 After Action Report (AAR)

Responsibility

3.5.1       Assemble key participants to review COOP process and/or After Action Report (AAR) for process improvement

DoIT COOP Coordinator

Close COOP

 

 

After all services are restored, DoIT COOP Coordinator completes and reviews the following checklists for post-problem analysis and Continuous Improvement review:

  • Reconstitution process log
  • Deactivation checklist
  • After-Action worksheet

Phase 4.0 - COOP Readiness and Preparedness

PHASE 4.0 - Readiness and preparedness

DoIT will maintain a state of readiness through regular preparedness activities. These include:

  • maintaining operational documentation such as the Plan,
  • reviewing and updating the Plan annually and after major disruptions,
  • socializing continuity procedures among personnel,
  • training on and exercising the Plan regularly with personnel via Tabletop Exercises (TTX),
  • ensuring that IT service status and IT service component changes/inventories are maintained in the WiscIT Configuration Management Database (CMDB) as part of normal DoIT operations, and
  • having directors program the SNCC number into their phones with Bypass Do Not Disturb mode so that calls ring even if the phone is on silent.

Testing, Training and Exercises

DoIT staff’s understanding of and familiarity with COOP content and procedures make preparedness, response, and execution more efficient when maintaining essential IT service operations. For that reason, the Plan is exercised annually either

  • during an unplanned interruption to IT services, or
  • via a planned TTX with a specific scenario which involves a major disruption to IT services.

In each TTX, the DoIT COOP is validated and evaluated for possible improvements in COOP activation, responsible party COOP preparation, and COOP operation. Improvements are reported in the After Action Report (AAR) and, once approved by DoIT Management, are tracked in WiscIT via the WiscIT Continuous Improvement Registry (CIR).

In conjunction with UWPD Emergency Management, the DoIT COOP Coordinator determines the annual TTX scheduling and strives for a time when DoIT leaders and staff can participate or delegate.

TTX scenarios are chosen and developed based on DoIT COOP, Appendix F - Threat, Risk, and Vulnerability Analysis.

Emergency management training

DoIT staff who have a designated role during COOP activation may consider formal training to familiarize themselves with key emergency disruption concepts and principles, such as the Incident Command System (ICS). FEMA offers several free online training courses.

III. COOP Roles and Responsibilities

Understanding COOP roles and responsibilities

The DoIT COOP Incident Commander (CIC) and continuity operations are guided, informed and executed by teams. Details on roles and teams follow.

A. DoIT COOP Incident Commander (DoIT CIC) role

DoIT executives with formal Order of Succession responsibilities for the DoIT COOP Incident Commander (DoIT CIC) role are:

    1. Primary: Deputy Chief Information Officer
    2. Secondary: Core Services Director as determined during activation by the EMT
    3. Tertiary: ITSM Associate Director

In the unlikely event that neither the Deputy CIO nor any Core Services Director is available, the DoIT CIC role will transfer to the ITSM Associate Director, who will need to communicate directly with the CIO Executive Management Team. The ITSM Associate Director will hand the DoIT CIC role back when a Core Services Director or the Deputy CIO becomes available.
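The succession order above amounts to picking the first available role from an ordered list. The following is a minimal sketch of that logic only; the function name `resolve_cic` and the idea of passing in a set of available roles are illustrative assumptions, not part of the Plan or of any DoIT tooling. In practice the EMT selects which Core Services Director serves during activation.

```python
from typing import Optional

# Order of succession for the DoIT CIC role, as listed in this section.
CIC_SUCCESSION = [
    "Deputy Chief Information Officer",   # Primary
    "Core Services Director",             # Secondary (chosen by the EMT)
    "ITSM Associate Director",            # Tertiary
]


def resolve_cic(available: set[str]) -> Optional[str]:
    """Return the first role in the succession order that is available.

    `available` is a hypothetical input: the set of role titles whose
    holders can currently be reached. Returns None if nobody is available.
    """
    for role in CIC_SUCCESSION:
        if role in available:
            return role
    return None


if __name__ == "__main__":
    # Deputy CIO unreachable: the role falls to a Core Services Director.
    print(resolve_cic({"Core Services Director", "ITSM Associate Director"}))
```

Because the list is re-evaluated on every call, the same function also models the hand-back described above: once a higher-ranked role reappears in the available set, it is selected again.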

 DoIT CIC specific responsibilities include:

» Internal policy level decisions

» Coordination of communications with UW-Madison campus officials and other executive authorities

» Coordination of public information and media contacts

» Communication with Human Resources director to ensure employees are notified and provided with any necessary resources/assistance

      • DoIT facilities closures or relocation to alternate site(s)
      • Actions that may result in the loss of intellectual or proprietary capital
      • Fiscal authorizations

Delegation of Authority and Order of Succession for DoIT COOP Incident Commander
    • Handing off DoIT CIC

    • If the DoIT CIC role needs to be transferred to another person (e.g., because of fatigue or a conflict):

      • The new DoIT CIC is determined by the EMT, using the same process as above.

      • The former DoIT CIC briefs the CIO Executive Management Team and the new DoIT CIC on the status of the situation and next steps, and makes explicit the hand-off of the DoIT CIC role and the expected duration of the transfer.

    The emergency Delegations of Authority and formal Order of Succession are effective during disruptive emergency conditions when the DoIT Continuity of Operations Plan is activated, and in other emergency situations that disrupt normal IT service operations. Delegations of Authority and Order of Succession take effect when normal channels of direction are disrupted and terminate when these channels have resumed. To the extent circumstances permit, officials must document the beginning and end dates of their authority under this activation.

    Reference DoIT COOP Incident Commander Responsibilities - Detail

    B. DoIT Executive Management Team (DoIT EMT)

    The Executive Management Team (EMT) and the DoIT COOP Incident Commander work closely to assess the situation, activate the plan, and implement operational continuity.

    Executive Management Team (EMT) Role - During emergency events, the EMT coordinates intercessions and decision-making according to processes, offers strategic guidance for their respective groups and informs and advises the DoIT COOP Incident Commander. 

    EMT members: The following DoIT executives comprise the Executive Management Team:
    »      Communications Director
    »      Human Resources Director
    »      Chief Financial Officer (CFO)
    »      Chief Information Security Officer (CISO)
    »      IT Service Recovery Team Director
    »      Chief Technical Officer (CTO) as Campus Liaison
    »      Administrative Team Director
    »      Core Services Directors

    Administrative Team Director and IT Service Recovery Team Director report to the DoIT CIC directly, who in turn reports to the Executive Management Team. IT service restoration/recovery is prioritized by the pre-established Service Tier (COOP tier) order in Appendix D, and as appropriate to the emergency, with any necessary adjustments provided by the Executive Management Team.

    EMT responsibilities: A principal responsibility of the EMT is to keep managers focused on the right priorities under emergency disruption conditions. EMT responsibilities include:
    »      Manage initial response
    »      Gather information and analyze conditions related to DoIT and throughout the University
    »      Allocate and direct distribution of resources to accomplish the purposes of DoIT's COOP Plan
    »      Request needed resources from available outside sources if internal resources are not available
    »      Direct incident response teams
    »      Prioritize and resolve issues
    »      Assure communications with stakeholders
    »      Approve final plan and final policy decisions
     
    Incident Response Functions of the EMT include:
    »      Coordinate response with campus emergency management procedures
    »      Initiate measures for the safety of lives and property
    »      Establish a physical or virtual emergency operations center
    »      Initiate damage assessment
    »      Determine extent that the DoIT Plan will be used and declare a full or partial COOP activation if warranted
    »      Address issues escalated through Incident Response team leads
    »      Report status to University administration

    Reference: Executive Management Team details

    C. Incident Response team 

    The Incident Response team is headed by the DoIT CIC and includes personnel representing areas of DoIT that have critical COOP execution responsibilities. Administrative and IT service recovery teams report directly to the DoIT CIC, who reports to the Executive Management Team. Services will be recovered as rapidly as possible according to pre-established priorities, with adjustments made by EMT as necessary for the situation.

    Incident Response teams are responsible for the execution of the COOP Plan during an emergency disruption. Incident Response team members are expected to gather the required resources needed to implement IT service continuation and recovery efforts. Incident Response teams are activated to respond to any emergency situation at a level based on the type and nature of the disruption.

    The Incident Response team comprises three teams bridging different functional areas:

      1. Executive Management Team (EMT)

      2. IT Service Recovery team

      IT Service Recovery Team Director role is filled by

      • Primary: SEO Director
      • Secondary: DoIT Core Services Director as determined by situation

      IT Service Recovery team members include

      • IT Service Recovery Team Director
      • Application Services team
      • Facility team
      • Hardware team
      • Help Desk team
      • Infrastructure team
      • Network team
      • Operations team
      • Cybersecurity team
      • Physical Security team

      IT Service Recovery team resources

      DoIT maintains several lists, extractions from WiscIT, and other resources offsite on Box for IT Service Recovery efforts.

      3. Administrative team

      • Administrative Team Director
      • Finance team
      • Insurance team
      • Logistics team
      • Procurement team 

      D. Communications team

      • Communications team director is

        • Primary: DoIT Communications Director
        • Secondary: Asst Communications Director
      • Responsibilities include

        • Provide primary communications link between DoIT and customers
        • Liaise with UW Communications
        • Provide media relations
        • Prepare messages to be disseminated by Help Desk team
      • Additional Information: Communications checklist

      E. Human Resources team

      • Human Resources team director is
        • Primary: DoIT HR Director
        • Secondary: UW HR Director
      • Responsibilities include

        • Support employees and families
        • Coordinate with University Legal Services and University Benefits Services
        • Expedite hiring of temporary staff as needed
      • Additional Information: Human Resources checklist

      F. Administrative team

      The Administrative team directs the administrative, logistical, and financial aspects of IT service recovery. It is directed by the Administrative Team Director and comprises four teams (Finance, Insurance, Logistics, and Procurement).

      • The Administrative Team Director role is
        • Primary: DoIT Chief Financial Officer
        • Secondary: DoIT Accounting Manager
      • Responsibilities include

        • Support employees and families
        • Coordinate with University Legal Services and University Benefits Services
        • Expedite hiring of temporary staff as needed
      • Additional Information: Administrative Team Director checklist

        •  Administrative team members
            • Administrative Team Director
            • Finance team
            • Insurance team
            • Logistics team
            • Procurement team

      G. Assigned and Unassigned Personnel Responsibilities

      COOP-assigned personnel's responsibilities

      Any staff member with assigned COOP responsibilities who is on site during a COOP activation event should first be concerned for their own safety. If they are on site and safe, COOP-assigned staff are to perform responsibilities as delegated by primary managers once the COOP has been activated. Following COOP activation, assigned staff members should report to the emergency operations location as determined by the Executive Management Team.

      COOP-assigned staff will assist as requested in activities such as handling all essential services, notifying all staff members of the situation, and contacting any unassigned staff who are requested to provide assistance. They are directly responsible for using the COOP roles and responsibilities documentation, including DoIT personnel and unit mappings to the COOP structure, as well as the supporting DoIT organizational chart and telephone list. They are also directly responsible for providing assistance as directed by higher-ranking staff members or alternates as designated by the Plan.

      Unassigned personnel's responsibilities

      Unassigned personnel should be prepared to support the assigned staff, if required. During non-duty hours they should remain at home and check for information or instructions every morning by contacting the DoIT Human Resources team or through other designated communication channels. If they are called in to work, staff should report to their designated location and perform any assigned duties that are appropriate for their skills and training.

       

        IV. Appendices

        These Appendices contain COOP information requiring more frequent review and updates.

        A. Orders of succession

        Orders of succession are formal, sequential listings of positions (rather than specific names of individuals) that identify who is authorized to assume a particular leadership or management role when the incumbent dies, resigns, is unavailable, debilitated, or is otherwise unable to perform the functions and duties of his or her position. Orders of succession provide for the orderly and predefined assumption of offices during an emergency situation requiring COOP activation. They allow for the continued operation of DoIT and its essential services and enable a rapid response.

        In the absence or unavailability of the Deputy CIO, the following order of succession will determine fulfillment of the DoIT COOP Incident Commander (CIC) role:

        • Secondary: a DoIT Core Services director as determined by EMT
        • Tertiary: DoIT ITSM Associate Director

        The DoIT COOP Incident Commander (CIC) or alternate will contact the other positions in the order listed above until they reach a person who is available to serve as the Deputy CIO.

        The DoIT CIC has the authority to re-delegate the associated functions and activities of the role to the next in the line of succession if the successor is better equipped to serve as the DoIT CIC based on the major disruption’s nature. If the Deputy CIO is unable to serve as the DoIT CIC, the successor has the full authority that the Deputy CIO would have, which includes carrying out the functions of DoIT and the ability to allocate the entire Division's fiscal, personnel, and equipment resources.

        The Deputy CIO reserves the right to place limitations on the successor's authority. They are as follows:

        • The Deputy CIO places no limitations.

        All authorities previously delegated will be terminated once the DoIT CIC appoints another successor, the Deputy CIO is able to return to his/her position, or the CIO and Vice Provost for Information Technology assigns another successor. Table 7 outlines other DoIT orders of succession.

         


        Table 7: Orders of Succession

        Area | Position Title | Name | Successor(s)
        Administration | Chief Information Officer | Lois Brooks | David Pagenkopf
        Administration | Chief Financial Officer | Sara Hart McGuinnis | Colleen Reilly
        Administration | Deputy Chief Executive Officer | David Pagenkopf |
        Advancement | Communications Director | Mary Evansen | Kyle Henderson
        Human Resources | Human Resources Director | Adam Fermanich | Holly Weber

        B. Delegation of authority

        Delegations of authority ensure the orderly and predetermined transition of responsibilities. They are related to, but distinct from, orders of succession. A written delegation of authority provides recipients with legal authorization to act on behalf of organizational officials and to execute specific duties within the organization. Delegations of authority are triggered when the holder of the position is not readily accessible due to travel, communications outages, or sickness, or is otherwise unable to fulfill their responsibilities. In some cases, limitations such as financial restrictions may be applied.

        This document provides the legal authority for officials to make key policy decisions during a COOP activation.

        In the event of a major emergency disruption of IT services, the primary and alternate Executive Management Team members and the primary and alternate Incident Response team members listed are delegated the necessary authority to carry out their essential services. This delegation of authority ensures:

        • Continued operations of the Division and its critical services
        • Rapid response to any emergency situation requiring COOP implementation

        These predetermined delegations of authority will take effect when normal channels of direction are disrupted, and will terminate when normal channels are resumed.

         

        Delegation of authority - Deputy CIO

        The successor to the Deputy CIO (as determined by the Orders of Succession listed above) has the full authority that the Deputy CIO would have, which includes carrying out the functions of DoIT and the ability to allocate the entire Division's fiscal, personnel and equipment resources.

        The Deputy CIO reserves the right to place limitations on the successor relating to Division expenditures.

        In the event that the Deputy CIO or other key personnel are unavailable to serve as the Deputy CIO, the order of succession specified in the section above will be adhered to until a higher successor becomes available. At this point, all of the authorities previously delegated will be terminated.

        If the successor is expected to become unavailable, or someone else in the line of succession is better equipped to serve as the Deputy CIO based on the nature of the major problem, the successor has the authority to re-delegate the functions and activities associated with being the Deputy CIO to that person. Table 8 summarizes key authorities within DoIT.

        Table 8: Delegation of Authority

        Authority | Position Holding Authority | Acting Agent | Limitations
        Human Resources Policy Changes | Human Resources Director | Adam Fermanich | Not specified
        Purchase Requisitions / Spending Authority | Chief Financial Officer | Sarah Hart McGuinnis | Purchasing department rules, including use, amount limitation, etc.
        Execution of Contractual Agreements | Chief Financial Officer | Sarah Hart McGuinnis | Standard operational contracting procedures
        Legal Counsel | UW-Madison Office of Legal Affairs, Vice Chancellor for Legal Affairs | Nancy K. Lynch | Not specified
        Travel Authorizations | DoIT Purchasing & Financial Services | Kirsten Mastalir | Established travel restrictions and costs
        Leave Authorizations | DoIT Directors | | Standard contractual limitations
        Communication with UW-Madison Campus Officials and Executive Authorities | Director of Communications | Mary Evansen | Not specified

        C. Orders of Notification

        Initial notification procedures are critical to personnel safety and orderly response to a major disruption. 

        The Senior Systems and Network Control Center (SNCC) staff on duty, in consultation with the Systems Engineering & Operations (SEO) Duty Manager and the SEO Director, are responsible for following data center notification procedures as described above in Section 1.1.1 of this document.

        Escalation for COOP activation

        1. SNCC staff notify
          • appropriate technologists of disruption
          • Situation Manager and leads of affected IT services
          • SEO Duty Manager of disruption
        2. SEO Duty Manager
          • Engages UWPD if not already informed
          • Engages relevant SEO Oncall staff
          • Engages the SEO director
          • Notifies the Building Manager, if appropriate
        3. SEO Director
          • Notifies Executive Management Team (EMT) of possible need for COOP activation
        4. Executive Management Team (EMT)
          • Determines whether DoIT COOP activation is warranted and, if so, the extent of the COOP to be used, by declaring a full or partial COOP activation

        If an immediate danger exists within a DoIT facility, a member of the Incident Management team will immediately notify the Building Manager that an incident has occurred and action is being taken to respond to the incident. Building occupants should be directed according to hazard specific procedures outlined in the UW-Madison Police Emergency Procedures Guide. Note that certain incidents (e.g. a fire or active shooter event) may require immediate response/direction.

        Staff notification

        During COOP activation, DoIT personnel who can assist in resolving a major disruption should be notified via normal Escalation procedures, as per the DoIT Operational Framework Section 4 – Incident Management, section 4.4: Guidance for Escalating Incidents to Management. See extracts below for escalation guidelines.

        SNCC operators will post notifications to the Outage pages, if available and applicable to the disruption, and involve appropriate technologists, the Situation Manager and SEO Duty Manager.

        UW executive notifications

        During COOP activation, the DoIT CIC, in conjunction with DoIT Communications, approves all communication with UW-Madison campus officials and other executive authorities.

        Public communication

        DoIT Communications coordinates with UW-Madison Communications on all general public communication about a major disruption via websites, email, and social media.

         

        Extracts from DoIT operational framework, section 4.

        Table 4-2: Guidelines for escalation to management

        Elapsed time | Priority One incident
        10 minutes | Systems & Network Control Center Operators handle the technical escalation appropriately: to Network Operational Engineering (OpEng) if a network component is involved or to the relevant support technologist. If the service is not supported after hours or there are no after-hours technologists available, the Operators escalate the incident directly to the Situation Manager.
        2 hours | SNCC staff ensures that the Situation Manager for the primary service experiencing the outage and situation managers for all dependent services are contacted. For example, an LDAP authentication outage will cause a WiscMail Impact one outage. The WiscMail Situation Manager should be contacted along with the LDAP Situation Manager. Refer to Identify the Situation Manager in the Situation Manager Standards (Section 4.7) portion of the Operational Framework for more information about contacting the appropriate Situation Manager.

             …(see KB 11040 for details)…
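The elapsed-time guidelines in Table 4-2 can be read as thresholds after which an escalation step becomes due. The sketch below illustrates that reading only. It assumes each step is due once the stated time has elapsed, it covers only the two rows shown in this extract (the full table in KB 11040 may contain more), and the names `ESCALATION_STEPS` and `due_escalations` are hypothetical rather than part of any DoIT system.

```python
from datetime import timedelta

# Thresholds from the two rows of Table 4-2 shown in this extract.
# Action text is abbreviated; the full wording is in the table itself.
ESCALATION_STEPS = [
    (timedelta(minutes=10),
     "Escalate to Network OpEng or the relevant support technologist"),
    (timedelta(hours=2),
     "Confirm Situation Managers for primary and dependent services are contacted"),
]


def due_escalations(elapsed: timedelta) -> list[str]:
    """Return every escalation action whose threshold has been reached."""
    return [action for threshold, action in ESCALATION_STEPS
            if elapsed >= threshold]


if __name__ == "__main__":
    # 30 minutes into a Priority One incident, only the first step is due.
    for action in due_escalations(timedelta(minutes=30)):
        print(action)
```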

         

        4.5.1. Priority One case escalation

          • If a network component is not involved in the service outage, SNCC staff should contact the appropriate service support technologist by using contact information defined in the Configuration Management Database (CMDB)
          • If a network component is involved in the service outage and SNCC staff cannot contact Op Engineering, contact appropriate network engineer and the Op Engineering manager
          • Priority 1 case escalation always requires direct interactive contact (voice, chat, email with reply) with the area to which the case is being escalated. Do not assume that others can work on a case until you confirm it interactively

        D. Service tiers (COOP tiers)

        The DoIT Service Tier system, often called the DoIT COOP Tier system, is a method to categorize services managed by or hosted in DoIT data centers which informs service recovery priorities. DoIT leadership and governance groups are responsible for identifying and evaluating DoIT service recovery priorities.

        Individual DoIT service providers are responsible for identifying needed capability improvements in consultation with their customers, and for requesting funding to support those improvements. DoIT leadership and governance groups are responsible for evaluating proposed improvements and for establishing appropriate prioritization, planning, and funding mechanisms to implement approved improvements.

        Service tier definitions

        The following service tier definitions are based on best practices established by the National Infrastructure Protection Plan (NIPP) and other guidelines, and are supported by the State of Wisconsin, University of Wisconsin System, and University of Wisconsin-Madison.

        Table 9 - Service tier definitions

        Tier | Impact | Definition
        0 | Foundational; Critical Infrastructure | Infrastructure required for the operation of essential and other infrastructure services.
        1 | Health; Safety; Law and Order | Services whose loss endangers health, safety, or orderly response to campus operations. Includes essential, customer-facing services (1A) whose loss for >8 business hours represents a significant adverse impact.
        2 | Enterprise Operations; Severe Impact | University-wide services whose loss severely impacts campus operations. Departmental services whose loss prevents a specific department from operating.
        3 | Enterprise Operations; Departmental Operations; Moderate Impact | University-wide services whose loss affects campus operations. Departmental services whose loss severely affects a specific department's operations.
        4 | Departmental Operations; Minimal Impact | University-wide services whose loss has minimal impact on campus operations. Departmental services whose loss affects a specific department's operations.
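Since the tiers inform service recovery priorities, a default recovery sequence can be derived by sorting services on their tier number, lowest first. The following is a minimal sketch of that idea. The service names are invented placeholders, the `Service` class and `recovery_order` function are illustrative assumptions, and any real sequence remains subject to adjustment by the Executive Management Team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    tier: int  # COOP tier: 0 (foundational) through 4 (minimal impact)


def recovery_order(services: list[Service]) -> list[Service]:
    """Sort services by tier, lowest tier first.

    sorted() is stable, so services within the same tier keep the order
    in which they were supplied.
    """
    return sorted(services, key=lambda s: s.tier)


if __name__ == "__main__":
    # Placeholder service names, for illustration only.
    services = [
        Service("example-departmental-app", 4),
        Service("example-authentication", 0),
        Service("example-email", 1),
    ]
    for s in recovery_order(services):
        print(s.tier, s.name)
```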

        E. Campus emergency contacts

        • Report all injuries and damage to the University Police (UWPD) by calling 911.
          • Be prepared to give the following:

            • Your name
            • Building Name
            • Type of injury or damage
            • The location of injured person(s) or building damage
            • Room number you are calling from
        • When reporting an emergency
          • Stay on the line with the dispatcher
          • Provide the address, location and a description of the emergency
          • Provide the phone number at your location
          • Provide a thorough description of the incident to ensure that appropriate resources are dispatched
          • Building issues - first call UWPD
            • For building-related problems, you must first call the UW Police Department at 264-COPS (2677); they will contact the University of Wisconsin Physical Plant.
        • Campus and community contact telephones - Table 10
          • Table 10: Emergency Campus and Community Telephone List

            | Entity/Organization | Telephone Number |
            | --- | --- |
            | Fire/Police/Ambulance | 9-1-1 |
            | University of Wisconsin Police Department Non-Emergency | 608-264-COPS (2677) |
            | Dane County Public Health Department | 608-266-4225 / 608-255-2345 |
            | University of Wisconsin Physical Plant Customer Service – Tradesmen | 608-263-3333 |
            | University of Wisconsin Safety Department | 608-265-5000 |
            | Madison Gas & Electric | 608-251-8300 |
            | University of Wisconsin Environment, Health and Safety Department for follow-up after biological, chemical, or radioactive hazards | 608-265-5600 |
            | University of Wisconsin Health Services (UHS) | 608-265-5600 |
            | Poison Control | 800-222-1222 |
            | National Weather Service Milwaukee/Sullivan Office | 262-965-2074 |
            | Dane County 24-hour Mental Health Crisis Line | 608-280-2600 |
            | Physical Plant Customer Service | 608-263-3333 |
            | DoIT Building Facility Manager | 608-262-6149 / c 608-517-6732 |

           

        F. Threat, risk, and vulnerability analysis


        This section has three parts.

        Parts 2 and 3 are provided by Ed Lawson of UWPD in UW-Madison Hazard/Consequence Analysis, Aug 3, 2021.

        • 1. Disruption and response
          • Many IT service disruptions are managed during the ordinary course of business by the DoIT Problem Management framework and accompanying procedures. Typical outages are handled through DoIT's Help Desk procedures for creating incident and problem tickets/reports, generating notifications, and escalating severity levels as necessary.

            IT service problems that potentially jeopardize staff safety, physical access to or the structural integrity of the data center facility, or the physical integrity of equipment, as well as major power outages of extended duration, may warrant activating the Plan.

            DoIT management may escalate or de-escalate the response to any disruption depending on evolving circumstances.

             

            Table 11: Example Emergency Disruptions and COOP Activation Levels

            | Problem Type | Examples | Severity (danger/damage/duration) | Response |
            | --- | --- | --- | --- |
            | Facility damage | Fire; Flooding; Explosion | Major Disruption, resulting in: extensive physical damage and/or danger; extensive duration | Full DoIT Continuity of Operations plan activation; Emergency operations center activation |
            | Facility access | Terrorist threat; Severe weather; Train derailment; Pandemic | Major Disruption, resulting in: potentially extensive damage and/or danger; medium duration | Campus Emergency Management activation; Full DoIT Continuity of Operations plan alert; Emergency communications activation |
            | Power outage | Utility outage | Moderate Disruption, resulting in: minimal physical damage or danger; medium duration | DoIT Continuity of Operations plan standby alert; Possible Full or Partial Activation of DoIT Continuity of Operations plan; Emergency communications activation |
            | Hardware or software attack | Computer virus; Computer hacker | Minor Disruption, resulting in: potentially extensive loss of services; medium duration | DoIT Problem Management procedures; Possible Partial Activation of DoIT Continuity of Operations plan |
            | Hardware or software error | Human error; Software malfunction; Hardware malfunction | Minor Disruption, resulting in: potentially extensive loss of services; medium duration | DoIT Problem Management procedures |
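The mapping in Table 11 can be sketched as a simple lookup, for example as the starting point of a triage script. This is a minimal illustration, not part of the Plan: the names `ActivationGuidance`, `TABLE_11`, and `guidance_for` are hypothetical, the response text is abbreviated from the table, and because DoIT management may escalate or de-escalate any response, the result is only a default suggestion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivationGuidance:
    severity: str      # "Major", "Moderate" or "Minor", as in Table 11
    responses: tuple   # response entries, abbreviated from Table 11


# Keys are the Table 11 problem types, lower-cased for lookup.
TABLE_11 = {
    "facility damage": ActivationGuidance(
        "Major",
        ("Full COOP activation", "Emergency operations center activation"),
    ),
    "facility access": ActivationGuidance(
        "Major",
        ("Campus Emergency Management activation", "Full COOP alert",
         "Emergency communications activation"),
    ),
    "power outage": ActivationGuidance(
        "Moderate",
        ("COOP standby alert", "Possible full or partial COOP activation",
         "Emergency communications activation"),
    ),
    "hardware or software attack": ActivationGuidance(
        "Minor",
        ("DoIT Problem Management procedures",
         "Possible partial COOP activation"),
    ),
    "hardware or software error": ActivationGuidance(
        "Minor",
        ("DoIT Problem Management procedures",),
    ),
}


def guidance_for(problem_type: str) -> ActivationGuidance:
    """Return the default Table 11 guidance for a problem type."""
    key = problem_type.strip().lower()
    if key not in TABLE_11:
        raise ValueError(f"No Table 11 entry for problem type: {problem_type!r}")
    return TABLE_11[key]


if __name__ == "__main__":
    g = guidance_for("Power outage")
    print(g.severity, "-", "; ".join(g.responses))
```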

             

             

        • 2. Hazard chart with risk scores
          • Below is a list of hazards of concern to UW-Madison campus departments known to cause emergencies and disasters on university campuses. The hazards have basic risk scores. These hazards are compiled based on:  

            • City of Madison Hazard Assessment
            • UW-Madison Emergency Management campus concerns development and feedback for campus departments
            • Definitions and information from Ready.gov website and other government sources
          • Other hazards not listed below may also cause emergency situations on campus. The EOP may be activated with or without an Emergency Operations Center (EOC) for any of the following situations.
          • The risk score is a function of likelihood and consequence: the risk of any hazard depends on the chance that it will occur (likelihood) and the impact of an occurrence (consequence). Risk Score = Likelihood x Consequence

        Table 12: Risk Scores

        | Hazard | Likelihood | Consequence | Risk Score |
        | --- | --- | --- | --- |
        | Active Threat/Public Places | Rare | Major | Low |
        | Civil Disorder / Mass Protest | Moderate | Moderate | Low |
        | Extreme Cold | Moderate | Moderate | Low |
        | Extreme Heat | Moderate | Moderate | Low |
        | Explosion | Rare | Major | Low |
        | Facility Failure | Moderate | Moderate | Low |
        | Flood/Rain Event | Moderate | Major | Medium |
        | Fire | Rare | Moderate | Low |
        | Health Emergency/Epidemic/Pandemic | Rare | Moderate | Low |
        | Hazmat Event | Moderate | Major | Medium |
        | IT Outage | Moderate | Moderate | Low |
        | Lightning | Rare | Major | Low |
        | Mass Casualty | Rare | Major | Low |
        | Power Outage | Moderate | Moderate | Low |
        | Supply Interruption | Rare | Moderate | Low |
        | Tornado | Rare | Major | Low |

        Chart terms 

        Likelihood – The chance that the hazard might occur. Likelihood options for this chart are:

        • Very high – Can be expected to occur, 75% or greater chance of occurring.
        • High – Will probably occur, 50% – 75% chance of occurring.
        • Moderate – May happen, 25% – 50% chance of occurring.
        • Rare – May only occur in exceptional circumstances.

        Reference: https://ehealthresearch.no/files/documents/Appendix-Definitions.pdf 
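As a rough illustration, the likelihood bands above can be turned into a classification function. Two points are assumptions rather than part of the chart: a probability that falls exactly on a shared boundary (50% or 75%) is assigned to the higher band, and anything below 25% is treated as "Rare", which the chart defines only qualitatively. The function name is hypothetical.

```python
def likelihood_category(probability: float) -> str:
    """Map a probability in [0, 1] to a likelihood band from the chart terms.

    Assumptions (not stated in the chart): boundary values go to the
    higher band, and anything under 25% counts as "Rare".
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= 0.75:
        return "Very high"
    if probability >= 0.50:
        return "High"
    if probability >= 0.25:
        return "Moderate"
    return "Rare"


if __name__ == "__main__":
    for p in (0.05, 0.30, 0.50, 0.80):
        print(f"{p:.0%}: {likelihood_category(p)}")
```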

          • Consequence – The impact of an occurrence. Consequence options for this chart are:

            • Extreme – Death and life-threatening injuries, outside resources required, extreme operational disruption and widespread property damage. 
            • Major – Life threatening injuries, institutional resources required, major operational disruptions and severe property damage. 
            • Moderate – Moderate to life impacting injuries, additional resources required, significant delays in work performance and substantial property damage. 
            • Minor – Minor injuries, moderate impact on resources, modest delays and moderate property damage. 
            • Insignificant – No injuries, no impact on resources, no delays in work performance and minor property damage. 

            Reference: https://institute.acs.org/lab-safety/hazard-assessment/fundamentals/risk-assessment.html 


            Table 13: Risk score – likelihood x consequence 

             

            | LIKELIHOOD \ CONSEQUENCE | Insignificant (1) | Minor (2) | Moderate (3) | Major (4) | Extreme (5) |
            | --- | --- | --- | --- | --- | --- |
            | Rare (1) | Low | Low | Low | Low | Low |
            | Moderate (2) | Low | Low | Low | Medium | Medium |
            | High (3) | Low | Low | Medium | Medium | Medium |
            | Very High (4) | Low | Medium | Medium | High | High |

            Reference: www.healthcaregovernance.org.au/docs/risk-matrix.doc  
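The Table 13 matrix can be encoded directly as a lookup, which makes it easy to check a hazard's score. The sketch below is illustrative only; the function names are hypothetical. The `banded_product` helper reflects an observation that is not stated in the source: multiplying the numeric likelihood and consequence values and applying cut-offs at 8 and 16 happens to reproduce every cell of the matrix.

```python
LIKELIHOOD = {"Rare": 1, "Moderate": 2, "High": 3, "Very High": 4}
CONSEQUENCE = {"Insignificant": 1, "Minor": 2, "Moderate": 3, "Major": 4, "Extreme": 5}

# Rows copied from Table 13; columns run Insignificant .. Extreme.
RISK_MATRIX = {
    "Rare":      ["Low", "Low", "Low", "Low", "Low"],
    "Moderate":  ["Low", "Low", "Low", "Medium", "Medium"],
    "High":      ["Low", "Low", "Medium", "Medium", "Medium"],
    "Very High": ["Low", "Medium", "Medium", "High", "High"],
}


def risk_score(likelihood: str, consequence: str) -> str:
    """Look up the Table 13 risk score for a likelihood/consequence pair."""
    row = RISK_MATRIX[likelihood.strip().title()]
    column = CONSEQUENCE[consequence.strip().title()] - 1
    return row[column]


def banded_product(likelihood: str, consequence: str) -> str:
    """Likelihood x Consequence with cut-offs inferred from Table 13."""
    product = (LIKELIHOOD[likelihood.strip().title()]
               * CONSEQUENCE[consequence.strip().title()])
    if product >= 16:
        return "High"
    if product >= 8:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    # Two rows from Table 12, checked against the matrix.
    print("Flood/Rain Event:", risk_score("Moderate", "Major"))
    print("Tornado:", risk_score("Rare", "Major"))
```

Keeping the explicit matrix as the source of truth, with the product rule as a cross-check, avoids silently changing scores if the matrix is later revised.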

             

        • 3. UWPD definitions
          • Active threat/attacks in public places - Active shooter: Individuals using firearms to cause mass casualties. Individuals using a vehicle to cause mass casualties. Individuals using homemade bombs to cause mass casualties. Other methods of mass attacks may include knives, fires, drones or other weapons, civil disturbances, bomb threats, building takeovers, hostage situations, workplace violence, acts of terrorism, physical violence against self or others. 

            Civil disorder/mass protest - Civil disorder, also known as civil disturbance or civil unrest, is an activity arising from a mass act of civil disobedience (such as a demonstration, riot, or strike) in which the participants become hostile toward authority, and authorities have difficulty maintaining public safety and order among the disorderly crowd. 

            Explosions - Explosive devices can be carried by cars and people and are easily detonated from remote locations or by suicide bombers. Explosions can also occur by accident, for example from gas line leaks, boiler explosions, or unmonitored research projects. 

            Extreme cold - Exposure to extreme cold can cause frostbite and/or hypothermia and can become life threatening. When the effects of extreme cold are combined with wind, a person’s body can lose heat quickly. The wind chill temperature index is based on the rate of heat loss from exposed skin by combined effects of wind and cold. As the wind increases, heat is carried away from the body at an accelerated rate, driving down the body temperature. When the wind chill is lower than -20°F, frostbite can occur in approximately 30 minutes. As the wind chill decreases, frostbite can set in more quickly. About 50% of extreme cold injuries happen to people over 60 years old. More than 75% of injuries happen to males. 

            Extreme heat - Extreme heat is a period of high heat and humidity with heat indices above 90°F. Extreme heat causes your body to work extra hard to maintain a normal temperature, which can first lead to heat exhaustion, then to heat stroke, and eventually can lead to death without intervention. In fact, extreme heat is responsible for the highest number of annual deaths among all weather-related hazards. Heat indices over 105°F are dangerous and over 125°F are extremely dangerous. 

            Facility failure - Elevator malfunction, utility service interruption (gas, water, electricity, sewer, heat). Building collapse, bleacher collapse, window breakage, damage to buildings.  

            Fire - Fire spreads fast: a small flame can become a major fire in less than 30 seconds. It takes only minutes for thick black smoke to fill a house or for it to be engulfed in flames. The heat is more threatening than the flames. Room temperatures in a fire can be 100°F at floor level and rise to 600°F at eye level. Inhaling this super-hot air will scorch your lungs and melt clothes to your skin. Fire starts bright but quickly produces black smoke and complete darkness. Smoke and toxic gases kill more people than flames do. Fire produces poisonous gases that make you disoriented and drowsy. 

            Flood/rain event - Flooding is a temporary overflow of water onto normally dry land. Floods are the most common natural disaster in the United States. Failing to evacuate flooded areas or entering flood waters can lead to injury or death. Six inches of flood water will reach the bottoms of most cars, causing loss of control of the vehicle or possibly stalling. A foot of water will float a car. Two feet of water can carry away most vehicles. Flooding may result from rain, snow, coastal storms, storm surges and overflows of dams and other water systems. Floods can develop slowly or quickly. Flash floods can come with no warning, cause outages, disrupt transportation, damage buildings and create landslides. 

            Hazardous material incidents - Hazardous materials can include explosives, flammable and combustible substances, poisons and radioactive materials. Emergencies can happen during research, production, storage, transportation, use or disposal. You are at risk when chemicals are used unsafely or released in harmful amounts where you live, work or play. 

            Health emergency/epidemic/pandemic - At times, health emergencies are limited to individual countries; in these cases, they are identified as epidemics. A pandemic is a disease outbreak that spans several countries and affects a large number of people. Pandemics are most often caused by viruses that can easily spread from person to person. It is hard to predict when or where the next new pandemic will emerge.   

            IT outage – Damage or denial of service to communication, radio, television, computer or other University, or affiliated technologies; cyber invasion. 

            Mass casualty - Refers to any large number of casualties produced in a relatively short period of time, usually as the result of a single incident such as an aircraft accident, hurricane, flood, earthquake, or a violent attack that exceeds local logistic support capabilities. 

            Power outages - A power outage is when the electrical power goes out unexpectedly. A power outage may: Disrupt communications, water and transportation, close retail businesses, grocery stores, gas stations, ATMs, banks and other services, cause food spoilage and water contamination, and prevent use of medical devices. Extended power outages may impact the whole community and the economy. 

            Severe winter snow/ice storm – Severe winter storms produce excessive amounts of winter precipitation (i.e., snow, freezing rain, sleet) that cause dangerous impacts. Winter storms create a higher risk of car accidents, hypothermia, frostbite, carbon monoxide poisoning, and heart attacks from overexertion. Winter storms including blizzards can bring extreme cold, freezing rain, snow, ice and high winds. 

            Thunderstorms/tornadoes/lightning - Severe thunderstorms are dangerous storms that include lightning and can produce extreme winds, tornadoes, and flash flooding. A severe thunderstorm occurs when wind speeds from the thunderstorm exceed 58 mph. In Wisconsin, this is the most common type of severe weather from a thunderstorm. Thunderstorms can also produce tornadoes. Tornadoes are violently rotating columns of air that extend from a thunderstorm to the ground. Tornadoes can destroy buildings, flip cars, and create deadly flying debris. A tornado can happen anytime and anywhere, bringing intense winds (over 200 miles per hour). Severe lightning can also accompany thunderstorms. Lightning is a leading cause of injury and death from weather-related hazards. Although most lightning victims survive, people struck by lightning often report a variety of long-term, debilitating symptoms. 

        G. DoIT locations


        DoIT locations include office spaces and data center spaces.

        Office spaces

        »      Computer Sciences building, 1210 West Dayton Street
        »      Rust-Schreiner building, 111 N. Orchard St. and 115 N. Orchard St.
        »      Middleton Building, 1305 Linden St
        »      634 W Main St for ESB staff 
        »      660 West Washington Ave
        »      700 Regent St leased office space
        »      2109 S. Stoughton Road, Digital Printing and Publishing Services (DPPS) facility
        »      WARF offices, formerly AIMS

        Data Center and Equipment spaces

        »      One Neck IT Solutions, 5515 Nobel Dr, Fitchburg, WI 53711 (leased space)
        »      Medical Foundation Centennial Building (MFCB), 1685 Highland Dr
        »      Wisconsin Alumni Research Foundation building, 610 Walnut St (end of use expected after Feb 2024)
        »      DOA-DET leased space, 5830 Femrite Dr, Madison 53718
        »      Materials Distribution Services (MDS), 1061 Thousand Oaks Trail, Verona 
        »      Network nodes at 14 locations and network supernodes at three locations:

        • 1210 West Dayton St, Computer Science & Statistics building
        • 432 North Murray St - East Campus Mall
        • 1675 Observatory Dr, Animal Sciences building

        Building Maps
        Maps and floor plans are located in each building’s Occupant Emergency Plan (OEP), which is kept in two places:
        »      with the Building Manager, and 
        »      with UW Police Department Emergency Management, in Smartsheet

        H. Critical websites


        This list of critical websites is also located in the Box folder entitled “wiscit”.

         

        Critical and important websites - COVID

        Last updated 3/18/2020 CONFIDENTIAL

        Table 14: Critical and important websites

        NOTE: [per the DoIT Web Platform Services team] we strongly recommend that all critical information be shared on the sites marked as "Critical Infrastructure - priority 1." These sites are architected to be resilient to high volumes of traffic. Other sites will receive second or third priority for staff focus and may have more risk of outages.

        | Name | Category | Category defined by | Who manages (site admin / owner) | Where hosted | Action Items | Recommendations/notes |
        | --- | --- | --- | --- | --- | --- | --- |
        | alerts.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | https://kb.wisc.edu/helpdesk/internal/page.php?id=44007 |
        | Admin for alerts.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | https://kb.wisc.edu/helpdesk/internal/page.php?id=44007 |
        | Canvas | Critical Infrastructure - priority 1 | UMark/UComm | Academic Technology |  |  |  |
        | covid19.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | https://kb.wisc.edu/helpdesk/internal/page.php?id=44007 |
        | instructionalcontinuity.wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | Academic Technology | WiscWeb | WPS moving to HA | This is not on a high availability server. Initial site launched on WiscWeb to be available for Chancellor announcement. Full WPS staff is available to mitigate any potential issues as needed. WPS, AT, and UMark now establishing plans to move to a high-availability server. |
        | kb.wisc.edu | Critical Infrastructure - priority 1 | DoIT Comm | Web Platforms/Services | KB high-availability servers | No Action Needed |  |
        | uhs.wisc.edu/coronavirus-2019 | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | REDIRECT in place to covid19.wisc.edu | No Action Needed | UHS site content has been added to main COVID site and redirect in place. |
        | wisc.edu | Critical Infrastructure - priority 1 | UMark/UComm | University Marketing | AWS UMark Account, high-availability | No Action Needed | https://kb.wisc.edu/helpdesk/internal/page.php?id=44007 |
        | doitnet.doit.wisc.edu | Non-critical; high visibility - priority 2 |  | DoIT Comm | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations. |
        | housing.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Housing | WiscWeb | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations. |
        | it.wisc.edu | Non-critical; high visibility - priority 2 |  | DoIT Comm | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations. |
        | my.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | Web Platforms/Services | MyUW high-availability servers | No Action Needed |  |
        | outages.doit.wisc.edu | Non-critical; high visibility - priority 2 | DoIT Comm | DoIT Service Management | Dedicated VM (but not High Availability) | Check with owner | This is on a dedicated VM, but is not on a high-availability server. Consult with application admins about service expectations. |
        | parent.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Marketing / Campus & Visitor Relations | UMark single server: HENRY | No Action Needed | May need to contact UMark to find out if they were planning on making this HA. |
        | registrar.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | Registrar's Office | WiscWeb | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations. |
        | uhs.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Marketing / UHS | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations. |
        | uwpd.wisc.edu | Non-critical; high visibility - priority 2 | UMark/UComm | University Marketing / UWPD | Web Hosting | Check with owner | This is not on a high-availability server. Consult with application admins about service expectations. |
        | admissions.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - spring admissions |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | advising.wisc.edu | Non-critical; high visibility - priority 3 | Sent in TechNews |  | Web Hosting | Monitor - check with owner if concerns emerge |  |
        | apps.admissions.wisc.edu/visitbuck | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | Web Hosting | Monitor - check with owner if concerns emerge |  |
        | asm.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  |  | Monitor - check with owner if concerns emerge |  |
        | businessservices.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | compliance.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | doso.students.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | ehs.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | hr.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | Web Hosting | Monitor - check with owner if concerns emerge |  |
        | internationaltravel.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | iss.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | mynetid.wisc.edu/activate | Non-critical; high visibility - priority 3 | WPS proactive monitoring - spring admissions | DoIT IAM | DoIT - team unspecified | Monitor - check with owner if concerns emerge | Required step for applicants and accepted admits |
        | ohr.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | Web Hosting | Monitor - check with owner if concerns emerge |  |
        | studyabroad.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | WiscWeb | Monitor - check with owner if concerns emerge |  |
        | union.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | Web Hosting | Monitor - check with owner if concerns emerge |  |
        | vote.wisc.edu | Non-critical; high visibility - priority 3 | WPS proactive monitoring - linked from covid19.wisc.edu |  | DoIT - Linux team | Monitor - check with owner if concerns emerge |  |
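An inventory like Table 14 lends itself to a quick check for sites that are both high priority and not on high-availability hosting. The sketch below is illustrative only: it uses a handful of rows from the table, the `high_availability` flag is read off the hosting and notes columns as of the table's 3/18/2020 snapshot, and the names `Site`, `SITES`, and `at_risk` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    priority: int            # 1, 2 or 3, from the Category column
    hosting: str             # "Where hosted" column
    high_availability: bool  # inferred from the hosting and notes columns


# A few rows from Table 14 (not the full list).
SITES = [
    Site("alerts.wisc.edu", 1, "AWS UMark Account", True),
    Site("instructionalcontinuity.wisc.edu", 1, "WiscWeb", False),
    Site("kb.wisc.edu", 1, "KB high-availability servers", True),
    Site("my.wisc.edu", 2, "MyUW high-availability servers", True),
    Site("outages.doit.wisc.edu", 2, "Dedicated VM", False),
    Site("housing.wisc.edu", 2, "WiscWeb", False),
]


def at_risk(sites, max_priority=1):
    """Names of sites at or above the given priority that lack HA hosting."""
    return sorted(
        s.name for s in sites
        if s.priority <= max_priority and not s.high_availability
    )


if __name__ == "__main__":
    print("Priority 1 without HA:", at_risk(SITES))
    print("Priority 1-2 without HA:", at_risk(SITES, max_priority=2))
```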

         

        V. Checklists and record templates

        Checklists referenced in the DoIT Continuity of Operations (COOP) Plan:

        Record templates

        A. COOP After-Action worksheet


         

        Use this worksheet as a guide for completing the After-Action Report. Modify or add steps as the situation warrants.

        • Record date, time and person to whom each task or decision has been assigned.
        • If applicable, decide on a Recovery Time Objective (RTO).
        • Show all the steps that are actually taken during this process.
        • Record when each task is completed. Return the worksheet to the CIC or designee when the After-Action Report has been completed. It will be used to revise and update this worksheet.

        COOP After-Action worksheet

         

        LEAD: ___________________________
        DATE: ___________________________
        TIME: ___________________________

         Assigned

          Tasks / Decisions

         Completed

         

        Assemble group to review effectiveness of COOP plan and operations

        • What did we do well?
        • What can we do better?
        • What feedback did we get from people and agencies we worked with?

         

         

        Distribute AAR to Executive Management Team

         

         

        If needed, schedule training for staff

         

         

        Revise COOP Manual to incorporate changes

         

         

        Write After-Action Report (AAR) of COOP activation

        • Event synopsis
        • Duration of activation
        • Goals and objectives
        • Decisions and actions taken by CIC

         

        B. COOP deactivation checklist

        COOP deactivation process checklist


        In coordination with the Emergency Management Team, the Incident Commander will determine when to deactivate the COOP plan.

        Date/Time

        Decisions/Tasks

        Assigned to

         

        Develop a communications plan to inform all appropriate parties of the COOP deactivation.

        »     Assigned personnel: Inform them that their responsibilities in the COOP emergency have ended. Tell them where to report for their next on-duty assignment.

        »     Non-assigned personnel: Notify them that the emergency no longer exists. Provide instructions for resumption of normal operations.

        »     Vice Chancellor

        »     Building Manager or contact of alternate space being used

        »     Building Manager or contact of restored site or new primary site

        »     IT

        »     University Communications

        »     Other

         

         

        Inform the reconstitution manager that the department will be returning to normal operations. The reconstitution manager will work with the reconstitution team to facilitate the process.

         

         

        If necessary, assign a relocation team to begin the process of moving from the alternate site to the permanent facility.

         

        C. COOP log of contacts made


         

        Log of contacts made worksheet

        Log of contacts made

        Group = ____________ 

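        | Name | Reason for call (notification, update work schedule, etc.) | Contact Method (phone #, email address, etc.) | Date/Time Contacted | Date/Time Call Returned |
        | --- | --- | --- | --- | --- |
        |  |  |  |  |  |
        |  |  |  |  |  |
        |  |  |  |  |  |
        |  |  |  |  |  |
        |  |  |  |  |  |
        |  |  |  |  |  |
        |  |  |  |  |  |
        |  |  |  |  |  |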

        D. Reconstitution process log


        Use this log to track tasks performed during the reconstitution of DoIT staff and facilities after the initial response to a severe incident.

        Reconstitution process log

        Date/Time

        Tasks

        Recovery Time Objective (RTO)

        Assigned to

         

        Appoint a reconstitution team.

         

         

         

        Reconstitution team develops a detailed plan to ensure the orderly return move to the normal operating facility or to another operating facility.

         

         

         

        Assign a team to handle final preparations at the primary site. This team should develop a checklist of areas to be inspected and verified before the move (e.g., space configurations, proper functioning of equipment and PCs, heating/cooling, electricity, and telephone and computer connectivity).

         

         

         

        Conduct an assessment to determine if any validation tests are necessary.

         

         

         

        Develop a plan for returning the alternate space being used to its normal occupants.

         

         

         

        Identify any work backlogs that may have developed. Develop a plan on how to address the backlogs.

         

         

         

        Follow procedures to ensure a timely and efficient transition of communications, direction and control, and the transfer of vital records and databases, to the primary facility.

         

         

         

        Arrange to have supplies and equipment moved from the alternate site.

         

         

         

        Set up the permanent site.

         

         

         

        E. COOP family readiness resources


        Here are some useful resources available online, which can be provided to staff:

        Executive Management Team

        Details for the roles on the Executive Management Team.

        F. DoIT COOP Incident Commander (CIC) detail

        DoIT COOP Incident Commander (CIC) detail – also see KB 98185

        G. Executive Management Team checklist detail

        Executive Management Team checklist detail - also see KB 97141

        Tasks | Point of contact | Date/Time completed

        Phase 1: Initial notification and response

        Executive Management Team initiates recovery procedures

        Establishes the DoIT Emergency Operations Center (EOC)

         

         

        Declares an incident, if warranted

         

         

        Determines where recovery will take place.

         

         

        Initiates communications procedures.

         

         

        Activates Incident Response teams and ensures teams are properly staffed. See DoIT COOP, Section 3 - Roles and Responsibilities.

         

         

        Phase 2: Assessment and activation

        Executive Management Team receives the following reports from IT service recovery teams, and provides guidance and decisions as needed.

        Initial damage assessments

         

         

        • Facility team reports initial damage assessment of primary data center structure and utility equipment within two hours of the declared disaster, and estimates time to complete detailed assessment.

         

         

        • Hardware, Network, and Infrastructure teams report initial damage assessment of network and infrastructure equipment within two hours of the declared disaster.

         

         

        • Hardware team estimates time to complete detailed assessment.

         

         

        Detailed damage assessments

         

         

        • Facility team reports structural damage to data centers and gives detailed utility equipment assessment.

         

         

        • Hardware team gives detailed network/infrastructure equipment assessment.

         

         

         Recovery estimates

         

         

        • Facility and Hardware teams report estimates for replacement and repair time including ordering, shipping, installing and testing equipment.

         

         

        • Network, Infrastructure and Application Services teams report estimates for service recovery time.

         

         

        Additional comments:

         

         

         

        Phase 3: Recovery

        Executive Management Team manages ongoing prioritization of service recovery and resolution of issues.

        Additional comments:

         

         

         

         

        H. Communications consultant checklist detail

        Communications consultant checklist detail - also see KB 97318

        Tasks | Point of contact | Date/Time completed

        Phase 1: Initial notification and response

        Communications consultant team manages communications with campus:

        • Prepares initial communications and directives to DoIT staff, and informs staff if a decision is made to recover at the recovery site.

         

         

        • Prepares initial communications to customers and other stakeholders, in conjunction with the Executive Management Team and the Help Desk and Operations teams as appropriate.

         

         

        • Prepares an initial statement for distribution to the press.

         

         

        • Coordinates DoIT communications with communications activities by the campus crisis management team.

        • Coordinates incident response communications with University management and state agencies.

         

         

        Communications consultant team and Help Desk team coordinate messages for inbound and outbound contacts.

        Phase 2: Assessment and activation

        Communications consultant team manages ongoing communications with campus.

        Operations team, Help Desk team and Communications consultant team:

        • Assess options for handling high call volumes and coordinate a solution.

         

         

        • Coordinate ongoing communications to various stakeholders.

         

         

        Additional comments:

         

         

         

        Phase 3: Recovery

        Communications consultant team manages ongoing communications with campus:

        • Receives updates from IT Service Recovery Director and Executive Management Team regarding progress and customer-related issues.

        • Prepares official messages to be disseminated by the Help Desk team.

        • Based on any expressed customer concerns, proposes adjustments to service recovery priorities for Executive Management Team consideration.

        Additional comments:

         

         

         

        Ongoing

        Communications Consultant is part of the Executive Management Team. See Executive Management Team checklist for details.

         

        I. Human Resources checklist detail

        Human Resources checklist detail - also see KB 97144

        Human Resources checklist detail table by COOP phase

        Tasks | Point of contact | Date/Time completed

        Phase 1: Initial notification and response

        Incident Response team leaders prepare teams

        • Contact team members

          • Look up individual technical staff phone numbers in WiscIT, the IT Service Management tool: open the "Searches" drop-down menu, choose Global, set the Association field to UserInfo, select AllUsers, select Lookup WiscIT Team Member, choose the team from the drop-down menu, then display the contact for the person you need; or

          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the DoIT DR cloud folder on Box.

        • Complete Log of contacts made

         

         

        • Assemble teams at suitable locations.

         

         

        • Report situation

         

        • Review team responsibilities and functions.

         

        • Prioritize and direct next actions

        Human Resources team coordinates communication with staff, families, campus and state HR offices, and employee representatives.

        • Distributes readiness and family readiness resources to staff.

        Additional comments:

        Phase 2: Assessment and activation

        Human Resources team coordinates safety and support for staff.

        • Requests status of personnel on duty from managers and supervisors and informs Executive Management Team

         

         

        • Notifies families of staff working in the area.

         

         

        • Refers to HR emergency information forms as needed:
          • Primary Location: Stored on the DoIT LAN
          • Backup: Stored on flash drives

        • Establishes a family assistance center and a telephone hotline for staff and families.

        • Provides benefits information to staff and their families.

        • Refers family members to Employee Assistance and other counseling and support services.

        • Determines required staffing levels in consultation with Executive Management Team
          • Assists Incident Response team leaders with reassignment of staff member duties for recovery operations.

        • Notifies campus officials and works with the Department of Administration - Division of Personnel Management

        • Assists in coordinating work schedules and monitoring team member working hours, to protect employees' health and meet other needs.

        Additional comments:

         

         

         

        Phase 3: Recovery

        Continued: Human Resources team coordinates safety and support for staff.

        Additional comments:

         

         

         

        Ongoing

        Human Resources team lead is part of the Executive Management Team. See Executive Management Team checklist for details.

         

        IT Service Recovery team

        Details for the roles on the IT Service Recovery team.

        J. IT Service Recovery Director checklist detail

        IT Service Recovery Director checklist detail - also see KB 97328

        Tasks | Point of contact | Date/Time completed

        Phase 1: Initial notification and response

        IT Service Recovery Director oversees the SNCC and SEO initial response to the incident.

        SNCC follows standard procedures

        • Senior Systems and Network Control Center (SNCC) Staff Member on duty responds to the incident:
          • Detects incident directly or via monitoring device
          • Assesses extent of incident
          • If emergency services are required:
            • Calls (9) 911 for emergency services and physical security
            • Calls 3-3333 to arrange for Physical Plant facility services
          • Evacuates the facility if the situation threatens physical safety. At no time should the physical safety of employees be jeopardized.
          • Manages emergency shutdown of services if possible
          • Contacts SEO Duty Manager with a situation report

         

         

         

         

        • SEO Duty Manager and SEO Duty Technologist consult with SEO Director, if possible.

        SEO follows procedures

        • SNCC Manager or SEO Director activates crisis management procedures:
          • Assembles Executive Management Team members at emergency operations center
            • Emergency Operations Center will be located at the DoIT CIO office, if secure.
            • Alternate location: office suite at the recovery site
          • Contacts Building Manager at recovery site for appropriate access to the building
            • DoIT Walnut Street Data Center is the primary recovery site for DoIT Dayton Street Data Center
            • DoIT Dayton Street Data Center is the primary recovery site for DoIT Walnut Street Data Center
          • Activates damage assessment procedures (see Phase 2 - Assessment and activation).
          • Activates Human Resources team to support staff and families.
          • Notifies Insurance team lead.
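The reciprocal pairing of data centers listed above can be expressed as a small lookup, so that the recovery site for an affected facility is unambiguous at activation time. This is a minimal sketch under stated assumptions: the names `RECOVERY_SITE` and `recovery_site_for` are hypothetical and are not part of any DoIT system.

```python
# Illustrative sketch only; names are hypothetical.
# Each data center is the primary recovery site for the other.
RECOVERY_SITE = {
    "DoIT Dayton Street Data Center": "DoIT Walnut Street Data Center",
    "DoIT Walnut Street Data Center": "DoIT Dayton Street Data Center",
}


def recovery_site_for(affected_site: str) -> str:
    """Return the primary recovery site for the affected data center."""
    try:
        return RECOVERY_SITE[affected_site]
    except KeyError:
        # Fail loudly rather than guess a site that the Plan does not name.
        raise ValueError(
            f"No recovery site defined for: {affected_site!r}"
        ) from None
```

Raising an error for an unlisted facility is a deliberate choice in this sketch: the Plan names only these two sites, so any other input should be escalated rather than resolved automatically.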

        IT Service Recovery Director ensures orderly communications among IT service recovery teams throughout the recovery process.

        Facility team instructions

        • Manages communications with emergency services, Physical Plant and other facility recovery personnel.

        Phase 2: Assessment and activation

        IT Service Recovery Director identifies possible security risks to be assessed by the Security Consultant.

        IT Service Recovery Director oversees Service Recovery teams during initial damage assessment process:

        Facility team instructions

        Secures the facility for assessment by emergency services.

        • Works with the following groups to ensure physical safety and building security, prior to further damage assessment by DoIT personnel:
          • University police and other emergency personnel, in coordination with University Response Plan
          • Facility Building Manager
          • University Risk Management and Physical Plant personnel
        • Plans adjustments to facility access needs and restrictions during damage assessment with the Security Consultant
          • Assesses damage to door access control system, and converts to lock-and-key if necessary.

         

         

        Conducts preliminary damage assessment of primary data center structure and utility equipment.

        • Works with the following team leads and support personnel to review and coordinate damage assessment process:
          • Primary data center Building Manager
          • University Risk Management and Physical Plant personnel
          • Hardware, Network and Infrastructure teams
          • Logistics team
          • Insurance teams
          • Vendor hardware maintenance contractors

        Makes a preliminary estimate of replacement and repair time including ordering, shipping, installation, and testing. 

        Avoids further damage to the site or equipment and ensures Risk Management has noted damage to equipment before it is moved.

        Reports initial damage assessment findings to Executive Management Team within two hours of the declared disaster, and estimates time to complete detailed assessment.
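The two-hour reporting window above is simple date arithmetic. The sketch below shows one way to compute the due time and check whether a report is overdue. It assumes the window starts at the moment the disaster is declared; the names `INITIAL_ASSESSMENT_WINDOW`, `initial_report_due` and `is_overdue` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the two-hour window starts when the disaster is declared.
INITIAL_ASSESSMENT_WINDOW = timedelta(hours=2)


def initial_report_due(declared_at: datetime) -> datetime:
    """Return the time by which the initial damage assessment is due."""
    return declared_at + INITIAL_ASSESSMENT_WINDOW


def is_overdue(declared_at: datetime, now: datetime) -> bool:
    """Return True if the initial report deadline has already passed."""
    return now > initial_report_due(declared_at)
```

For example, a disaster declared at 09:00 gives a due time of 11:00 on the same day.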

        Security consultant / team instructions

        Assesses facility access needs and restrictions in consultation with Facility team:

        • Determines appropriate access to facility during damage assessment.
        • Provides a list of personnel to be admitted to the facility:
          • Submits list to UW Police Department if controls will be managed via human intervention.
          • Submits list to Facility team if controls will be managed via electronic access control system.

        Responds to urgent security incidents, as necessary.

        Facilitates IT security risk assessments on issues identified by the IT Service Recovery Team Director.

        Hardware, Network and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment.

        Hardware/network/infrastructure instructions

        • Coordinate damage assessment with facility Building Manager, University Risk Management and vendor personnel as needed.
          • Avoid further damage to the site or utility equipment. Ensure Risk Management has noted damage to equipment before it is moved.
          • Note: Utility equipment is a Facility team responsibility; supplies and furniture are a Logistics team responsibility.

        • Use hardware damage checklists generated from the Configuration Management Database (CMDB) and the Network Change Management System (CMS).

        • Report initial damage assessment findings to Executive Management Team within two hours of the declared disaster.

        • Hardware team estimates time to complete detailed assessment.

        IT Service Recovery Director oversees service recovery teams during detailed damage assessment process:

        Facility team instructions

        Facility team conducts detailed assessment of primary data center structure and utility equipment:

        • Uses facility damage checklists generated from the Configuration Management Database (CMDB) and the Facility Planning & Management facility database.
        • Avoids further damage to the site(s) and utility equipment. Ensures Risk Management has noted damage to equipment before it is moved.
        • Assesses structural damage to data centers.
        • Conducts assessment of individual equipment components to determine which items will be used or salvaged. Consults with Procurement team as necessary.
        • Estimates replacement and repair time including ordering, shipping, installing and testing equipment.
        • Tags all usable equipment.
        • Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.
        • Reports findings to Executive Management Team.
        • Sends detailed damage assessment lists to Procurement team.

        Hardware team instructions

        Hardware team conducts detailed assessment of network and infrastructure equipment:

        • Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary.
        • Estimates replacement and repair time including ordering, shipping, installing and testing equipment.
        • Tags all usable equipment.
        • Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.
        • Reports findings to Executive Management Team
        • Sends detailed damage assessment lists to Procurement team.

        Security team instructions

        Security Consultant:

        • Assesses facility access needs and restrictions in consultation with Facility team:
          • Determines appropriate access to facility during damage assessment.
          • Provides a list of personnel to be admitted to the facility:
            • Submits list to UW Police Department if controls will be managed via human intervention.
            • Submits list to Facility team if controls will be managed via electronic access control system.
        • Responds to urgent security incidents, as necessary.
        • Facilitates IT security risk assessments on issues identified by the IT Service Recovery Team Director.

        Network/infrastructure/application services teams instructions

        • Network, Infrastructure and Application Services teams estimate service recovery time and report findings to Executive Management Team.

        Operations/Help Desk instructions

        • Operations team, Help Desk team and communications consultant team:
          • Assess options for handling high call volumes and coordinate a solution.
          • Coordinate ongoing communications to various stakeholders.

        Additional comments:

         

         

         

        Phase 3: Recovery

        IT Service Recovery Director oversees service recovery teams during the recovery process:

        Facility team instructions

        Facility team prepares the recovery site:

        • Ensures adequate protection for equipment at the site before beginning any structural modifications.
        • Contacts Physical Plant to begin any necessary modifications.
        • Informs primary data center and recovery site Building Managers when work will be started.
        • Ensures that the recovery site is suitable for basic enterprise hosting prior to the delivery of equipment:
          • Verifies baseline levels of power, cooling, security, and structural safety
        • Coordinates ongoing facility access needs and restrictions during recovery with the security consultant.

        Security team instructions

        Security consultant:

        • Oversees ongoing implementation of facility access requirements in consultation with Facility team:
          • Determines any needed adjustments to facility access during recovery of services.
          • Provides updated list of personnel to be admitted to the facility to either the UW Police Department or the Facility team.
          • Ensures security controls are intact throughout the service recovery process.
        • Investigates IT security incidents that caused the disaster or are preventing the recovery of systems and services.
        • Works with IT service recovery teams to recover centralized security tools and resources.
          • Assists with the implementation of security controls (e.g., Tripwire, antivirus).
        • Recovers network security configurations (e.g., CSA).
        • Provides authorization services for applications as needed.

        IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:


        • Executive Management Team adjusts priorities for service recovery.
        • Network team
          • Manages priorities for network recovery as directed by the Executive Management Team.
          • Manages recovery of network components (switches, routers, network and security configurations) according to detailed service recovery plans. This documentation is maintained separately, and may include:
            • Current state business continuity capabilities
            • Planned/funded enhancements, and gaps for remediation consideration
            • Disaster recovery management and communications
            • Detailed recovery site strategy
            • Detailed hardware recovery plan
            • Switch, router and configuration recovery support
          • Identifies any hardware missing from pre-established service components hosted at the recovery site.
            • Submits requests for necessary hardware to Hardware team for expedited provisioning.
          • Coordinates with vendors for recovery of circuits and remote site connectivity.
          • Performs integration testing with Infrastructure and Application Services teams.
          • Works with security consultant to test recovered systems for security vulnerabilities.
          • Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the communications consultant for appropriate dissemination.
          • Identifies and immediately escalates issues to Executive Management Team for decisions.

         Infrastructure team:

        • Manages priorities for service recovery as directed by Executive Management Team.
        • Recovers services according to detailed service recovery plans, which are maintained separately. Documentation may include:
          • Current state recovery capabilities
          • Planned/funded enhancements and gaps for remediation consideration
          • Disaster recovery management and communications plans
          • Detailed recovery site strategy
          • Detailed hardware recovery plan
          • Support for recovery of services and security controls
        • Validates and supports active infrastructure services hosted at the recovery site.
        • Identifies any hardware missing from pre-established service components hosted at the recovery site.
          • Submits requests for necessary hardware to Hardware team for expedited provisioning.
        • Recovers security controls in consultation with Security Consultant.
          • Tests recovered systems for security vulnerabilities.
        • Performs integration testing with Network team.
        • Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the Communications Consultant for appropriate dissemination.
        • Identifies and immediately escalates issues to Executive Management Team for decisions.

        Application Services team:

        • Manages priorities for service recovery as directed by Executive Management Team.
        • Recovers applications and data according to detailed recovery plans, which are maintained separately. Documentation may include:
          • Current state recovery capabilities
          • Planned/funded enhancements and gaps for remediation consideration
          • Disaster recovery management and communications plans
          • Detailed recovery site strategy
          • Detailed hardware recovery plan (refer to Infrastructure team plans as applicable)
          • Application and data recovery support
          • Output management recovery plan
        • Validates and supports active IT services and data hosted at the recovery site.
        • Identifies any hardware missing from pre-established service components hosted at the recovery site.
          • Submits requests for necessary hardware to Hardware team for expedited provisioning.
        • Performs integration testing in cooperation with the Network and Infrastructure teams.
        • Coordinates customer acceptance testing and troubleshooting for services and data.
        • Tests printing services recovery.
        • Tests recovered systems for security vulnerabilities in cooperation with the security consultant.
        • Coordinates other recovery activities with Network, Infrastructure and Hardware teams as needed.
        • Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the communications consultant for appropriate dissemination.
        • Identifies and immediately escalates issues to Executive Management Team for decisions.

        Hardware team:

        • Receives critical hardware requests from IT service recovery teams and expedites provisioning of replacement hardware.
          • Works closely with Procurement team on all hardware purchases.
        • Identifies and immediately escalates issues to Executive Management Team for decisions.

        Operations team:

        • Operates equipment.
        • Supports handling/redirection of incoming customer calls to the SNCC.
        • Identifies and immediately escalates issues to Executive Management Team for decisions.

        Help Desk team:

        • Provides official messages from Communications Consultant to customers.
        • Reports incident level and severity on Help Desk (DoIT) and UW-Madison IT Outage pages.

        IT service recovery teams restore all other levels of service:

        Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation.

        Hardware team oversees installation of hardware:

        • Communicates hardware requirements to Procurement team.
        • Contacts vendors for repair assistance. Oversees vendor diagnostics on equipment to ensure full reliability prior to installation at the recovery site.
        • Coordinates installation of hardware equipment at recovery site.
        • Works with Network, Infrastructure and Application Services teams to ensure all utility and access requirements are met and that cable connectors are correct.

        Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately.

        • Work with Operations team to return services to fully operational status.

        Operations team supports IT service recovery teams in restoring operating systems, applications and data.
        • Starts and operates recovered hardware.
        • Supports handling/redirection of incoming customer calls to the SNCC.
        • Operations team leader oversees startup according to priorities set by Executive Management Team:
          • Powerup, IML, and IPL procedures must be completed without error. Any problems are reported to the team leader and corrected by the vendor.

        Help Desk team:

        • Provides official messages from the communications consultant to customers.
        • Provides ongoing updates on incident level and severity on Help Desk (DoIT) and UW-Madison IT Outage pages.
        • Restores IT services for the DoIT Help Desk.
        • As needed, provides alternate Help Desk support according to Help Desk recovery plan, which is maintained separately.
          • Provides access to local version of Knowledge Base within Help Desk.
          • Activates ACD backup phone system or forwards phones.
          • Uses alternate instance or method of call tracking.
          • Uses alternate method to forward incident reports and service requests.

        Additional comments:

         

         

         

        Ongoing

        IT Service Recovery Director reports to and is part of the Executive Management Team. See Executive Management Team checklist for details.

         

        K. Application Services checklist detail

        Application Services checklist detail - also see KB 97623

         

        Date/Time completed | Tasks | Point of contact

        Phase 1: Initial notification and response

         

        • Contact team members
          • Look up individual technical staff phone numbers in WiscIT, the IT Service Management tool: open the "Searches" drop-down menu, choose Global, set the Association field to UserInfo, select AllUsers, select Lookup WiscIT Team Member, choose the team from the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the DoIT DR cloud folder on Box.
        • Complete Log of contacts made.

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

         

        Network, Infrastructure and Application Services teams estimate service recovery time and report findings to Executive Management Team.

         

         

        Additional comments:

         

         

        Phase 3: Recovery

        IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:

        Application Services team:

         

        Manages priorities for service recovery as directed by Executive Management Team.

         

         

        Recovers applications and data according to detailed recovery plans, which are maintained separately. Documentation may include:

        • Current state recovery capabilities
        • Planned/funded enhancements and gaps for remediation consideration
        • Disaster recovery management and communications plans
        • Detailed recovery site strategy
        • Detailed hardware recovery plan (refer to Infrastructure Team plans as applicable)
        • Application and data recovery support
        • Output management recovery plan

         

         

        Validates and supports active IT services and data hosted at the recovery site.

         

         

        Identifies any hardware missing from pre-established service components hosted at the recovery site.

        • Submits requests for necessary hardware to Hardware team for expedited provisioning.

         

         

        Performs integration testing in cooperation with the Network and Infrastructure teams.

         

         

        Coordinates customer acceptance testing and troubleshooting for services and data.

         

         

        Tests printing services recovery.

         

         

        Tests recovered systems for security vulnerabilities in cooperation with Security Consultant.

         

         

        Coordinates other recovery activities with Network, Infrastructure and Hardware teams as needed.

         

         

        Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the Communications Consultant for appropriate dissemination.

         

         

        Identifies and immediately escalates issues to Executive Management Team for decisions.

         

        IT service recovery teams restore all other levels of service

        Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation.

         

        Hardware team oversees installation of hardware:

        • Communicates hardware requirements to Procurement team.
        • Contacts vendors for repair assistance. Oversees vendor diagnostics on equipment to ensure full reliability prior to installation at the recovery site.
        • Coordinates installation of hardware equipment at recovery site.
        • Works with Network, Infrastructure and Application Services teams to ensure all utility and access requirements are met and that cable connectors are correct.

         

        Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately.

        Work with Operations team to return services to fully operational status.

         

        Operations team supports IT service recovery teams in restoring operating systems, applications and data.

        • Starts and operates recovered hardware.
        • Supports handling/redirection of incoming customer calls to the SNCC.
        • Operations team leader oversees startup according to priorities set by Executive Management Team:
          • Powerup, IML, and IPL procedures must be completed without error. Any problems are reported to the team leader and corrected by the vendor.

         

         

        Additional comments:

         

         

        L. Facility checklist detail

        Facility checklist detail - also see KB 97143

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial Notification and Response

        Incident response team leaders prepare teams

         
        • Contact team members
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: open the "Searches" drop-down menu, choose Global, choose UserInfo in the Association field, select AllUsers, select Lookup WiscIT Team Member, choose the team in the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the cloud DoIT DR folder on Box.
        • Complete Log of contacts made.
         
         

        Assemble teams at suitable locations.

         
         

        Report situation.

         
         

        Review team responsibilities and functions.

         
         

        Prioritize and direct next actions.

         

        Facility team manages communications with emergency services, Physical Plant and other facility recovery personnel.

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Facility team secures the facility for assessment by emergency services.

         

        Works with the following groups to ensure physical safety and building security, prior to further damage assessment by DoIT personnel:

        • University police and other emergency personnel, in coordination with University Response Plan
        • Facility Building Manager
        • University Risk Management and Physical Plant personnel
         
         

        Plans adjustments to facility access needs and restrictions during damage assessment with the Security Consultant.

        • Assesses damage to door access control system, and converts to lock-and-key if necessary.
         

        Facility team conducts preliminary damage assessment of primary data center structure and utility equipment.

         

        Works with the following team leads and support personnel to review and coordinate damage assessment process:

        • Primary data center Building Manager
        • University Risk Management and Physical Plant personnel
        • Hardware, Network and Infrastructure teams
        • Logistics team
        • Insurance teams
        • Vendor hardware maintenance contractors
         
         

        Makes a preliminary estimate of replacement and repair time including ordering, shipping, installation, and testing.

         
         

        Avoids further damage to the site or equipment and ensures Risk Management has noted damage to equipment before it is moved.

         
         

        Reports initial damage assessment findings to Executive Management Team within two hours of the declared disaster, and estimates time to complete detailed assessment.
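
The two-hour reporting window above can be tracked with simple date arithmetic. The following is a minimal sketch, assuming the window starts at the moment the disaster is declared; the function names and the sample times are hypothetical and are not part of the Plan.

```python
from datetime import datetime, timedelta

# Assumption: the two-hour window starts when the disaster is declared.
REPORTING_WINDOW = timedelta(hours=2)


def initial_report_deadline(declared_at: datetime) -> datetime:
    """Latest time the initial findings should reach the Executive Management Team."""
    return declared_at + REPORTING_WINDOW


def time_remaining(declared_at: datetime, now: datetime) -> timedelta:
    """Time left before the deadline, never less than zero."""
    return max(initial_report_deadline(declared_at) - now, timedelta(0))
```

For example, a disaster declared at 09:30 would give a reporting deadline of 11:30 on the same day.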

         

        Facility team conducts detailed assessment of primary data center structure and utility equipment

         

        Uses facility damage checklists generated from the Configuration Management Data Base (CMDB) and the Facility Planning & Management facility database.

         
         

        Avoids further damage to the site(s) and utility equipment. Ensures Risk Management has noted damage to equipment before it is moved.

         
         

        Assesses structural damage to data centers.

         
         

        Conducts assessment of individual equipment components to determine which items will be used or salvaged. Consults with Procurement team as necessary.

         
         

        Estimates replacement and repair time including ordering, shipping, installing and testing equipment.

         
         

        Tags all usable equipment.

         
         

        Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.

         
         

        Reports findings to Executive Management Team.

         
         

        Sends detailed damage assessment lists to Procurement team.

         
         

        Additional comments:

         

         

        Phase 3: Recovery

        Facility team prepares the recovery site

         

        Ensures adequate protection for equipment at the site before beginning any structural modifications.

         
         

        Contacts Physical Plant to begin any necessary modifications.

         
         

        Informs primary data center and recovery site Building Managers when work will be started.

         
         

        Ensures that the recovery site is suitable for basic enterprise hosting prior to the delivery of equipment:

         
         

        Verifies baseline levels of power, cooling, security, and structural safety.

         
         

        Coordinates ongoing facility access needs and restrictions during recovery with the Security Consultant.

         
         

        Additional comments:

         

        M. Hardware checklist detail

        Hardware checklist detail - also see KB 97320

         

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial Notification and Response

        Incident Response Team Leaders prepare teams

         

        • Contact team members
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: open the "Searches" drop-down menu, choose Global, choose UserInfo in the Association field, select AllUsers, select Lookup WiscIT Team Member, choose the team in the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the cloud DoIT DR folder on Box.
        • Complete Log of contacts made.

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

        Phase 2: Assessment and activation

        Hardware, Network and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment.

         

        Coordinate damage assessment with facility Building Manager, University Risk Management and vendor personnel as needed.

        • Avoid further damage to the site or utility equipment. Ensure Risk Management has noted damage to equipment before it is moved.
        • Note: Utility equipment is a Facility team responsibility; supplies and furniture are a Logistics team responsibility.

         

         

        Use hardware damage checklists generated from the Configuration Management Data Base (CMDB) and the Network Change Management System (CMS).

         

         

        Report initial damage assessment findings to Executive Management Team within two hours of the declared disaster.

         

         

        Hardware team estimates time to complete detailed assessment.

         

        Hardware team conducts detailed assessment of network and infrastructure equipment:

         

        Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary.

         

         

        Estimates replacement and repair time including ordering, shipping, installing and testing equipment.
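
The estimate above combines four stages per item: ordering, shipping, installing and testing. The following is a minimal sketch of that arithmetic. It assumes the stages for one item run one after another and that different items can be procured in parallel, so the overall estimate is the longest single item. The class, function and item names are hypothetical, and all durations are invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ItemEstimate:
    """Per-item stage durations in days (illustrative values only)."""
    name: str
    ordering_days: float
    shipping_days: float
    installing_days: float
    testing_days: float

    def total_days(self) -> float:
        # Assumption: the four stages run sequentially for a single item.
        return (self.ordering_days + self.shipping_days
                + self.installing_days + self.testing_days)


def overall_estimate_days(items: Iterable[ItemEstimate]) -> float:
    # Assumption: items are handled in parallel, so the slowest item
    # determines the overall replacement and repair time.
    return max((item.total_days() for item in items), default=0.0)


# Hypothetical example items.
example_items = [
    ItemEstimate("core switch", 2, 3, 1, 1),
    ItemEstimate("storage array", 5, 4, 2, 1),
]
```

If items must instead be handled one at a time, the overall figure would be the sum of the per-item totals rather than the maximum.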

         

         

        Tags all usable equipment.

         

         

        Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.

         

         

        Reports findings to Executive Management Team.

         

         

        Sends detailed damage assessment lists to Procurement team.

         

         

        Additional comments:

         

         

        Phase 3: Recovery

        IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:

        Hardware team

         

        Receives critical hardware requests from IT service recovery teams and expedites provisioning of replacement hardware.

         

         

        Works closely with Procurement team on all hardware purchases.

         

         

        Identifies and immediately escalates issues to Executive Management Team for decisions.

         

        IT service recovery teams restore all other levels of service:

        Hardware team

         

        Communicates hardware requirements to Procurement team.

         

         

        Contacts vendors for repair assistance.  Oversees vendor diagnostics on equipment to ensure full reliability prior to installation at the recovery site.

         

         

        Coordinates installation of hardware equipment at recovery site.

         

         

        Works with Network, Infrastructure and Application Services teams to ensure all utility and access requirements are met and that cable connectors are correct.

         

         

        Additional comments:

         

         

        N. Infrastructure checklist detail

        Infrastructure checklist detail - also see KB 97322

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial Notification and Response

         

        • Contact team members
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: open the "Searches" drop-down menu, choose Global, choose UserInfo in the Association field, select AllUsers, select Lookup WiscIT Team Member, choose the team in the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the cloud DoIT DR folder on Box.
        • Complete Log of contacts made.

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Hardware, Network and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment.

         

        Coordinate damage assessment with facility Building Manager, University Risk Management and vendor personnel as needed.

        • Avoid further damage to the site or utility equipment. Ensure Risk Management has noted damage to equipment before it is moved.

        Note: Utility equipment is a Facility team responsibility; supplies and furniture are a Logistics team responsibility.

         

         

        Use hardware damage checklists generated from the Configuration Management Data Base (CMDB) and the Network Change Management System (CMS).

         

         

        Report initial damage assessment findings to Executive Management Team within two hours of the declared disaster.

         

         

        Hardware team estimates time to complete detailed assessment.

         

        Hardware team conducts detailed assessment of network and infrastructure equipment:

         

        Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary.

         

         

        Estimates replacement and repair time including ordering, shipping, installing and testing equipment.

         

         

        Tags all usable equipment.

         

         

        Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.

         

         

        Reports findings to Executive Management Team.

         

         

        Sends detailed damage assessment lists to Procurement team.

         

        Network, Infrastructure and Application Services teams estimate service recovery time and report findings to Executive Management Team.

         

        Additional comments:

         

         

        Phase 3: Recovery

        IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:

        Infrastructure team:

         

        Manages priorities for service recovery as directed by Executive Management Team.

         

         

        Recovers applications and data according to detailed recovery plans, which are maintained separately. Documentation may include:

        • Current state recovery capabilities
        • Planned/funded enhancements and gaps for remediation consideration
        • Disaster recovery management and communications plans
        • Detailed recovery site strategy
        • Detailed hardware recovery plan (refer to Infrastructure team plans as applicable)
        • Application and data recovery support
        • Output management recovery plan

         

         

        Validates and supports active IT services and data hosted at the recovery site.

         

         

        Identifies any hardware missing from pre-established service components hosted at the recovery site.

        • Submits requests for necessary hardware to Hardware team for expedited provisioning.

         

         

        Recovers security controls in consultation with the Security Consultant.

        • Tests recovered systems for security vulnerabilities.

         

         

        Performs integration testing with the Network team.

         

         

        Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the Communications Consultant for appropriate dissemination.

         

         

        Identifies and immediately escalates issues to Executive Management Team for decisions.

         

        IT service recovery teams restore all other levels of service

        Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation.

         

        Hardware team oversees installation of hardware:

        • Communicates hardware requirements to Procurement team.
        • Contacts vendors for repair assistance.  Oversees vendor diagnostics on equipment to ensure full reliability prior to installation at the recovery site.
        • Coordinates installation of hardware equipment at recovery site.
        • Works with Network, Infrastructure and Application Services teams to ensure all utility and access requirements are met and that cable connectors are correct.

         

        Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately.

        Work with Operations team to return services to fully operational status.

         

        Operations team supports IT service recovery teams in restoring operating systems, applications and data.

        • Starts and operates recovered hardware.
        • Supports handling/redirection of incoming customer calls to the SNCC.
        • Operations team leader oversees startup according to priorities set by Executive Management Team:
          • Powerup, IML, and IPL procedures must be completed without error. Any problems are reported to the team leader and corrected by the vendor.

         

         

        Additional comments:

         

         

        O. Network checklist detail

        Network checklist detail - also see KB 97337

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial Notification and Response

         
        • Contact team members
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: open the "Searches" drop-down menu, choose Global, choose UserInfo in the Association field, select AllUsers, select Lookup WiscIT Team Member, choose the team in the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the cloud DoIT DR folder on Box.
        • Complete Log of contacts made.
         
         

        Assemble teams at suitable locations.

         
         

        Report situation.

         
         

        Review team responsibilities and functions.

         
         

        Prioritize and direct next actions.

         
         

        If necessary, reference the Network-specific COOP Plan in the DR folder “wiscit” on Box.

         
         

        Additional comments:

         

        Phase 2: Assessment and activation

        Hardware, Network, and Infrastructure teams conduct preliminary damage assessment of network and infrastructure equipment.

         

        Coordinate damage assessment with facility Building Manager, University Risk Management and vendor personnel as needed.

        • Avoid further damage to the site or utility equipment. Ensure Risk Management has noted damage to equipment before it is moved.

        Note: Utility equipment is a Facility team responsibility; supplies and furniture are a Logistics team responsibility.

         
         
        • Use hardware damage checklists generated from the Configuration Management Data Base (CMDB) and the Network Change Management System (CMS).
         
         

        Report initial damage assessment findings to Executive Management Team within two hours of the declared disaster.

         

        Hardware team conducts detailed assessment of network and infrastructure equipment.

         

        Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary.

         
         

        Estimates replacement and repair time including ordering, shipping, installing and testing equipment.

         
         

        Tags all usable equipment.

         
         

        Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.

         
         

        Reports findings to Executive Management Team.

         
         

        Sends detailed damage assessment lists to Procurement team.

         

        Network, Infrastructure and Application Services teams estimate service recovery time and report findings to Executive Management Team.

         

        Additional comments:

         

         

        Phase 3: Recovery

        IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:

        Network team

         

        Manages priorities for network recovery as directed by the Executive Management Team.

         
         

        Manages recovery of network components (switches, routers, network and security configurations) according to detailed service recovery plans. This documentation is maintained separately, and may include:

        • Current state business continuity capabilities
        • Planned/funded enhancements, and gaps for remediation consideration
        • Disaster recovery management and communications
        • Detailed recovery site strategy
        • Detailed hardware recovery plan
        • Switch, router and configuration recovery support
         
         

        Identifies any hardware missing from pre-established service components hosted at the recovery site.

        • Submits requests for necessary hardware to Hardware team for expedited provisioning.
         
         

        Coordinates with vendors for recovery of circuits and remote site connectivity.

         
         

        Performs integration testing with Infrastructure and Application Services teams.

         
         

        Works with the Security Consultant to test recovered systems for security vulnerabilities.

         
         

        Communicates progress and customer-related issues to the IT Service Recovery Team Director, who conveys messages to the communications consultant team for appropriate dissemination.

         
         

        Identifies and immediately escalates issues to Executive Management Team for decisions.

         

        IT service recovery teams restore all other levels of service

        Application Services, Network and Infrastructure teams work with Hardware team to complete hardware installation.

         

        Hardware team oversees installation of hardware:

        • Communicates hardware requirements to Procurement team.
        • Contacts vendors for repair assistance. Oversees vendor diagnostics on equipment to ensure full reliability prior to installation at the recovery site.
        • Coordinates installation of hardware equipment at recovery site.
        • Works with Network, Infrastructure and Application Services teams to ensure all utility and access requirements are met and that cable connectors are correct.
         

        Application Services, Network and Infrastructure teams complete recovery activities according to individual service component recovery plans, which are maintained separately.

        Work with Operations team to return services to fully operational status.

         

        Operations team supports IT service recovery teams in restoring operating systems, applications and data.

        • Starts and operates recovered hardware.
        • Supports handling/redirection of incoming customer calls to the SNCC.
        • Operations team leader oversees startup according to priorities set by Executive Management Team:
          • Powerup, IML, and IPL procedures must be completed without error. Any problems are reported to the team leader and corrected by the vendor.
         
         

        Additional comments:

         

         

        P. Operations checklist detail

        Operations checklist detail - also see KB 97363

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial Notification and Response

        SNCC follows Standard Procedures

        Senior Systems and Network Control Center (SNCC) Staff Member on duty responds to the incident:

         

        Detects incident directly or via monitoring device.

         

         

        Assesses extent of incident.

        • If emergency services are required:

          • Calls (9) 911 for emergency services and physical security.
          • Calls 3-3333 to arrange for Physical Plant facility services.
          • Evacuates the facility if the situation threatens physical safety. At no time should the physical safety of employees be jeopardized.

         

         

        Manages emergency shutdown of services if possible.

         

         

        Contacts SEO Duty Manager with a situation report.

         

        SEO Duty Manager:

         

        Accounts for all personnel who were on site during any incident jeopardizing human safety, and notifies Human Resources.

         

         

        Follows "SEO Vertical Escalation and Notification" procedure and engages responsible Situation Manager per KB #3632 (SEO Internal KnowledgeBase).

         

         

        Engages SEO On-Call Technologists per KB #80924 (SEO Internal KnowledgeBase).

         

        SEO Duty Manager and SEO Duty Technologist consult with SEO Director, if possible.

         

        Incident response team leaders prepare teams:

         

         

        • Contact team members
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: open the "Searches" drop-down menu, choose Global, choose UserInfo in the Association field, select AllUsers, select Lookup WiscIT Team Member, choose the team in the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the cloud DoIT DR folder on Box.
        • Complete Log of contacts made.

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Operations team, Help Desk team and communications consultant team:

         

        Assess options for handling high call volumes and coordinate a solution.

         

         

        Coordinate ongoing communications to various stakeholders.

         

         

        Additional comments:

         

         

        Phase 3: Recovery

        IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:

         

        Operations team:

         

        Operates equipment.

         

         

        Supports handling/redirection of incoming customer calls to the SNCC.

         

         

        Identifies and immediately escalates issues to Executive Management Team for decisions.

         

        IT service recovery teams restore all other levels of service:

        Operations team supports IT service recovery teams in restoring operating systems, applications and data.

         

        Starts and operates recovered hardware.

         

         

        Supports handling/redirection of incoming customer calls to the SNCC.

         

         

        Operations team leader oversees startup according to priorities set by Executive Management Team:

        • Powerup, IML, and IPL procedures must be completed without error. Any problems are reported to the team leader and corrected by the vendor.
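
The startup rule above (priority order, with every procedure completing without error) can be sketched as follows. This is an illustrative sketch only. It assumes the procedures run in the order Powerup, IML, IPL, that a lower priority number means "start earlier", and that startup of other systems continues after one system fails. The `run_step` callback is a hypothetical stand-in for whatever actually performs each procedure.

```python
from typing import Callable, List, Optional, Tuple

# Assumption: procedures run in this order for each system.
STARTUP_STEPS = ("Powerup", "IML", "IPL")


def run_startup(system: str, run_step: Callable[[str, str], bool]) -> Optional[str]:
    """Run each step in order; return the first failing step, or None if all succeed."""
    for step in STARTUP_STEPS:
        if not run_step(system, step):
            return step  # stop this system at the first error
    return None


def start_in_priority_order(
    systems: List[Tuple[int, str]],
    run_step: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Start systems lowest priority number first.

    Returns (system, failed_step) pairs to report to the team leader
    so the vendor can correct them.
    """
    problems = []
    for _, system in sorted(systems):
        failed = run_startup(system, run_step)
        if failed is not None:
            problems.append((system, failed))
    return problems
```

The returned list is empty when every system completes all three procedures without error.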

         

         

        Additional comments:

         

         

         

          

        Q. Help Desk checklist detail

        Help Desk checklist detail - also see KB 97321

         

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial Notification and Response

        Incident response team leaders prepare teams:

         

        • Contact team members
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: open the "Searches" drop-down menu, choose Global, choose UserInfo in the Association field, select AllUsers, select Lookup WiscIT Team Member, choose the team in the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the cloud DoIT DR folder on Box.
        • Complete Log of contacts made.

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

         

        If deemed necessary, reference the Network-specific COOP Plan in the DR folder “wiscit” on Box.

         

        Communications consultant team and Help Desk team coordinate messages for in-bound and out-bound contacts.

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Operations team, Help Desk team and communications consultant team:

         

        Assess options for handling high call volumes and coordinate a solution.

         

         

        Coordinate ongoing communications to various stakeholders.

         

         

        Additional comments:

         

         

        Phase 3: Recovery

        IT service recovery teams restore critical Level 1 and Level 2 services that have pre-established service components hosted at the recovery site:

        Help Desk team:

         

        Provides official messages from communications consultant team to customers.

         

         

        Reports incident level and severity on Help Desk (DoIT) and UW-Madison IT Outage pages.

         

        IT service recovery teams restore all other levels of service: 

        Help Desk team:

         

        Provides official messages from communications consultant to customers.

         

         

        Provides ongoing updates on incident level and severity on Help Desk (DoIT) and UW-Madison IT Outage pages.

         

         

        Restores IT services for the DoIT Help Desk.

         

         

        As needed, provides alternate Help Desk support according to Help Desk recovery plan, which is maintained separately.

        • Provides access to local version of Knowledge Base within Help Desk.
        • Activates ACD backup phone system or forwards phones.
        • Uses alternate instance or method of call tracking.
        • Uses alternate method to forward incident reports and service requests.

         

         

         

        Additional comments:

         

         

         

        R. Physical Security Consultant checklist detail

        Physical Security Consultant checklist detail - also see KB 97376

         

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial Notification and Response

         

        • Contact team members
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: open the "Searches" drop-down menu, choose Global, choose UserInfo in the Association field, select AllUsers, select Lookup WiscIT Team Member, choose the team in the drop-down menu, then display the contact for the person you need; or
          • View the file "8-1-2022 DoIT - Crisis Response Contact Information" in the cloud DoIT DR folder on Box.
        • Complete Log of contacts made.

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Security Consultant:

         

        Assesses facility access needs and restrictions in consultation with Facility team:

        • Determines appropriate access to facility during damage assessment.
        • Provides a list of personnel to be admitted to the facility:
          • Submits list to UW Police Department if controls will be managed via human intervention.
          • Submits list to Facility team if controls will be managed via electronic access control system.

         

         

        Responds to urgent security incidents, as necessary.

         

         

        Facilitates IT security risk assessments on issues identified by the IT Service Recovery Team Director.

         

        Phase 3: Recovery

        Security Consultant:

         

        Oversees ongoing implementation of facility access requirements in consultation with Facility team:

        • Determines any needed adjustments to facility access during recovery of services.
        • Provides updated list of personnel to be admitted to the facility to either the UW Police Department or the Facility team.

         

         

        Investigates IT security incidents that caused the disaster or are preventing the recovery of systems and services.

         

         

        Works with IT service recovery teams to recover centralized security tools and resources.

        • Assists with the implementation of security controls (e.g. Tripwire, Antivirus).

         

         

        Recovers network security configurations (e.g. CSA).

         

         

        Provides authorization services for applications as needed.

         

         

        Additional comments

         

         

        Ongoing:

        The Security Consultant is part of the Executive Management Team. See Executive Management checklist for details.

         

        Additional comments

         

         

        S. Cybersecurity Incident Response Procedures

        See Cybersecurity Incident Response Procedures - authorization required for access.

        Administrative team

        Details for the roles on the Administrative team.

        T. Administrative Director checklist detail

        Administrative Director checklist detail - also see KB 99675

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial notification and response

        During initial response, SNCC Manager or SEO Director contacts and assembles Team Leaders and Executive Management Team.

        SNCC NOC Manager or SEO Director activates crisis management procedures:

         

        Assembles Executive Management Team members at emergency operations center.

        • Emergency Operations Center will be located at the DoIT CIO office, if secure.
        • Alternate location: office suite at the recovery site or virtual by unanimous Executive Management Team decision.

         

         

        Contacts Building Manager at recovery site for appropriate access to the building.

        • DoIT's Walnut Street Data Center is the recovery site for DoIT's Dayton Street Data Center.
        • DoIT's Dayton Street Data Center is the recovery site for DoIT's Walnut Street Data Center.

        Assesses extent of incident.

        • If emergency services are required:
          • Calls (9) 911 for emergency services and physical security.
          • Calls 3-3333 to arrange for Physical Plant facility services.
          • Evacuates the facility if the situation threatens physical safety. At no time should the physical safety of employees be jeopardized.

         

         

        Activates damage assessment procedures (see under Phase 2 - Assessment and activation).

         

         

        Activates Human Resources team to support staff and families.

         

         

        Notifies Insurance team lead.

         

        Incident response teams communicate with stakeholders:

         

        Insurance team handles communications with campus Risk Management.

         

         

        Procurement team handles communication with vendors and University/State procurement offices.

         

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Administrative Director oversees Administrative teams during damage assessment process:


        Procurement team consults with Facility, Hardware and Logistics teams during damage assessment:

         

        Facility team conducts detailed assessment of primary data center structure and utility equipment:

        • Uses facility damage checklists generated from the Configuration Management Data Base (CMDB) and the Facility Planning & Management facility database.
        • Avoids further damage to the site(s) and utility equipment. Ensures Risk Management has noted damage to equipment before it is moved.
        • Assesses structural damage to data centers.
        • Conducts assessment of individual equipment components to determine which items will be used or salvaged. Consults with Procurement team as necessary.
        • Estimates replacement and repair time including ordering, shipping, installing and testing equipment.
        • Tags all usable equipment.
        • Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.
        • Reports findings to Executive Management Team.
        • Sends detailed damage assessment lists to Procurement team.

         

         

        Hardware team conducts detailed assessment of network and infrastructure equipment:

        • Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary.
        • Estimates replacement and repair time including ordering, shipping, installing and testing equipment.
        • Tags all usable equipment.
        • Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.
        • Reports findings to Executive Management Team.
        • Sends detailed damage assessment lists to Procurement team.

         

         

        Logistics team assesses damage to furniture and supplies:

        • Works with Facility team to conduct damage assessments.
        • Assesses damage to supplies including tapes, miscellaneous equipment, support media and materials.
        • Sends detailed damage assessment lists to Procurement team to arrange for replacements.

         

         

        Insurance team manages financial loss:

        • Coordinates damage assessment with Facility team.
        • Works closely with Risk Management on any expenditures related to the claim to ensure that knowledgeable decisions are made.
          • Note: Risk Management maintains an insurance loss escrow account that can be used for requisitions to fund repairs and replacement of equipment before the claim is actually filed with the State.
        • Conducts a loss determination for building repairs and damaged equipment including depreciation, departmental labor, and supplies.

         

         

        Finance team:

        • Manages incident response expenditures.
        • Accounts for incident response expenses.

         

         

        Additional comments

         

         

        Phase 3: Recovery

        Administrative Director oversees Administrative teams during recovery.

        Procurement team purchases replacement hardware:

         

        Receives detailed damage assessment lists from Hardware, Facility and Logistics teams.

         

         

        Reviews State of Wisconsin, University System and UW-Madison contract records to determine current contract vendors.

         

         

        Coordinates team member activities to minimize duplication of vendor contacts.

         

         

        Reviews any quick-ship contracts DoIT may have established as a result of individual service component recovery plans.

        • Executes quick-ship delivery per applicable contracts.

         

         

        Divides all equipment for which there is no quick-ship contract into three categories based on the following definitions:

        • Category 1: All equipment purchased new from a vendor who still has some form of purchasing agreement for the same or similar equipment in force with the University or another state agency.
          • Equipment in this category may be purchased quickly because negotiations are not necessary.
          • Similar equipment is an acceptable alternative for the exact model, if the new equipment represents the same vendor's newer model and if no conversion would be necessary.
        • Category 2: Equipment that meets the following two requirements:
          • Requirement 1: Equipment is not readily available from the original vendor, either because the vendor no longer handles it or because that vendor no longer has a valid purchasing agreement in force with the state.
          • Requirement 2: Similar equipment is available from other vendors with purchase authority pre-established by virtue of existing bulletins, contracts, bids, or waivers.
          • Technical evaluation is required to determine what available equipment would make a satisfactory substitute and to ensure that the equipment layout for the recovery site is appropriately modified.
        • Category 3: Equipment that meets either of the following two requirements:
          • Requirement 1: Same or similar equipment is no longer available from any vendor.
          • Requirement 2: Same or similar equipment is only available from vendors that do not have a valid purchasing agreement in force with the state.
          • Hardware team leader must seek concurrence of the IT Service Recovery Team Director and the Procurement team leader for the purchase of Category 3 equipment. The negotiation/acquisition process for such equipment also requires that a Governor's waiver be granted.
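The three category definitions above amount to a short decision rule. The following is a minimal illustrative sketch only, not part of any DoIT or State procurement system. The record type `EquipmentItem`, its two boolean fields, and both function names are hypothetical, introduced here to show how an item without a quick-ship contract could be sorted into a category.

```python
from dataclasses import dataclass


@dataclass
class EquipmentItem:
    """Hypothetical record for one item that has no quick-ship contract."""
    name: str
    # True if the original vendor still has a purchasing agreement in force
    # for the same or similar equipment (the Category 1 condition).
    original_vendor_agreement: bool
    # True if similar equipment is available from other vendors that already
    # have purchase authority (Category 2, Requirement 2).
    other_vendor_authority: bool


def procurement_category(item: EquipmentItem) -> int:
    """Return 1, 2, or 3 following the category definitions in the checklist."""
    if item.original_vendor_agreement:
        return 1
    if item.other_vendor_authority:
        return 2
    # Remaining cases: not available from any vendor, or available only from
    # vendors without a valid purchasing agreement.
    return 3


def needs_governors_waiver(item: EquipmentItem) -> bool:
    """Per the checklist, only Category 3 purchases require a Governor's waiver."""
    return procurement_category(item) == 3


if __name__ == "__main__":
    # Made-up example items, for illustration only.
    items = [
        EquipmentItem("core switch", True, False),
        EquipmentItem("tape library", False, True),
        EquipmentItem("legacy controller", False, False),
    ]
    for item in items:
        print(item.name, procurement_category(item), needs_governors_waiver(item))
```

The sketch checks the Category 1 condition first, so an item that could fit more than one category lands in the one that can be purchased fastest. Category 2 items still need the technical evaluation described above; that step is not modeled here.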

         

         

        Establishes priorities for acquiring equipment.

         

         

        Contacts vendors to determine availability of equipment, costs, and expected delivery dates.

        • Issues requisitions, specifying pertinent equipment features, ITR #, and bid, bulletin, contract, or waiver number.
        • Reports purchasing actions to Hardware team leader and Administrative Team Director.

         

         

        Category 3 equipment: Once a vendor has been identified, the Hardware team leader provides the Procurement team with a detailed list of equipment to be ordered, identifying it as a Category 3 purchase. The Procurement team then expedites the Governor's waiver process and proceeds with acquisitions after waivers are granted.

         

         

        For equipment purchased on the used market, Procurement team ensures that third-party vendors provide installation contracts and certify equipment to be fully operable and eligible for maintenance agreements.

         

        Logistics team arranges equipment and supplies transport:

         

        If equipment is to be moved off-site, Logistics team leader secures temperature-controlled moving vans for equipment transfer.

         

         

        Delivers offsite materials:

        • Tapes
        • Incident Response plans and checklists
        • Equipment supplies inventory
        • Equipment operating manuals
        • Support media and materials
         

         

         

        Transports salvageable furniture and supplies to recovery site.

         

         

        Additional comments

         

         

        Ongoing:

        Administrative Team Director reports to and is a part of the Executive Management Team. See Executive Management checklist for details.

         

        U. Finance checklist detail

        Finance checklist detail - also see KB 97319

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial notification and response

        Incident response team leaders prepare teams:

         
        • Contact team members:
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: from the "Searches" drop-down menu choose Global; in the Association field choose UserInfo; select AllUsers; select Lookup WiscIT Team Member; choose the team from the drop-down menu; then display the contact for the person you need; or
          • View the file 8-1-2022 DoIT - Crisis Response Contact Information in the cloud-hosted DoIT DR folder on Box.
        • Complete Log of contacts made.
         
         

        Assemble teams at suitable locations.

         
         

        Report situation.

         
         

        Review team responsibilities and functions.

         
         

        Prioritize and direct next actions.

         
         

        Additional comments:

         

        Phase 2: Assessment and activation

        Finance team:

         

        Manages incident response expenditures.

         
         

        Accounts for incident response expenses.

         
         

        Additional comments

         

         

        V. Insurance checklist detail

        Insurance checklist detail - also see KB 97324


        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial notification and response

        Incident response team leaders prepare teams:

         

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

        Incident response teams communicate with stakeholders:

         

        Insurance team handles communications with Risk Management.

         

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Insurance team manages financial loss:

         

        Coordinates damage assessment with Facility team.

         

         

        Works closely with Risk Management on any expenditures related to the claim to ensure that knowledgeable decisions are made.

        • Note: Risk Management maintains an insurance loss escrow account that can be used for requisitions to fund repairs and replacement of equipment before the claim is actually filed with the State.

         

         

        Conducts a loss determination for building repairs and damaged equipment including depreciation, departmental labor, and supplies.

         

         

        Additional comments

         

         

         

        W. Logistics checklist detail

        Logistics checklist detail - also see KB 97335

         

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial notification and response

        Incident response team leaders prepare teams:

         

        • Contact team members:
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: from the "Searches" drop-down menu choose Global; in the Association field choose UserInfo; select AllUsers; select Lookup WiscIT Team Member; choose the team from the drop-down menu; then display the contact for the person you need; or
          • View the file 8-1-2022 DoIT - Crisis Response Contact Information in the cloud-hosted DoIT DR folder on Box.
        • Complete Log of contacts made.

         

         

        Assemble teams at suitable locations.

         

         

        Report situation.

         

         

        Review team responsibilities and functions.

         

         

        Prioritize and direct next actions.

         

         

        Additional comments:

         

         

        Phase 2: Assessment and activation

        Logistics team assesses damage to furniture and supplies:

         

        Works with Facility team to conduct damage assessments.

         

         

        Assesses damage to supplies including tapes, miscellaneous equipment, support media and materials.

         

         

        Sends detailed damage assessment lists to Procurement team to arrange for replacements.

         

         

        Additional comments

         

         

        Phase 3: Recovery

        Logistics team arranges equipment and supplies transport:

         

        If equipment is to be moved off-site, Logistics team leader secures temperature-controlled moving vans for equipment transfer.

         

         

        Delivers offsite materials:

        • Tapes  
        • Incident Response plans and checklists
        • Equipment supplies inventory
        • Equipment operating manuals
        • Support media and materials

         

         

        Transports salvageable furniture and supplies to recovery site.

         

         

        Additional comments

         

         

         

        X. Procurement checklist detail

        Procurement checklist detail - also see KB 97374

         

        Date/Time completed

        Tasks

        Point of contact

        Phase 1: Initial notification and response

        Incident response team leaders prepare teams:

         
        • Contact team members:
          • Look up individual tech staff phone numbers in the IT Service Management tool WiscIT: from the "Searches" drop-down menu choose Global; in the Association field choose UserInfo; select AllUsers; select Lookup WiscIT Team Member; choose the team from the drop-down menu; then display the contact for the person you need; or
          • View the file 8-1-2022 DoIT - Crisis Response Contact Information in the cloud-hosted DoIT DR folder on Box.
        • Complete Log of contacts made.
         
         

        Assemble teams at suitable locations.

         
         

        Report situation.

         
         

        Review team responsibilities and functions.

         
         

        Prioritize and direct next actions.

         

        Incident response teams communicate with stakeholders:

         

        Insurance team handles communications with Risk Management.

         
         

        Additional comments:

         

        Phase 2: Assessment and activation

        Procurement team consults with Facility, Hardware and Logistics teams during damage assessment:

        Facility team conducts detailed assessment of primary data center structure and utility equipment:

         

        Uses facility damage checklists generated from the Configuration Management Data Base (CMDB) and the Facility Planning & Management facility database.

         
         

        Avoids further damage to the site(s) and utility equipment. Ensures Risk Management has noted damage to equipment before it is moved.

         
         

        Assesses structural damage to data centers.

         
         

        Conducts assessment of individual equipment components to determine which items will be used or salvaged. Consults with Procurement team as necessary.

         
         

        Estimates replacement and repair time including ordering, shipping, installing and testing equipment.

         
         

        Tags all usable equipment.

         
         

        Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.

         
         

        Reports findings to Executive Management Team.

         
         

        Sends detailed damage assessment lists to Procurement team.

         

        Hardware team conducts detailed assessment of network and infrastructure equipment:

         

        Conducts assessment of individual service components to determine which items will be used or salvaged. Consults with Network, Infrastructure and Procurement teams as necessary.

         
         

        Estimates replacement and repair time including ordering, shipping, installing and testing equipment.

         
         

        Tags all usable equipment.

         
         

        Once Risk Management has given clearance, separates usable equipment from equipment to be turned over to a salvage company.

         
         

        Reports findings to Executive Management Team.

         
         

        Sends detailed damage assessment lists to Procurement team.

         

        Logistics team assesses damage to furniture and supplies:

         

        Works with Facility team to conduct damage assessments.

         
         

        Assesses damage to supplies including tapes, miscellaneous equipment, support media and materials.

         
         

        Sends detailed damage assessment lists to Procurement team to arrange for replacements.

         
         

        Additional comments

         

        Phase 3: Recovery

        Procurement team purchases replacement hardware:

         

        Receives detailed damage assessment lists from Hardware, Facility and Logistics teams.

         
         

        Reviews State of Wisconsin, University System and UW-Madison contract records to determine current contract vendors.

         
         

        Coordinates team member activities to minimize duplication of vendor contacts.

         
         

        Reviews any quick-ship contracts DoIT may have established as a result of individual service component recovery plans.

        • Executes quick-ship delivery per applicable contracts.
         
         

        Divides all equipment for which there is no quick-ship contract into three categories based on the following definitions:

        • Category 1: All equipment purchased new from a vendor who still has some form of purchasing agreement for the same or similar equipment in force with the University or another state agency.
          • Equipment in this category may be purchased quickly because negotiations are not necessary.
          • Similar equipment is an acceptable alternative for the exact model, if the new equipment represents the same vendor's newer model and if no conversion would be necessary.
        • Category 2: Equipment that meets the following two requirements:
          • Requirement 1: Equipment is not readily available from the original vendor, either because the vendor no longer handles it or because that vendor no longer has a valid purchasing agreement in force with the state.
          • Requirement 2: Similar equipment is available from other vendors with purchase authority pre-established by virtue of existing bulletins, contracts, bids, or waivers.
          • Technical evaluation is required to determine what available equipment would make a satisfactory substitute and to ensure that the equipment layout for the recovery site is appropriately modified.
        • Category 3: Equipment that meets either of the following two requirements:
          • Requirement 1: Same or similar equipment is no longer available from any vendor.
          • Requirement 2: Same or similar equipment is only available from vendors that do not have a valid purchasing agreement in force with the state.

        Hardware team leader must seek concurrence of the IT Service Recovery Team Director and the Procurement team leader for the purchase of Category 3 equipment. The negotiation/acquisition process for such equipment also requires that a Governor's waiver be granted.

         
         

        Establishes priorities for acquiring equipment.

         
         

        Contacts vendors to determine availability of equipment, costs, and expected delivery dates.

        • Issues requisitions, specifying pertinent equipment features, ITR #, and bid, bulletin, contract, or waiver number.
        • Reports purchasing actions to Hardware team leader and Administrative Team Director.
         
         

        Category 3 equipment: Once a vendor has been identified, the Hardware team leader provides the Procurement team with a detailed list of equipment to be ordered, identifying it as a Category 3 purchase. The Procurement team then expedites the Governor's waiver process and proceeds with acquisitions after waivers are granted.

         
         

        For equipment purchased on the used market, Procurement team ensures that third-party vendors provide installation contracts and certify equipment to be fully operable and eligible for maintenance agreements.

         
         

        Additional comments

         

        VI. COOP Record Information

        The DoIT COOP relies on expertise in many areas for its content. This section outlines its sources of reference and authority, records when COOP updates have been made, and describes the plans for COOP maintenance.

        A. Authorities and References

        FEDERAL AND COLLEGE AUTHORITIES

        The DoIT Continuity of Operations Plan is guided by the following federal and college authorities.

        RESPONSE OPERATIONS REFERENCES

        Additional planning documents that guide response operations and synergize with the Plan include:

        B. Record of COOP changes

        Table 1: Record of COOP Changes
        Revision Date  Description of Change  Implemented by
        Aug 2024  Updated
        Nov 2023  Updated
        Feb 2023  Updates to reflect DoIT Communications style guidelines and general updates  J. Sutherland
        Oct - Dec 2022  Updates per DoIT Directors' requests: updated COOP Chain of Command; updated process for Phase 1 triggering COOP activation; added additional directors on Emergency Management Team; added new role Campus Liaison and backups; added clarification on physical security versus Cybersecurity  J. Sutherland
        Apr 2021 - Jun 2021  Draft - Adaptation of existing DoIT COOP Plan to new format.  J. Sutherland
        July 2021  Draft - Revision of format, images, document flow  J. Sutherland
        Aug 2021  Draft - Modified to include Cybersecurity sections. TOC updated.  J. Sutherland
        Sep 15, 2021  Draft - Incorporation of UW System procedure 1031.B. for High Risk Data  J. Sutherland
        Dec 10, 2021  Draft - Conversion to DoIT KnowledgeBase format  J. Sutherland
        Dec 16, 2021  Draft - sections consolidated/re-ordered for flow; expanded detail in Order of Notification.  J. Sutherland
        Jan - Apr, 2022  Draft - Updated links, standardized formats, organization for optimized KB presentation. Incorporated updates from key COOP stakeholders.  J. Sutherland

        C. COOP / Business Continuity Plan Maintenance

        The DoIT COOP / Business Continuity Plan is updated at least annually and as modifications or updates to it are necessary. To ensure that the DoIT Plan reflects the most up-to-date policies, procedures and essential functions, lead staff will conduct the following activities to support Plan maintenance, auditing and exercising: 

        Table 2: COOP Maintenance Schedule
        Activity  Led By  Frequency
        Review and update the COOP facility location data  COOP Coordinator  Annually
        Review and update DoIT staff contact information  HR and Admin Assistants with COOP Coordinator  At least 1x annually
        Review and update contact information for response partners, vendors, and continuity facilities.  COOP Coordinator with Configuration Manager  Quarterly
        Maintain electronic versions of COOP that DoIT staff can access both onsite/offsite.  COOP Coordinator  Annually
        Revise COOP to address any identified gaps following activation or exercise (TTX)  COOP Coordinator  Upon completion of an exercise or real-world disruption

        COOP Usage History
        Usage Type  Date  Scenario or Disruption Type
        Tabletop Exercise  2023-11-16  Campus Active Directory cybersecurity compromise affecting the network and all cloud-based IT tools that use single sign-on (e.g., Zoom, Canvas, Google, Microsoft email/Teams, Webex)
        Tabletop Exercise  2023-03-07  Ransomware event encrypting Research Drive data
        Tabletop Exercise  2022-04-19  Data Center flooding from nearby City water main break
        Activation  2021-09-10  Wireless Outage from vendor code issues
        Activation  2020-03-12 - 2020-03-23  COVID-19 Pandemic and initial response transforming UW-Madison to remote learning and business
        Tabletop Exercise  2019-03-07  Train Derailment, Toxic Spill, and Resulting Building Evacuation
        Activation  2019-01-30  Campus Closure and EOC activation due to Extreme "Polar Vortex" Weather Conditions
        Tabletop Exercise  2018-10-24  Pandemic creating Staff Shortage
        Tabletop Exercise  2018-05-15  Participation in "Dark Skies" cyber-attack scenario, a multi-county tabletop exercise (Brown, Calumet, Dane, Fond du Lac, Milwaukee, Outagamie and Winnebago Counties) that tested the abilities of private utilities, law enforcement, first responders and the National Guard to respond to the scenario as well as its second- and third-order effects
        Tabletop Exercise  2018-01-05  Long Term Power Outage due to Multi-County Cyber Attack
        Tabletop Exercise  2016-09  Building Fire Alarm triggers Sprinkler Flooding
        Activation  2015-09-26  Dual Power Feed Outages while UPS generator undergoing maintenance
        Activation  2014-06-18  Power Outage due to Lightning Strike disabling UPS and Power

        Documentation and artifacts from COOP activation PIRs are located on the DoIT wiki.

         



        Keywords: Business Continuity Operations COOP operational framework itil best practices incident change management itsm problem
        Doc ID: 91473
        Owned by: Jennifer S. in ITSM
        Created: 2019-05-01
        Updated: 2025-04-21
        Sites: DoITCOOP-internal, DoITStaff-internal, ITSM-internal, SEO-internal, SNCC-internal