UW System Radware Cloud DoS: Service Information
General
Cloud DoS scrubbing providers use a combination of hardware and software in an attempt to mitigate the negative effects of a DoS attack. Networks use BGP to route subnets through a third-party provider, which initiates diversion and mitigation. In this document, deciding when to signal for protection is called "detection".
As of 2025/03/20, the 20,000 foot view of how this service works from a procedure perspective is:
- UW System Network routing team has homegrown detection software that runs in real time. This software consumes IPFIX [netflow] data in 10-second increments. Detection is configured in bits, packets, or estimated flows per second at the CIDR level over a 10-second window. Detection cannot flawlessly differentiate between a service that is impaired and an unusually large, yet non-service-affecting, amount of traffic.
- No one is tasked to monitor for DoS attacks 24x7 or in real time. We envision the technical need for scrubbing being activated in one of three ways:
- If an attack is large enough that the UW System Network itself is impaired, UW System Network routing team may activate cloud scrubbing without first contacting the affected campus. We would of course aim to communicate status.
- UW System Network routing team happens to notice a possible DoS event. We would seek to make contact with a campus to confirm impact and discuss mitigation options, as there may be non-cloud mitigation options available. We already do this today.
- Like the previous case, but the campus initiates contact with UW System Network.
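The windowed, per-CIDR detection described above can be sketched roughly as follows. This is only an illustration, not the routing team's actual software: the `Threshold` class, the `exceeded_metrics` function, the record layout, and any limit values are hypothetical, and flow sampling is ignored.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

WINDOW_SECONDS = 10  # the 10-second window described above


@dataclass
class Threshold:
    """Hypothetical per-CIDR limits; real values are tuned per network."""
    bits_per_sec: float
    packets_per_sec: float
    flows_per_sec: float


def exceeded_metrics(cidr, records, threshold):
    """Return the names of the metrics that exceed their threshold.

    records: iterable of (dst_ip, byte_count, packet_count) tuples, one per
    flow record seen in a single window (a simplified stand-in for IPFIX).
    """
    net = ip_network(cidr)
    total_bytes = total_packets = flows = 0
    for dst_ip, byte_count, packet_count in records:
        if ip_address(dst_ip) in net:
            total_bytes += byte_count
            total_packets += packet_count
            flows += 1
    # Convert per-window totals into per-second rates.
    rates = {
        "bits_per_sec": total_bytes * 8 / WINDOW_SECONDS,
        "packets_per_sec": total_packets / WINDOW_SECONDS,
        "flows_per_sec": flows / WINDOW_SECONDS,
    }
    return [name for name, rate in rates.items() if rate > getattr(threshold, name)]
```

A non-empty result only says that traffic toward the CIDR is unusually large. As noted above, it cannot say whether a service is actually impaired.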
Details below.
UW System Network implementation specifics: updated 2025/03/20
- UW System has contracted with Radware via Internet2 for cloud mitigation.
- Automation between detection and mitigation is not currently planned. Here are a few reasons for this approach:
- Once intent to scrub has been signaled to the provider, it can take up to five minutes to activate. However, most DoS attacks are shorter than five minutes, frequently two minutes or less. We never know the length of a DoS attack until it has ended, but automatic diversion for most DoS attacks would result in a larger outage than if we took no action at all, because both enabling and disabling cloud scrubbing can be service affecting. These limitations arise at the global [earth] BGP level as best paths are added and removed; they are not due to design issues inside UW System Network.
- With this in mind, UW System Network does not want to scrub an event that is not service affecting. The best technical outcome requires positive confirmation of a technical need to scrub.
- Any customer BGP peering with UW System could initiate scrubbing on their own if we configured the network to support it. Fiscally, however, we contract with the provider for a certain bandwidth rate of clean return traffic. We are allowed to burst, but there could be fiscal consequences depending on the scenario. Therefore, at this time, UW System Network engineers are solely responsible for activation.
- If mitigation is effective, our detection engine will believe the DoS has subsided even if it is ongoing, as it would no longer have a view of the attack traffic. Once diversion has occurred, cloud portal access is required to know when the attack has subsided and whether removing the scrub would be safe. There are additional costs for subtenant portal access, and the technical hope/assumption is that, due to inherent limitations in on-demand cloud scrubbing, it will be enacted rarely and for sustained emergencies only.
- Continued development of the detection software and refinement of its configuration at the CIDR level is a collaborative commitment for both customer and service provider, in an attempt to minimize detection false positives and false negatives.
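The timing argument against automatic diversion can be illustrated with some rough arithmetic. This is a simplified sketch under stated assumptions: the attack impairs service for its whole duration, activation always takes the worst-case five minutes, and each route change costs a fixed `route_change_s` seconds of disruption. That last figure is hypothetical and is not a measured value.

```python
ACTIVATION_DELAY_S = 300  # "up to five minutes" in the text; worst case assumed


def impaired_seconds(attack_s, scrub, route_change_s,
                     activation_delay_s=ACTIVATION_DELAY_S):
    """Rough seconds of impairment for one attack.

    route_change_s is a hypothetical disruption cost paid twice when
    scrubbing: once when diversion starts and once when it is removed.
    """
    if not scrub:
        # No action taken: service is impaired for the whole attack.
        return attack_s
    # The attack is unmitigated until scrubbing becomes active.
    unprotected = min(attack_s, activation_delay_s)
    return unprotected + 2 * route_change_s
```

With a hypothetical 30 seconds of disruption per route change, a two-minute attack comes out at 120 seconds of impairment unscrubbed and 180 seconds scrubbed, while a one-hour attack comes out at 3600 and 360 seconds. Under these assumptions, diversion only pays off for sustained events.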
Additional considerations
- Due to the limitations of BGP, IPv4 /24 or IPv6 /48 networks are the smallest subnets that can be routed through such a service; you cannot mitigate individual IP addresses.
- Scrubbing a more specific prefix of an existing route [example, scrub a /24 out of a /16] is less impactful than scrubbing an exact matching subnet already in the DFZ. For example, if you announce 123.1.1.0/24 and need to convert to cloud scrubbing, there is likely to be significant impact at the initiation of a scrubbing event. If you announce 123.1.0.0/16 and need to scrub 123.1.1.0/24, traffic will continue to follow the /16 until the more specific /24 is installed.
- Scrubbing may not be complete due to multihoming. For example, UW System peers with various networks that receive only a default route and peering routes. Let's say UW Madison wants to scrub 128.104.0.0/24. UW System Network peers with networks A, B and C and sends 128.104.0.0/16 to these peers. Networks A, B and C may not receive a full routing table from their transit providers and may not see the 128.104.0.0/24 route. They will continue to follow the 128.104.0.0/16 peering route, bypassing cloud scrubbing.
- Scrubbing may not be complete because the algorithm may fail. Scrubbing solutions are not magic. There is no 'evil' bit to match in unwanted packets. The Radware cloud solution either works or it doesn't. If it doesn't work, UW System WAN engineers are required to contact Radware live for technical support, as we cannot tune the cloud service directly. If this occurs, we may need live, real-time information and involvement from a campus in an attempt to improve the mitigation outcome.
- Your RPKI stance must match your cloud scrubbing desires. Specifically, if you have created ROAs, check whether they are exact or variable length. For example, if you have a ROA for 123.1.0.0/16 exact and want to be able to scrub any /24 in this /16, you will need to change your ROA to allow prefix lengths down to /24. This makes the ROA less secure but is required for cloud scrubbing to function.
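The prefix-size and ROA considerations above can be checked ahead of time with a short script. This is a minimal sketch using Python's standard `ipaddress` module. The function names `scrub_prefix` and `roa_permits` are hypothetical, only the prefix-length part of route origin validation is modeled, and the origin AS comparison and the handling of multiple covering ROAs are left out.

```python
from ipaddress import ip_address, ip_network

# Smallest subnets that can be routed through scrubbing, per the text above.
MIN_PREFIX_LEN = {4: 24, 6: 48}


def scrub_prefix(address):
    """Return the smallest scrubbable subnet containing a single address."""
    ip = ip_address(address)
    return ip_network(f"{ip}/{MIN_PREFIX_LEN[ip.version]}", strict=False)


def roa_permits(roa_prefix, roa_max_length, announced):
    """Check whether one ROA's prefix and maximum length cover a route.

    Simplification: the origin AS is not compared, and only one ROA is
    considered.
    """
    roa = ip_network(roa_prefix)
    route = ip_network(str(announced))
    if roa.version != route.version:
        return False
    return route.subnet_of(roa) and route.prefixlen <= roa_max_length


target = scrub_prefix("123.1.1.77")
print(target)                                    # 123.1.1.0/24
print(roa_permits("123.1.0.0/16", 16, target))   # False: exact /16 ROA
print(roa_permits("123.1.0.0/16", 24, target))   # True: ROA allows down to /24
```

An address inside 123.1.1.0/24 can only be scrubbed as the whole /24, and the scrubbing announcement fits the ROA only once its maximum length allows /24.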
