Jumboframes

An overview of jumboframes, which are ethernet frames that carry payloads typically up to 9,000 bytes in length.

Quick Overview of Ethernet Frames and Usable Payload Size

Ethernet v2 networks are standardized to carry payloads (such as IPv4 packets) up to 1500 bytes in length. The size of these frames on the wire is up to 1518 bytes: a 14 byte ethernet header, the payload, and a 4 byte CRC. For IPv4 with no header options, 1460 of the 1500 payload bytes are usable for TCP data, or 1472 for UDP, as shown in the diagrams below.

802.1q (trunking) adds 4 bytes for a vlan identifier tag, bringing the frame size to 1522 bytes for tagged frames on trunk links. These 4 byte tags are put on and taken off automatically by switches when frames are sent and received on vlan trunk ports. Standard access switchports don't use these vlan tags. Hosts that have trunking enabled to them add and remove these tags themselves.

For the following discussion, it is handy to know what typical ethernet frames look like.

Ethernet frame formats for TCP and UDP on IPv4 and IPv6:


         TCP on IPv4                             UDP on IPv4

-------------------------------       -------------------------------
|  Ethernet Header, 14 bytes  |       |  Ethernet Header, 14 bytes  |
-------------------------------       -------------------------------
|    IPv4 Header, 20 bytes    |       |    IPv4 Header, 20 bytes    |
-------------------------------       -------------------------------
|     TCP Header, 20 bytes    |       |     UDP Header, 8 bytes     |
-------------------------------       -------------------------------
|        TCP Payload          |       |         UDP Payload         |
|             .               |       |               .             |
|             .               |       |               .             |
|             .               |       |               .             |
|      up to 1460 bytes       |       |       up to 1472 bytes      |
-------------------------------       -------------------------------
|    Ethernet CRC, 4 bytes    |       |    Ethernet CRC, 4 bytes    |
-------------------------------       -------------------------------



         TCP on IPv6                             UDP on IPv6

-------------------------------       -------------------------------
|  Ethernet Header, 14 bytes  |       |  Ethernet Header, 14 bytes  |
-------------------------------       -------------------------------
|    IPv6 Header, 40 bytes    |       |    IPv6 Header, 40 bytes    |
-------------------------------       -------------------------------
|     TCP Header, 20 bytes    |       |     UDP Header, 8 bytes     |
-------------------------------       -------------------------------
|        TCP Payload          |       |         UDP Payload         |
|             .               |       |               .             |
|             .               |       |               .             |
|             .               |       |               .             |
|      up to 1440 bytes       |       |       up to 1452 bytes      |
-------------------------------       -------------------------------
|    Ethernet CRC, 4 bytes    |       |    Ethernet CRC, 4 bytes    |
-------------------------------       -------------------------------
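
The payload arithmetic behind these diagrams can be sketched in a few lines of Python. This is only an illustration: the function and constant names are made up for this example, and it assumes no IPv4 options, no IPv6 extension headers, and no TCP options.

```python
# Header sizes in bytes, as used in the frame diagrams.
ETH_HEADER = 14
ETH_CRC = 4
VLAN_TAG = 4                           # 802.1q tag, only on tagged frames
IP_HEADER = {"ipv4": 20, "ipv6": 40}   # assumes no options / extension headers
L4_HEADER = {"tcp": 20, "udp": 8}      # assumes no TCP options


def max_payload(mtu, ip_version, transport):
    """Largest TCP or UDP payload that fits in one packet of the given MTU."""
    return mtu - IP_HEADER[ip_version] - L4_HEADER[transport]


def frame_size(mtu, vlan_tags=0):
    """Size of a full frame, from ethernet header through CRC."""
    return ETH_HEADER + vlan_tags * VLAN_TAG + mtu + ETH_CRC


for mtu in (1500, 9000):
    for ip_version in ("ipv4", "ipv6"):
        for transport in ("tcp", "udp"):
            print(mtu, ip_version, transport,
                  max_payload(mtu, ip_version, transport))
    print(mtu, "untagged:", frame_size(mtu), "tagged:", frame_size(mtu, 1))
```

With an MTU of 1500 this gives 1460 (TCP) and 1472 (UDP) bytes on IPv4, 1440 and 1452 bytes on IPv6, and frame sizes of 1518 bytes untagged or 1522 bytes with one 802.1q tag.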

So, why is 1500 bytes the "standard"?

The standard MTU, or Maximum Transmission Unit, for Ethernet v2 is 1500 bytes, which gives frames of up to 1518 bytes on the wire. This means if a frame is corrupted, at most 1518 bytes need to be resent to recover from the error.

This probably seems quite small in this era of 10 Gigabit links and terabyte hard drives, and it is. However, keep in mind that ethernet has been around a long time. Links historically were a lot slower and much less reliable than they are today (with a major exception for wireless). For example, 1500 bytes takes a little under 1 second (about 0.83 seconds) to send on a 14.4kbit/sec modem.

Plus, there is a lot of baggage. Ethernet's success has in part been attributable to backwards compatibility. You can interconnect 10GE, Gig-E, Fast Ethernet, and the original 10Mbit/sec Ethernet (10Base-T and also 10Base-2 coax). It is important to know that Fast Ethernet (100Mbit/sec) and below do not support jumboframes at all. In addition, many gigabit switches (typically lower-end ones) do not support jumboframes. Switches and bridges that don't understand jumboframes will silently discard them. Smart hubs (do you remember hubs?!!) may count them as jabbers.

These and other interoperability issues have kept the IEEE from officially standardizing higher MTU sizes.

OK, so now that we're a page deep into this, what exactly is a "Jumboframe"?

Simply put, it is an ethernet frame that can carry a larger than 1500 byte payload, usually 9000 bytes in particular.

Since jumboframes are not officially standardized, 9018 bytes (to support a 9000 byte IP packet) is more or less just the commonly agreed upon size for jumboframes. Actual implementations may be a bit bigger. For example, high-end switches and routers can carry frames up to 9216 bytes in order to carry jumboframes with multiple 802.1q tags, MPLS tags, and other layers of encapsulation.

It gets less feasible to send frames much larger than 9000 bytes because the 32 bit ethernet checksum (CRC) only detects errors reliably in frames up to about 12000 bytes.

Support for jumboframes evolved from the advanced networking needs of research and education networks, the Department of Defense, and the Department of Energy.

Performance advantages to using larger frame sizes

In contrast to the ethernets of yesteryear, networks today run much faster and "cleaner", meaning that the need to retransmit a corrupt ethernet frame is quite rare. As you can see from the packet diagrams above, the usable TCP and UDP payloads in standard 1500 byte frames are quite small.

If the frames were larger, they could carry data more efficiently with less overhead. One 9000 byte jumboframe could replace 6 standard 1500 byte frames. This means that each sending and receiving host has 5 fewer packets to receive, 5 fewer sets of IP and TCP headers to process and checksums to check, and 5 fewer payloads to extract and copy into memory. To put this into perspective, a full Gig-E link with 1500 byte packets carries over 80,000 packets per second (pps). A full Gig-E link with 9000 byte jumboframes carries less than 15,000 pps. The use of jumboframes clearly can alleviate the load on a system's CPU.
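
The packet rates quoted above can be reproduced with a rough calculation. The sketch below assumes every frame is full size and also counts the 8 byte preamble and 12 byte inter-frame gap that each frame occupies on the wire; those two values are standard for ethernet but are not discussed elsewhere in this document, and the names are illustrative.

```python
ETH_OVERHEAD = 18        # 14 byte ethernet header + 4 byte CRC
PREAMBLE_AND_GAP = 20    # assumption: 8 byte preamble + 12 byte inter-frame gap


def frames_per_second(link_bps, mtu):
    """Maximum full-size frames per second on a link of the given speed."""
    wire_bytes = mtu + ETH_OVERHEAD + PREAMBLE_AND_GAP
    return link_bps / (wire_bytes * 8)


GIG_E = 1_000_000_000    # bits per second

for mtu in (1500, 9000):
    print(f"MTU {mtu}: about {frames_per_second(GIG_E, mtu):,.0f} frames/sec")
```

This works out to roughly 81,000 frames per second at 1500 bytes and roughly 13,800 at 9000 bytes.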

A typical use of jumboframes is to enhance the performance of storage access across networks, e.g. NFS, SMB, iSCSI and others. For example, a typical NFS block size is 8192 bytes. Jumboframes can allow a block to fit in one packet, so no overhead is needed to chop it up into smaller chunks and reassemble it. For NFSv3, blocks can be up to 32k, and for SMB blocks can be up to 64k in size. These savings can lead to measurable gains for typical workloads. Other uses include video, transfers of large scientific data sets, and other bulk transfers.
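
As a rough illustration of the block size point, the sketch below counts how many TCP/IPv4 packets are needed to carry one storage block. It is a simplification: it ignores the RPC, NFS, or SMB headers that travel with the data, assumes no IP or TCP options, and the names are made up for this example.

```python
from math import ceil

IPV4_TCP_HEADERS = 40    # 20 byte IPv4 header + 20 byte TCP header, no options


def packets_needed(block_bytes, mtu):
    """Number of full-size TCP/IPv4 packets needed to carry one block."""
    payload_per_packet = mtu - IPV4_TCP_HEADERS
    return ceil(block_bytes / payload_per_packet)


for block in (8192, 32 * 1024, 64 * 1024):
    print(f"{block} byte block: {packets_needed(block, 1500)} packets at MTU 1500, "
          f"{packets_needed(block, 9000)} at MTU 9000")
```

Under these assumptions an 8192 byte block takes 6 packets at an MTU of 1500 but only 1 at an MTU of 9000.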

Another use of jumboframes is to enhance TCP throughput across wide areas. TCP's performance is characterized by:

Throughput <= ~0.7 * MSS / (rtt * sqrt(packet_loss))

where MSS is the TCP Maximum Segment Size (payload size), rtt is the round-trip time of the flow, and packet_loss is the packet loss rate. Because rtt grows with distance, it is extremely helpful to support larger TCP segments when moving data over greater distances. Modern operating systems can be tuned (or may even auto-tune) to support large TCP windows. Putting these TCP flows in larger packets gives more tolerance for delay and particularly for packet loss (both of which are out of your control across the Internet at large).
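
To get a feel for the formula, the sketch below plugs in some example numbers. The round-trip time and loss rate are made-up values chosen only for illustration, and the result is an upper bound from the formula, not a measurement.

```python
from math import sqrt


def max_tcp_throughput_bps(mss_bytes, rtt_seconds, packet_loss):
    """Upper bound on TCP throughput in bits/sec, per the formula above."""
    return 0.7 * mss_bytes * 8 / (rtt_seconds * sqrt(packet_loss))


RTT = 0.050      # hypothetical 50 ms round-trip time
LOSS = 0.0001    # hypothetical 0.01% packet loss

for mss in (1460, 8960):   # TCP payload at MTU 1500 and MTU 9000 on IPv4
    mbps = max_tcp_throughput_bps(mss, RTT, LOSS) / 1_000_000
    print(f"MSS {mss}: at most about {mbps:.1f} Mbit/sec")
```

With these example values the bound is about 16 Mbit/sec for a 1460 byte MSS and about 100 Mbit/sec for an 8960 byte MSS. The bound scales linearly with MSS, so the jumboframe flow can go roughly 6 times faster at the same delay and loss.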

Thoughts against using larger frame sizes

For large frame sizes to work across networks with MTU sizes that do not match, ICMP messages are critically important to communicate to sending hosts if their packets are too large to be transmitted. This process is called Path MTU Discovery (PMTUD). In many environments, PMTUD has been catastrophically broken by years of overly paranoid firewall rules mistakenly or maliciously blocking ICMP packets. This brokenness can cause a lot of pain trying to get packets to flow. More recent standards work has been trying to address this issue with a more robust approach called Packetization Layer Path MTU Discovery (RFC 4821), which probes with packets of different sizes instead of relying on ICMP messages.
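
The behavior of classic PMTUD, and what happens when ICMP is filtered, can be modeled with a toy simulation. This is not real network code: the function name, the list of hop MTUs, and the icmp_allowed flag are all invented for this sketch.

```python
def discover_path_mtu(local_mtu, hop_mtus, icmp_allowed=True):
    """Toy model of PMTUD.

    Returns the packet size that makes it through every hop, or None if
    large packets are silently dropped and the sender never finds out.
    """
    size = local_mtu
    while True:
        # Find the first hop that cannot forward a packet of this size.
        bottleneck = next((mtu for mtu in hop_mtus if mtu < size), None)
        if bottleneck is None:
            return size       # the packet fits through every hop
        if not icmp_allowed:
            return None       # dropped, and no ICMP message reaches the sender
        size = bottleneck     # ICMP type 3, code 4 reports the smaller MTU


print(discover_path_mtu(9000, [9000, 9000, 9000]))         # 9000
print(discover_path_mtu(9000, [9000, 1500, 9000]))         # 1500
print(discover_path_mtu(9000, [9000, 1500, 9000], False))  # None
```

In the last case the jumbo-sized packets vanish without any feedback to the sender, which is the situation that filtering ICMP creates.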

As mentioned above, switches that do not support jumboframes will (often silently) drop them. Many routers only support fragmenting IP packets in software and not in hardware, and often fragmentation is disabled or rate limited to save on router CPU resources.

TCP has the ability to negotiate the Maximum Segment Size (MSS) of the TCP flows. This helps to negotiate the correct sizes to use between hosts. UDP does no such negotiation (spray and pray). With jumboframes enabled, a UDP application is never notified when its large datagrams fail to get through, so it must be smart enough to detect and handle this itself. In reality, UDP applications don't do so well with this.
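
The effect of the MSS negotiation can be sketched as follows, assuming IPv4 with no IP or TCP options. Each host announces an MSS derived from its own interface MTU, and the connection ends up using the smaller of the two. The function names are illustrative only.

```python
def advertised_mss(mtu, ip_header=20, tcp_header=20):
    """MSS a host announces in its SYN, derived from its interface MTU."""
    return mtu - ip_header - tcp_header


def effective_mss(mtu_a, mtu_b):
    """Segment size both hosts can use: the smaller of the two announcements."""
    return min(advertised_mss(mtu_a), advertised_mss(mtu_b))


print(effective_mss(9000, 9000))   # both hosts use jumboframes
print(effective_mss(9000, 1500))   # one host is still at the standard MTU
```

Note that this only takes the MTUs of the two end hosts into account. A smaller MTU somewhere in the middle of the path is not visible to the MSS exchange, which is why Path MTU Discovery is still needed.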

The jumboframe backwards compatibility problem and other deployment issues have led to a lot of work elsewhere in PCs to compensate for the performance problems associated with small 1500 byte packets.

Moore's law has caught up and possibly alleviated much of the CPU load typically associated with large network data transfers. Modern CPUs can process many more instructions per second and modern PCIe buses can move much more data. However, many NAS and other custom embedded devices have relatively small CPUs and may still suffer to some extent.

In addition, many advances have been made in optimizing the performance of network interface cards (NICs). Modern NICs can coalesce interrupts, do their own checksum processing, direct packets from the same flow to the same CPU, and sometimes even split data into TCP segments and reassemble them on their own. This frees up CPU resources for other tasks, and quality NICs can probably make quite a bit of difference in virtual machine environments.

Rules for using Jumboframes

  • Identify the specific application use for Jumboframes, and if possible, segment this specialty traffic. One approach would be to use a dedicated subnet / vlan.
  • All hosts on a subnet / vlan must be configured to use the same frame size.
  • All routers on that vlan must also be configured to use the same frame size.
  • All ports on that vlan must be Gigabit ethernet or better, as jumboframes are not supported for slower speeds.
  • All firewalling or packet filtering anywhere along the end to end path, host-based, network-based and otherwise, must not filter ICMP messages, particularly "Destination Unreachable, Datagram Too Big" (type 3, code 4).
  • Other operating system specific tuning measures such as enabling large TCP windows should also be evaluated and enabled on each end system.
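
Some of these rules lend themselves to an automated sanity check. The sketch below runs over a hand-built inventory of the devices on one vlan. The device names, the dictionary fields, and the check_vlan function are all hypothetical; a real check would pull this data from your own inventory or management system.

```python
def check_vlan(devices):
    """Return a list of rule violations for the devices on one vlan."""
    problems = []
    # Rule: every host and router on the vlan must use the same frame size.
    mtus = {device["mtu"] for device in devices}
    if len(mtus) > 1:
        problems.append(f"mixed MTUs on the vlan: {sorted(mtus)}")
    # Rule: jumboframes need Gigabit ethernet or better on every port.
    for device in devices:
        if device["mtu"] > 1500 and device["speed_mbps"] < 1000:
            problems.append(
                f"{device['name']}: jumboframes need Gigabit ethernet or better")
    return problems


# Hypothetical inventory for a storage vlan.
storage_vlan = [
    {"name": "nfs-server", "mtu": 9000, "speed_mbps": 10000},
    {"name": "router-1", "mtu": 9000, "speed_mbps": 1000},
    {"name": "old-client", "mtu": 1500, "speed_mbps": 100},
]

for problem in check_vlan(storage_vlan):
    print(problem)
```

For this example inventory the check reports the mixed MTUs. It would also flag any port slower than Gigabit that had been set to a jumbo MTU.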

Jumboframes on the UW-Madison Campus and off Campus

The UW-Madison campus network is capable of running 9018 byte frames both on gigabit edge ports (if enabled) and across the core. UW System Network also can transit jumboframes across its core to other jumboframe enabled networks.

Various network peers we connect with can support jumboframes such as academic research and education networks, Internet2, national labs and other Department of Energy sites on ESNet, some NASA sites, and other supercomputing centers. Please check to see if the site you need to connect to supports jumboframes end to end.

In the commodity Internet at large, jumboframes are not supported.

To see about enabling jumboframes for your network applications, open a case with the helpdesk to start the dialogue with Network Engineering.

Resources

http://staff.psc.edu/mathis/MTU
http://en.wikipedia.org/wiki/Jumbo_frame
http://en.wikipedia.org/wiki/Maximum_transmission_unit
http://en.wikipedia.org/wiki/Path_MTU_discovery#Problems_with_PMTUD



Keywords: mtu 9000 byte frame jumboframe
Doc ID: 9777
Owner: Dale C.
Group: Network Services
Created: 2009-04-28 19:00 CDT
Updated: 2015-02-09 10:46 CDT
Sites: Network Services, Systems Engineering, University of Wisconsin System Network, WiscNet