jumbo frames | /contrib/famzah

While working on my latest pet project, which involved 10 GigE transfers, I noticed a significant difference between the results shown by “iperf” and “iftop”. A fellow blogger also noticed this discrepancy. To get to the bottom of this, I ran some additional tests with different MTU sizes and observed the output of “iperf”, “iftop”, “iptraf”, and the raw Linux network device counters as seen by “ifconfig”.

The test results are summarized in an online spreadsheet: https://goo.gl/MvJC8K

Some notes about each application:

iperf – according to its documentation, this tool measures TCP performance; it therefore counts the useful payload of a TCP/IP transfer, which is layer 4 in the OSI model
iftop – according to its documentation, this tool counts all IP packets; my tests suggest that it also operates at layer 4, just as “iperf” does: ARP traffic, which is not carried inside IP packets, is not counted at all, and the fact that “iftop” tracks connections and ports points the same way
iptraf – this tool seems to be too old now, and its results were off by a factor of 4 to 5
ifconfig – shows the lowest-level statistics, namely the bytes that passed through the network device as RX or TX; the most trustworthy source of performance data
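
To compare the raw device counters with the rates reported by the other tools, the byte counters have to be turned into a rate first. The following is only a minimal sketch of that arithmetic; the function name is hypothetical, and it assumes that two readings of the same RX or TX byte counter were taken a known number of seconds apart and that the counter did not wrap or reset in between.

```c
#include <stdint.h>

/* Bits per second between two samples of a byte counter
 * (for example the RX or TX bytes value reported by ifconfig).
 * Assumption: the counter did not wrap or reset between the samples. */
double rate_bits_per_second(uint64_t bytes_before, uint64_t bytes_after,
                            double interval_seconds) {
    return (double)(bytes_after - bytes_before) * 8.0 / interval_seconds;
}
```

For example, a counter that grows by 1,250,000,000 bytes in one second corresponds to 10,000,000,000 bits per second, which is 10 Gigabit per second in decimal units.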

We notice that both “iperf” and “iftop” measure the useful payload data that we can transfer per second. Since all OSI layers have some overhead, let’s take a look at what theory says about bandwidth efficiency in Ethernet:

with the standard MTU of 1500 bytes, we get 94.93% efficiency (5.07% overhead)
with a jumbo-frame MTU of 9000 bytes, we get 99.14% efficiency (0.86% overhead)
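
One common way to arrive at these figures is to divide the TCP payload per frame by the bytes the frame occupies on the wire. The sketch below does that under stated assumptions: 38 bytes of per-frame Ethernet overhead (preamble and start-of-frame delimiter, header, frame check sequence, inter-frame gap), IPv4, and TCP without options such as timestamps. The function and macro names are made up for illustration.

```c
#include <stdio.h>

/* Per-frame bytes on the wire that are not part of the MTU:
 * preamble + start-of-frame delimiter (8), Ethernet header (14),
 * frame check sequence (4), inter-frame gap (12). */
#define ETH_WIRE_OVERHEAD 38

/* IPv4 and TCP headers without options (assumption: no TCP timestamps). */
#define IPV4_HEADER 20
#define TCP_HEADER  20

/* Fraction of the raw line rate left for TCP payload at a given MTU. */
double tcp_efficiency(int mtu) {
    double payload = mtu - IPV4_HEADER - TCP_HEADER;
    double on_wire = mtu + ETH_WIRE_OVERHEAD;
    return payload / on_wire;
}

/* Hypothetical helper: print the efficiency for one MTU as a percentage. */
void print_efficiency(int mtu) {
    printf("MTU %d: %.2f%% efficiency\n", mtu, 100.0 * tcp_efficiency(mtu));
}
```

With an MTU of 1500 this gives 1460 / 1538, and with an MTU of 9000 it gives 8960 / 9038, which round to the two percentages listed above. Enabling TCP options would lower both values slightly.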

Those numbers correspond very closely with the results shown by “iperf”.

Only “iftop” differs noticeably. Its source code reveals the reason, and how the displayed results must be interpreted:

/* ui.c */

void ui_print() {
...
    mvaddstr(y, COLS - 8 * HISTORY_DIVISIONS - 8, "rates:");

    draw_totals(&totals);
}

void draw_totals(host_pair_line* totals) {
    for(j = 0; j < HISTORY_DIVISIONS; j++) {
        readable_size((totals->sent[j] + totals->recv[j]) , buf, 10, 1024, options.bandwidth_in_bytes);
...
}

/* ui_common.c */

/*
 * Format a data size in human-readable format
 */
void readable_size(float n, char* buf, int bsize, int ksize, int bytes) {
    float size = 1;
...
    while(1) {
      size *= ksize;
...
        snprintf(buf, bsize, " %4.2f%s", n / size, bytes ? unit_bytes[i] : unit_bits[i]);

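The authors of “iftop” decided to scale the displayed rates in Gibibit (powers of 1024) instead of the more common Gigabit (powers of 1000). The gap between the two units grows with every prefix step, so the values shown by “iftop” drift further from the others as the transfer rate gets higher. At the Giga scale the difference is about 7.4% (1024³ / 1000³ ≈ 1.074).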

Once the “iftop” values are converted from Gibibit to Gigabit, they also match the results reported by “iperf” and the raw Linux network device counters.
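
The conversion itself is a single multiplication. The sketch below is not part of “iftop”; the function names are hypothetical, and it assumes the displayed value has already been read off the screen as a number in Gibibit per second.

```c
/* Ratio between a binary prefix step (1024) and a decimal one (1000),
 * raised to the number of steps: 1 = kilo, 2 = mega, 3 = giga. */
double binary_to_decimal_factor(int steps) {
    double factor = 1.0;
    for (int i = 0; i < steps; i++)
        factor *= 1024.0 / 1000.0;
    return factor;
}

/* Convert a rate displayed in Gibibit/s (powers of 1024)
 * to Gigabit/s (powers of 1000). */
double gibibit_to_gigabit(double gibibit_per_s) {
    return gibibit_per_s * binary_to_decimal_factor(3);
}
```

For the Giga scale the factor is roughly 1.074, so a displayed rate of 9.0 corresponds to a little under 9.67 Gigabit per second.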

“iperf” and “iftop” accuracy