performance | /contrib/famzah

Version 1.03d ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP bifferboard.lo 300M 822 96 5524 61 4305 42 855 99 16576 67 143.2 12 ------Sequential Create------ --------Random Create-------- -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 1274 94 14830 100 1965 85 1235 90 22519 100 2015 87 bifferboard.localdomain,300M,822,96,5524,61,4305,42,855,99,16576,67,143.2,12,16,1274,94,14830,100,1965,85,1235,90,22519,100,2015,87

This article is about measuring network throughput and performance using a Linux box, in a very approximate way.

There are many network performance benchmarks and stress-tests which measure the maximum bandwidth transfer rate of a device – that is how many bytes-per-second or bits-per-second a device can handle. Good examples for such benchmark tests are NetPerf, Iperf, or even multiple Wget downloads which run simultaneously and save the downloaded files to /dev/null, so that we eliminate the hard disk throughput of the local machine.

There is however another very important metric for the throughput of a network device – its packets-per-second (pps) maximum rate.

Today I struggled a lot to find out a tool which measures the packets-per-second throughput of a network device and couldn’t find a suitable one. Therefore, I tried two different methods both using the ICMP echo protocol, in order to approximately measure the packets-per-second metric of a remote network device which is directly attached to my network.

Method #1: The old-school “ping“. Execute the following as root:

ping -q -s 1 -f 192.168.100.101

This floods the destination “192.168.100.101” with the smallest possible PING packets. Here is a sample output:

--- 192.168.100.101 ping statistics --- 20551 packets transmitted, 20551 received, 0% packet loss, time 6732ms

You can calculate the packets-per-second rate by dividing the count of the packets to the elapsed time:

20551 packets / 6.732 secs ~= 3052 packets-per-second

Note that this is the count of the replies at 0% packet loss. Which means that the same count requests were sent too. So the final result is:

3052 x 2 = 6104 packets-per-second

Note that if you run multiple “ping” commands simultaneously, you will get more accurate results. You need to sum the average values from every “ping” command, in order to calculate the overall packets-per-second rate. For example, when I performed the measurement against the very same host but using two or three simultaneous “ping” commands, the average packets-per-second value was settled to 8000.

Method #2: The much faster and extended “hping3“. Execute the following:

taurus:~# time hping3 192.168.100.101 -q -i u20 --icmp|tail -n10

This will bombard the host “192.168.100.101” with ICMP echo requests. After a few seconds, you can interrupt the ping by pressing CTRL+C. Here is the output:

--- 192.168.100.101 hping statistic --- 67932 packets transmitted, 10818 packets received, 85% packet loss round-trip min/avg/max = 0.5/7.8/15.6 ms
real 0m2.635s user 0m0.368s sys 0m1.160s

If the packet loss is 0%, decrease “-i u20” to something smaller like “-i u15” or less. Make sure to not decrease it too much, or you may temporarily lose connectivity to the device (because of switch problems?).

The calculations for the packets-per-second value are the same as for “ping” above. You need to divide the received packets to the time:

10818 packets / 2.635 secs ~= 4105 packets-per-second

Because the calculated packets-per-second represent the rate of successful replies, we need to multiply it by two, in order to get the actual packets-per-second, as the count of the replies is the same as the count of the successfully received requests. The final result is:

4105 x 2 = 8210 packets-per-second

The calculated value by “hping3” is the same as the one by “ping”. We are either correct, or both methods failed… 🙂

Some notes which apply for both methods:

You should run multiple instances of the ping commands, either from the same or from different machines, in order to be able to properly saturate the connection of the machine which is being tested.
The machines from which you test have a hardware rate limit too. If the host being tested has a similar rate limit, you surely need to run the ping commands from more than a single machine.
You should run more and more simultaneous ping commands until you start to receive similar results for the calculated packets-per-second rate. This indicates a saturation of the network bandwidth, usually at the remote host which is being tested, but may also indicate a saturation somewhere in the route between your tester machines and the tested host…
The ethernet switches also have a hardware limit of the bandwidth and the packets-per-second. This may influence the overall results.
If the tested device has any firewall rules, you should temporarily remove them, because they may slow down the packets processing.
If the tested device is a Linux box, you should temporarily remove the ICMP rate limits by executing “sysctl net.ipv4.icmp_ratelimit=0” and “sysctl net.ipv4.icmp_ratemask=0”.
If the tested device is a router, a better way to test its packets-per-second maximum rate is to flood one of its interfaces and count the output rate at the other one, assuming that you are actually flooding a host which is behind the router according to its routing tables.

Thanks to zImage for his usual willingness to help people out by giving good tips and ideas.

/contrib/famzah

Enthusiasm never stops

Tag Archives: performance

Bifferboard performance benchmarks

Benchmark the packets-per-second performance of a network device