
Enthusiasm never stops



posix_spawn() performance benchmarks and usage examples

The glibc library has had an efficient posix_spawn() implementation since version 2.24 (2016-08-05). I have been waiting for this feature for a long time.

TL;DR: posix_spawn() in glibc 2.24+ is really fast. You should replace the old system() and popen() calls with posix_spawn().

Today I ran all benchmarks of the popen_noshell() library, which basically emulates posix_spawn(). Here are the results:

Test                                            Uses pipes  User CPU  System CPU  Total CPU  Slower by
vfork() + exec(), standard Libc                 No               7.4         1.6        9.0
the new noshell, default clone(), compat=1      Yes              7.7         2.1        9.7         8%
the new noshell, default clone(), compat=0      Yes              7.8         2.0        9.9         9%
posix_spawn() + exec() no pipes, standard Libc  No               9.4         2.0       11.5        27%
the new noshell, posix_spawn(), compat=0        Yes              9.6         2.7       12.3        36%
the new noshell, posix_spawn(), compat=1        Yes              9.6         2.7       12.3        37%
fork() + exec(), standard Libc                  No              40.5        43.8       84.3       836%
the new noshell, debug fork(), compat=1         No              41.6        45.2       86.8       863%
the new noshell, debug fork(), compat=0         No              41.6        45.3       86.9       865%
system(), standard Libc                         No              67.3        48.1      115.4      1180%
popen(), standard Libc                          Yes             70.4        47.1      117.5      1204%

The fastest way to run an external program is to call vfork() and immediately exec() after it. This is the best solution if you need neither to capture the output of the command nor to supply any data to its standard input. As you can see, the standard system() call is about 12 times slower in performing the same operation. The good news is that posix_spawn() without pipes is almost as fast as vfork() + exec(). If we don’t care about the 27% slowdown, we can use the standard posix_spawn() interface.
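For reference, here is a minimal sketch of the vfork() + exec() pattern. The helper name run_vfork() is made up for illustration and is not part of popen_noshell or the benchmark; the sketch also assumes the command is looked up in PATH with execvp().

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (searched in PATH) without a shell.
 * Returns the child's exit code, or -1 if it could not be started or
 * did not exit normally. */
int run_vfork(char *const argv[]) {
	int status;
	pid_t pid = vfork();

	if (pid < 0) {
		return -1; /* vfork() itself failed */
	}
	if (pid == 0) {
		/* Child: it borrows the parent's memory until exec, so it must
		 * do nothing here except exec or _exit(). */
		execvp(argv[0], argv);
		_exit(127); /* exec failed; 127 mirrors the shell convention */
	}

	/* Parent: resumes only after the child has called exec or _exit(). */
	if (waitpid(pid, &status, 0) < 0) {
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector such as { "sh", "-c", "exit 3", NULL } should return 3. The child calls _exit() rather than exit() so that it does not flush the parent's stdio buffers.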

It gets more complicated and slower if you want to capture the output or send data to stdin. In such a case you have to duplicate stdin/stdout descriptors, close one of the pipe ends, etc. The popen_noshell.c source code gives a full example of all this work.

We can see that the popen_noshell() library is still the fastest option for running an external process and communicating with it. popen_noshell() is just 8% slower than the ideal result of a simple vfork() + exec().

There is more good news: posix_spawn() is also very efficient! It does lag 36% behind the vfork() + exec() baseline, but it is still roughly 10 times faster than the old-school glibc popen() alternative (12.3 vs. 117.5 total CPU). Using the standard posix_spawn() makes your source code easier to read, bugs get handled by the mainstream glibc library, and you have no external library dependencies.

The replacement of system() using posix_spawn() is rather easy as we can see in the “popen-noshell/performance_tests/fork-performance.c” function posix_spawn_test():

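#include <err.h>
#include <errno.h>
#include <spawn.h>
#include <stdlib.h>
extern char **environ;

/* the same as system() but using posix_spawn(), which is about 10 times faster */
void posix_spawn_test(void) {
	pid_t pid;
	char * const argv[] = { "./tiny2" , NULL };
	/* posix_spawn() returns the error number instead of setting errno */
	int rc = posix_spawn(&pid, "./tiny2", NULL, NULL, argv, environ);

	if (rc != 0) {
		errno = rc;
		err(EXIT_FAILURE, "posix_spawn()");
	}

	parent_waitpid(pid); /* benchmark helper (not shown) that waits for the child */
}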
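The parent_waitpid() helper comes from the benchmark and is not shown above. As a rough, self-contained sketch of the same idea, the function below spawns a command and waits for it with plain waitpid(). The name spawn_and_wait() is hypothetical, and I use posix_spawnp() here (an assumption, not what the benchmark does) so that the command is searched in PATH.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: a shell-less stand-in for system().
 * Returns the child's exit code, or -1 on spawn/wait failure or if the
 * child did not exit normally. */
int spawn_and_wait(char *const argv[]) {
	pid_t pid;
	int status;

	/* posix_spawnp() returns 0 on success and an error number otherwise. */
	if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0) {
		return -1;
	}
	if (waitpid(pid, &status, 0) < 0) {
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that how a failed exec is reported can differ between C libraries: some return an error from the spawn call itself, while others start a child that exits with status 127. Callers that care should handle both.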

If you want to communicate with the external process, you need to perform a few more steps, such as creating pipes. Have a look at the source code of “popen_noshell.c”. If you search for the string “POPEN_NOSHELL_MODE”, you will find two alternative blocks of code: one shows the standard way to start a process and manage pipes in C, and the other performs the same steps using the posix_spawn() family of functions.

Please note that posix_spawn() is a completely different implementation than system() or popen(). If it’s not safe to use the faster way, posix_spawn() may fall back to the slow fork().
