/contrib/famzah

Enthusiasm never stops


11 Comments

A much faster popen() and system() implementation for Linux

This project is now hosted on GitHub: https://github.com/famzah/popen-noshell


Problem definition
As we already discussed it, fork() is slow. What do we do if we want to make many popen() calls and still spend less money on hardware?

The parent process calling the popen() function communicates with the child process by reading its standard output. Therefore, we cannot use vfork() to speed things up, because it doesn’t allow the child process to close its standard output and duplicate the passed file descriptors from the parent to its standard output before exec()’uting the command. A child process created by vfork() can only call exec() right away, nothing more.

If we used threads to re-implement popen(), because the creation of a thread is very light-weight, we couldn’t then use exec(), because invoking exec() from a thread terminates the execution of all other threads, including the parent one.

Problem resolution
We need a fork mechanism which is similar to threads and vfork() but still allows us to execute commands other than just exec().

The system call clone() comes to the rescue. Using clone() we create a child process which has the following features:

  • The child runs in the same memory space as the parent. This means that no memory structures are copied when the child process is created. As a result of this, any change to any non-stack variable made by the child is visible by the parent process. This is similar to threads, and therefore completely different from fork(), and also very dangerous – we don’t want the child to mess up the parent.
  • The child starts from an entry function which is being called right after the child was created. This is like threads, and unlike fork().
  • The child has a separate stack space which is similar to threads and fork(), but entirely different to vfork().
  • The most important: This thread-like child process can call exec().

In a nutshell, by calling clone in the following way, we create a child process which is very similar to a thread but still can call exec():

pid = clone(fn, stack_aligned, CLONE_VM | SIGCHLD, arg);

The child starts at the function fn(arg). We have allocated some memory for the stack which must be aligned. There are some important notes (valid at the time being) which I learned by reading the source of libc and the Linux kernel:

  • On all supported Linux platforms the stack grows down, except for HP-PARISC. You can grep the kernel source for “STACK_GROWSUP”, in order to get this information.
  • On all supported platforms by GNU libc, the stack is aligned to 16 bytes, except for the SuperH platform which is aligned to 8 bytes. You can grep the glibc source for “STACK_ALIGN”, in order to get this information.

Note that this trick is tested only on Linux. I failed to make it work on FreeBSD.

Usage
Once we have this child process created, we carefully watch not to touch any global variables of the parent process, do some file descriptor magic, in order to be able to bind the standard output of the child process to a file descriptor at the parent, and execute the given command with its arguments.

You will find detailed examples and use-cases in the source code. A very simplified example follows with no error checks:

fp = popen_noshell("ls", (const char * const *)argv, "r", &pclose_arg, 0);
while (fgets(buf, sizeof(buf)-1, fp)) {
    printf("Got line: %s", buf);
}
status = pclose_noshell(&pclose_arg);

There is a more compatible version of popen_noshell() which accepts the command and its arguments as one whole string, but its usage is discouraged, because it tries to very naively emulate simple shell arguments interpretation.

Benchmark results
I’ve done several tests on how fast is popen_noshell() compared to popen() and even a bare fork()+exec(). All the results are similar and therefore I’m publishing only one of the benchmark results:
Tested functions on Linux - popen_noshell(), fork(), vfork(), popen(), system()


Here are the resources which you can download:

I will appreciate any comments on the library.

Advertisements