You are doing a lot of data operations on your RRD files (create, update, fetch, last), and every update is done by a separate Perl process which lives a very short time – the process is launched, it updates or reads the data, does something else, and then exits.
If you are using RRDtool and Perl as described, you have surely noticed that running many of these processes wastes a lot of CPU resources. The question is: can we optimize this and reduce the cost of loading the RRDs library into Perl? We know that launching Perl frequently is itself quite expensive, but after all, if we chose to work with Perl, this is a price we should be ready to pay.
The RRDtool shared library is a monolithic piece of code which provides ALL functions of the RRDtool suite: data manipulation, graphics, and import/export tools. The last two components pull in a huge number of other shared libraries. The library from RRDtool version 1.4.4 depends on 34 other libraries on my Linux box! This must add to the time it takes to load the RRDtool library into Perl.
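To check the dependency count on your own machine, something like the following sketch may help. It assumes the standard Linux `ldd` tool is available; the function name `count_deps` is made up for this example, and the number you get depends on your distribution and on how RRDtool was built.

```shell
# count_deps FILE: print how many shared objects ldd reports for FILE.
# ldd prints one line per object, including the dynamic loader and the
# kernel-provided object (linux-gate / linux-vdso), and it follows
# dependencies of dependencies as well.
count_deps() {
    ldd "$1" | wc -l | tr -d ' '
}
```

You would call it with the path of your RRDtool shared library, for example `count_deps /usr/lib/librrd.so.4` (this path is an assumption; adjust it to wherever the library lives on your system).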
Resolution and benchmarks
In order to prove my theory (actually, it was more a theory of zImage, and I just followed, enhanced and tried it), I commented out the implementation of the “graphics” and “import/export tools” modules from the source code of RRDtool. Then I re-compiled the library and did some performance benchmarks. I also re-implemented the RRDs.pm module by replacing the DynaLoader module with the XSLoader one. This made no difference in performance whatsoever. The re-compiled RRD library depends on only 4 other libraries – linux-gate.so.1, libm.so.6, libc.so.6, and /lib/ld-linux.so.2. I think this is the most we can cut down.
So here are the benchmark results. They show the accumulated time for 1000 invocations of the Perl interpreter with three different configurations:
- Only Perl (baseline): 5.454s.
- With RRDs, no graphics or import/export functions: 9.744s (+4.290s) +78%.
- With standard RRDs: 11.647s (+6.192s) +113%.
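As you can see, stripping the graphics and import/export code cuts the startup overhead of RRDs from +113% to +78% of the Perl baseline, a drop of 35 percentage points. In absolute terms, 1000 invocations of Perl + RRDs take 9.744s instead of 11.647s, which is about 16% less time. Looking at RRDs alone, loading the standard library takes 44% longer than loading the stripped-down one (6.192s versus 4.290s).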
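The percentages depend on what you divide by, so it is worth recomputing them from the raw timings. Here is a small sketch using `awk`; the function names `overhead_pct` and `saved_pct` are made up for this example, and the three numbers are copied from the results listed above.

```shell
# Times in seconds for 1000 invocations, copied from the benchmark results.
base=5.454      # Perl only
slim=9.744      # Perl + RRDs without graphics or import/export
full=11.647     # Perl + standard RRDs

# overhead_pct BASE TOTAL: extra time over BASE, as a percentage of BASE.
overhead_pct() {
    LC_ALL=C awk -v b="$1" -v t="$2" 'BEGIN { printf "%.1f\n", (t - b) / b * 100 }'
}

# saved_pct OLD NEW: how much less time NEW takes than OLD, as a percentage of OLD.
saved_pct() {
    LC_ALL=C awk -v o="$1" -v n="$2" 'BEGIN { printf "%.1f\n", (o - n) / o * 100 }'
}

echo "slim overhead vs. baseline: $(overhead_pct "$base" "$slim")%"
echo "full overhead vs. baseline: $(overhead_pct "$base" "$full")%"
echo "total startup time saved:   $(saved_pct "$full" "$slim")%"
```

The first two lines measure the cost of RRDs relative to plain Perl, while the last one measures how much total startup time the stripped-down library saves compared to the standard one.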
Here are the commands I used for the benchmarks:
- Only Perl (baseline): time ( i=1000 ; while [ "$i" -gt 0 ]; do perl -Mwarnings -Mstrict -e '' ; i=$(($i-1)); done )
- Perl + RRDs: time ( i=1000 ; while [ "$i" -gt 0 ]; do perl -Mwarnings -Mstrict -MRRDs -e '' ; i=$(($i-1)); done )
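If you want to repeat the measurement with other modules or a different iteration count, the loop can be wrapped in a small helper. This is only a sketch; `run_n` is a name made up for this example, not something RRDtool or Perl provides.

```shell
# run_n COUNT CMD [ARGS...]: run CMD with ARGS COUNT times in a row.
# Stops early and returns the failing status if CMD fails.
run_n() {
    _n=$1
    shift
    while [ "$_n" -gt 0 ]; do
        "$@" || return
        _n=$((_n - 1))
    done
}
```

In bash, where `time` is a shell keyword and can time a function call, the two benchmarks become `time run_n 1000 perl -Mwarnings -Mstrict -e ''` and `time run_n 1000 perl -Mwarnings -Mstrict -MRRDs -e ''`.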