C++ vs. Python vs. PHP vs. Java vs. Others performance benchmark (2016 Q3)

The benchmarks here do not try to be complete; they show the performance of the languages in one aspect only: loops, dynamic arrays with numbers, and basic math operations.
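To make the workload concrete, here is a minimal sketch in Java of the kind of code being measured. This is a hypothetical example in the same spirit (loops, a dynamic array of numbers, basic math), not the actual benchmark source, which is linked in the results table and at the end of the article.

import java.util.ArrayList;

public class WorkloadSketch {
    // Hypothetical workload in the spirit described above: loops which
    // grow a dynamic array of numbers using basic math operations.
    static ArrayList<Integer> getPrimes(int limit) {
        ArrayList<Integer> primes = new ArrayList<>(); // dynamic array
        for (int n = 2; n <= limit; n++) {
            boolean isPrime = true;
            for (int p : primes) {
                if (p * p > n) break;                       // multiplication
                if (n % p == 0) { isPrime = false; break; } // and modulo
            }
            if (isPrime) primes.add(n);
        }
        return primes;
    }

    public static void main(String[] args) {
        // there are 9592 primes below 100000
        System.out.println(getPrimes(100_000).size());
    }
}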

This is an improved redo of the tests done in previous years. You are strongly encouraged to read the additional information about the tests in the article.

Here are the benchmark results:

All CPU times are in seconds. "Slower than previous" compares each entry with the one directly above it.

Language                 |   User | System |  Total | Slower than C++ | Slower than previous | Language version | Source code
-------------------------|--------|--------|--------|-----------------|----------------------|------------------|------------
C++ (optimized with -O2) |  0.899 |  0.053 |  0.951 |               – |                    – | g++ 6.1.1        | link
Rust                     |  0.898 |  0.129 |  1.026 |              7% |                   7% | 1.12.0           | link
Java 8 (non-std lib)     |  1.090 |  0.006 |  1.096 |             15% |                   6% | 1.8.0_102        | link
Python 2.7 + PyPy        |  1.376 |  0.120 |  1.496 |             57% |                  36% | PyPy 5.4.1       | link
C# .NET Core Linux       |  1.583 |  0.112 |  1.695 |             78% |                  13% | 1.0.0-preview2   | link
Javascript (nodejs)      |  1.371 |  0.466 |  1.837 |             93% |                   8% | 4.3.1            | link
Go                       |  2.622 |  0.083 |  2.705 |            184% |                  47% | 1.7.1            | link
C++ (not optimized)      |  2.921 |  0.054 |  2.975 |            212% |                   9% | g++ 6.1.1        | link
PHP 7.0                  |  6.447 |  0.178 |  6.624 |            596% |                 122% | 7.0.11           | link
Java 8 (see notes)       | 12.064 |  0.080 | 12.144 |           1176% |                  83% | 1.8.0_102        | link
Ruby                     | 12.742 |  0.230 | 12.972 |           1263% |                   6% | 2.3.1            | link
Python 3.5               | 17.950 |  0.126 | 18.077 |           1800% |                  39% | 3.5.2            | link
Perl                     | 25.054 |  0.014 | 25.068 |           2535% |                  38% | 5.24.1           | link
Python 2.7               | 25.219 |  0.114 | 25.333 |           2562% |                   1% | 2.7.12           | link

The big difference this time is a slightly modified benchmark method. Programs are no longer limited to just 10 loops; instead, they run for 90 wall-clock seconds, and we then divide and normalize their performance as if they had run for only 10 loops, so that we can compare with the previous results. The benefit of testing this way is that the startup and shutdown times of the interpreters should now make almost no difference. It turned out that the new method doesn't significantly change the outcome compared to the previous benchmark runs, which is reassuring, as it suggests the old way of benchmarking was also sound.
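For clarity, here is a sketch in Java of the timing method just described. It illustrates the idea only and is not the actual benchmark harness (the real batch script is in the GitHub repository linked below); runWorkloadOnce() is a hypothetical stand-in for one benchmark loop, and the real tests measure whole-process user+system CPU time externally rather than per-thread time.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class NormalizedBench {
    // Hypothetical stand-in for one loop of the benchmarked workload.
    static void runWorkloadOnce() {
        double x = 0;
        for (int i = 1; i < 1_000_000; i++) x += Math.sqrt(i);
        if (x < 0) System.out.println(x); // keep the JIT from eliding the loop
    }

    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + 90_000; // 90 wall-clock seconds
        long cpuStart = bean.getCurrentThreadCpuTime();      // nanoseconds
        long loops = 0;
        while (System.currentTimeMillis() < deadline) {
            runWorkloadOnce();
            loops++;
        }
        double cpuSeconds = (bean.getCurrentThreadCpuTime() - cpuStart) / 1e9;
        // Normalize as if only 10 loops were run, to stay comparable
        // with the previous years' 10-loop results.
        System.out.printf("normalized CPU time for 10 loops: %.3f s%n",
                cpuSeconds * 10.0 / loops);
    }
}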

For the curious readers, the raw results also show the maximum used memory (RSS).

Brief analysis of the results:

  • Rust, which we benchmark for the first time, is very fast. 🙂
  • C# .NET Core on Linux, which we also benchmark for the first time, performs very well, being as fast as NodeJS and only 78% slower than C++. Its memory usage peaked at 230 MB, which is the same as Python 3.5 and PHP 7.0, and half that of Java 8 and NodeJS.
  • NodeJS version 4.3.x initially appeared much slower than the previous major version 4.2.x, which was the only surprise. This turned out to be a minor glitch in the parser which was easy to fix; NodeJS 4.3.x actually performs the same as 4.2.x.
  • Python and Perl seem a bit slower than before, but this is probably because C++ performed even better under the new benchmark method.
  • Java 8 didn't perform much faster, as we had expected it to. Perhaps it gets slower as more and more loops are done, which also allocates more RAM.
  • Also review the analysis in the old 2016 tests for more information.

The tests were run on a Debian Linux 64-bit machine.

You can download the source codes, raw results, and the benchmark batch script at:
https://github.com/famzah/langs-performance

Update @ 2016-10-15: Added the Rust implementation. The minor versions of some languages were updated as well.
Update @ 2016-10-19: A redo which includes the NodeJS fix.
Update @ 2016-11-04: Added the C# .NET Core implementation.



C++ vs. Python vs. Perl vs. PHP performance benchmark (2016)

There are newer benchmarks: C++ vs. Python vs. PHP vs. Java vs. Others performance benchmark (2016 Q3)

The benchmarks here do not try to be complete; they show the performance of the languages in one aspect only: loops, dynamic arrays with numbers, and basic math operations.

This is a redo of the tests done in previous years. You are strongly encouraged to read the additional information about the tests in the article.

Here are the benchmark results:

All CPU times are in seconds. "Slower than previous" compares each entry with the one directly above it.

Language                 |   User | System |  Total | Slower than C++ | Slower than previous | Language version | Source code
-------------------------|--------|--------|--------|-----------------|----------------------|------------------|------------
C++ (optimized with -O2) |  0.952 |  0.172 |  1.124 |               – |                    – | g++ 5.3.1        | link
Java 8 (non-std lib)     |  1.332 |  0.096 |  1.428 |             27% |                  27% | 1.8.0_72         | link
Python 2.7 + PyPy        |  1.560 |  0.160 |  1.720 |             53% |                  20% | PyPy 4.0.1       | link
Javascript (nodejs)      |  1.524 |  0.516 |  2.040 |             81% |                  19% | 4.2.6            | link
C++ (not optimized)      |  2.988 |  0.168 |  3.156 |            181% |                  55% | g++ 5.3.1        | link
PHP 7.0                  |  6.524 |  0.184 |  6.708 |            497% |                 113% | 7.0.2            | link
Java 8                   | 14.616 |  0.908 | 15.524 |           1281% |                 131% | 1.8.0_72         | link
Python 3.5               | 18.656 |  0.348 | 19.004 |           1591% |                  22% | 3.5.1            | link
Python 2.7               | 20.776 |  0.336 | 21.112 |           1778% |                  11% | 2.7.11           | link
Perl                     | 25.044 |  0.236 | 25.280 |           2149% |                  20% | 5.22.1           | link
PHP 5.6                  | 66.444 |  2.340 | 68.784 |           6020% |                 172% | 5.6.17           | link

The clear winner among the scripting languages is… PHP 7. 🙂

Yes, that's not a mistake. Apparently the PHP team did a great job! The rumor that PHP 7 is really fast is confirmed by this particular benchmark test. You can also review the PHP 7 infographic by the Zend Performance Team.

Brief analysis of the results:

  • NodeJS got almost 2x faster.
  • Java 8 seems almost 2x slower.
  • Python shows no significant change in performance. Every new release is a little bit faster, but overall Python is steadily 15x slower than C++.
  • Perl follows the same trend as Python and is steadily 22x slower than C++.
  • PHP 5.x is the slowest, with results between 47x and 60x behind C++.
  • PHP 7 is the big surprise. It is about 10x faster than PHP 5.x, and about 3x faster than Python, the next-fastest scripting language.

The tests were run on a Debian Linux 64-bit machine.

You can download the source codes, an Excel results sheet, and the benchmark batch script at:
https://github.com/famzah/langs-performance




Google App Engine Datastore benchmark

I admire the idea of Google App Engine: a platform as a service where there is "no worrying about DBAs, servers, sharding, and load balancers", and where you can "auto scale to 7 billion requests per day". I wanted to try the App Engine for a pet project where I had to collect, process and query a huge amount of time-series data. The fact that I needed fast queries over tens of thousands of records, however, made me wonder if the App Engine Datastore would be fast enough. Note that in order to reduce the number of entities fetched from the database, multiple data entries are consolidated into a single database entity. This, however, imposes another limitation: fetching big data entities uses more memory on the running instance.

My language of choice is Java, because its performance for such computations is great. I am using the Objectify interface (version 4.0rc2), which is one of the recommended APIs for the Datastore.
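To illustrate the consolidation mentioned above, here is a hypothetical Objectify entity which packs a whole window of time-series samples into a single Datastore entity. The class and field names are my own invention for this sketch, not taken from the project:

import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;

// Hypothetical consolidated entity: one Datastore entity holds many
// time-series samples instead of one entity per sample. Fetching fewer,
// bigger entities is cheaper, but deserializing them uses more RAM.
@Entity
public class SeriesChunk {
    @Id Long id;          // e.g. the timestamp of the first sample
    long startTimestamp;  // beginning of the consolidated window
    long[] values;        // many consolidated samples in one entity
}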

Unfortunately, my tests show that the App Engine is not suitable for querying such amounts of data. For example, fetching and updating 1000 entries takes 1.5 seconds and additionally uses a lot of memory on the F1 instance. You can review the Excel sheet file below for more detailed results.

Basically each benchmark test performs the following operations and then exits:

  1. Adds a bunch of entries.
  2. Gets those entries from the database and verifies them.
  3. Updates those entries in the database.
  4. Gets the entries again from the database and verifies them.
  5. Deletes the entries.

All Datastore operations are performed in batches and thus in an asynchronous, parallel way, as sketched below. Furthermore, no indexes are used; the entities are referenced directly by their key, which is the most efficient way to query the Datastore. The tests were performed on two separate days because I wanted to extend some of the tests; this is indicated in the results. A single warmup request was made before the benchmarks so that the App Engine could pre-load our application.
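In Objectify terms, the access pattern looks roughly like this. It is a simplified sketch using the hypothetical SeriesChunk entity from above; the real sources are available at the download link at the end:

import static com.googlecode.objectify.ObjectifyService.ofy;

import com.googlecode.objectify.Key;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DatastoreBenchSketch {
    // Simplified sketch of the benchmark steps: batch save, batch get
    // directly by key (no index queries), batch update, batch delete.
    // Assumes SeriesChunk was registered via ObjectifyService.register().
    static void runOnce(List<SeriesChunk> entries) {
        // 1. Add a bunch of entries in one asynchronous batch.
        Map<Key<SeriesChunk>, SeriesChunk> saved =
                ofy().save().entities(entries).now();

        // 2. Get them back directly by their keys and verify.
        List<Key<SeriesChunk>> keys = new ArrayList<>(saved.keySet());
        Map<Key<SeriesChunk>, SeriesChunk> fetched = ofy().load().keys(keys);

        // 3. + 4. Update them in one batch, then re-fetch and verify.
        ofy().save().entities(fetched.values()).now();

        // 5. Delete everything in one batch, again only by key.
        ofy().delete().keys(keys).now();
    }
}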

The first observation is that on the default F1 instance, once we start fetching more than 100 entities, or once we start to add/update/delete more than 1000 entities, we saturate the Datastore -> Objectify -> Java throughput and don't scale any more:
[Chart: App Engine Datastore median time per entity for 1 KB entities @ F1 instance]

The other interesting observation is that the Datastore -> Objectify -> Java throughput depends a lot on the App Engine instance class. That's not surprising, because the application needs to serialize data back and forth when communicating with the Datastore, and this requires CPU power. The following two charts show that more CPU power speeds up all operations where serializing is involved, that is, all Datastore operations except Delete, which only references the Datastore by supplying the keys of the entities and transfers no data:
[Chart: App Engine Datastore times per entity for 1000 x 1 KB entities @ F1 instance]

[Chart: App Engine Datastore times per entity for 1000 x 1 KB entities @ F4 instance]

Unexpectedly, the App Engine and the Datastore still have good and bad days: their latency, as well as their CPU accounting, can fluctuate a lot. The following chart shows the benchmark results we got using an F1 instance on another day. If you compare it to the chart above, where a much more expensive F4 instance was used, you'll notice that the four-times-cheaper F1 instance performed almost as fast as the F4 instance:
[Chart: App Engine Datastore times per entity for 1000 x 1 KB entities @ F1 instance (test on another day)]

The source code and the raw results are available for download at http://www.famzah.net/download/gae-datastore-performance/