Iostream performance in gcc 3.4
Posted 2004-07-21 12:00. Last updated 2004-07-23 12:00. Tagged c++, hack, iostream, performance.
I did a small performance test of gcc 3.4 compared to earlier versions. I think it is safe to say that standard c++ iostreams performance is greatly improved in gcc 3.4.
Update 2004-07-23: Implemented the tests in c as well, as a comparison.
The release notes for gcc 3.4 mention better performance of the standard c++ library, specifically iostream (streambufs and locale). So I thought I’d do a little performance test and see for myself. Yes, the code from 3.4 is a lot faster.

The test
I used gcc versions 2.96 (yes, the infamous RedHat version), 3.2.2, 3.3.1, and 3.4.1, as provided on the Linux system at KTH.
Back when gcc 3.x was new, there were a lot of complaints that the standard IO was a lot slower than in 2.x, to which the general response was "Yes, it’s slower, but it is correct". By now, it is both correct (even more so than before) and faster than in 2.x!
I did three sample tasks. Each task was run ten times with 1 000 000 pseudo-random integers as input. The tasks were:
- sum
Reads integers using the extraction operator, adding each to a sum.
- median
Reads integers using the extraction operator, appending each to a `vector` of integers, then sorts the vector and looks in the middle to find the median. Templated to work with any type that can be extracted and compared, but used only with `int`.
- wc
Reads characters using the `get(char)` method, looking at each character to count the characters, words, and lines. Similar to the standard unix `wc` command.
Details
Here’s the source of my test program, if you want it: perfcc.cc contains the actual tests, perfc.c is the c version, and gendata.cc is what I used to generate input.
And here are my exact measurements:
perfc296 -s -i 10 5,50s user 0,06s system 100% cpu 5,558 total
perfcc296 -s -i 10 9,26s user 0,12s system 100% cpu 9,375 total
perfc322 -s -i 10 5,51s user 0,09s system 100% cpu 5,600 total
perfcc322 -s -i 10 35,23s user 0,08s system 99% cpu 35,329 total
perfc331 -s -i 10 5,23s user 0,06s system 100% cpu 5,280 total
perfcc331 -s -i 10 32,26s user 0,08s system 100% cpu 32,334 total
perfc341 -s -i 10 5,39s user 0,03s system 100% cpu 5,412 total
perfcc341 -s -i 10 5,12s user 0,04s system 100% cpu 5,153 total

perfc296 -m -i 10 10,47s user 0,22s system 100% cpu 10,641 total
perfcc296 -m -i 10 10,74s user 0,15s system 99% cpu 10,906 total
perfc322 -m -i 10 10,53s user 0,26s system 100% cpu 10,743 total
perfcc322 -m -i 10 36,82s user 0,23s system 99% cpu 37,083 total
perfc331 -m -i 10 10,15s user 0,26s system 100% cpu 10,361 total
perfcc331 -m -i 10 34,33s user 0,30s system 99% cpu 34,659 total
perfc341 -m -i 10 10,17s user 0,25s system 100% cpu 10,395 total
perfcc341 -m -i 10 6,34s user 0,27s system 100% cpu 6,603 total

perfc296 -w -i 10 10,04s user 0,04s system 100% cpu 10,072 total
perfcc296 -w -i 10 10,90s user 0,07s system 100% cpu 10,965 total
perfc322 -w -i 10 10,00s user 0,05s system 100% cpu 10,040 total
perfcc322 -w -i 10 2,81s user 0,04s system 100% cpu 2,847 total
perfc331 -w -i 10 9,64s user 0,03s system 100% cpu 9,641 total
perfcc331 -w -i 10 2,82s user 0,04s system 100% cpu 2,857 total
perfc341 -w -i 10 9,42s user 0,08s system 100% cpu 9,474 total
perfcc341 -w -i 10 2,63s user 0,06s system 100% cpu 2,686 total
Since the times for the c version are rather constant, I only included one of them (gcc 3.4.1) in the diagram above.