Iostream performance in gcc 3.4
Posted 2004-07-21 12:00. Last updated 2004-07-23 12:00. Tagged c++, hack, iostream, performance.
I did a small performance test of gcc 3.4 compared to earlier versions. I think it is safe to say that standard c++ iostreams performance is greatly improved in gcc 3.4.
Update 2004-07-23: Implemented the tests in c as well, as a comparison.
The release notes for gcc 3.4 mention better performance of the standard c++ library, specifically iostream (streambufs and locale). So I thought I’d do a little performance test and see for myself. Yes, the code from 3.4 is a lot faster.
I used gcc versions 2.96 (yes, the infamous RedHat version),
3.2.2, 3.3.1, and 3.4.1, as provided on the
Linux system at KTH.
Back when gcc 3.x was new, there were a lot of complaints that the
standard IO was a lot slower than in 2.x, to which the general
reply was: Yes, it’s slower, but it is correct. By now, it
is both correct (even more so now) and faster than in 2.x!
I did three sample tasks. Each task was run ten times with 1 000 000 pseudo-random integers as input. The tasks were:
Reads integers using the extraction operator, adding each to a sum.
Reads integers using the extraction operator, appending each to a `vector` of integers, then sorts the vector and looks in the middle to find the median. Templated to work with any type that can be extracted and compared, but used only with `int`.
Reads characters using the `get(char)` method, looking at each character to count the characters, words, and lines. Similar to the standard unix `wc` command.
Here’s the source of my test program, if you want it:
perfcc.cc is the actual tests,
the c version, and
gendata.cc is what I used to generate the input data.
And here are my exact measurements:
perfc296 -s -i 10 5,50s user 0,06s system 100% cpu 5,558 total
perfcc296 -s -i 10 9,26s user 0,12s system 100% cpu 9,375 total
perfc322 -s -i 10 5,51s user 0,09s system 100% cpu 5,600 total
perfcc322 -s -i 10 35,23s user 0,08s system 99% cpu 35,329 total
perfc331 -s -i 10 5,23s user 0,06s system 100% cpu 5,280 total
perfcc331 -s -i 10 32,26s user 0,08s system 100% cpu 32,334 total
perfc341 -s -i 10 5,39s user 0,03s system 100% cpu 5,412 total
perfcc341 -s -i 10 5,12s user 0,04s system 100% cpu 5,153 total

perfc296 -m -i 10 10,47s user 0,22s system 100% cpu 10,641 total
perfcc296 -m -i 10 10,74s user 0,15s system 99% cpu 10,906 total
perfc322 -m -i 10 10,53s user 0,26s system 100% cpu 10,743 total
perfcc322 -m -i 10 36,82s user 0,23s system 99% cpu 37,083 total
perfc331 -m -i 10 10,15s user 0,26s system 100% cpu 10,361 total
perfcc331 -m -i 10 34,33s user 0,30s system 99% cpu 34,659 total
perfc341 -m -i 10 10,17s user 0,25s system 100% cpu 10,395 total
perfcc341 -m -i 10 6,34s user 0,27s system 100% cpu 6,603 total

perfc296 -w -i 10 10,04s user 0,04s system 100% cpu 10,072 total
perfcc296 -w -i 10 10,90s user 0,07s system 100% cpu 10,965 total
perfc322 -w -i 10 10,00s user 0,05s system 100% cpu 10,040 total
perfcc322 -w -i 10 2,81s user 0,04s system 100% cpu 2,847 total
perfc331 -w -i 10 9,64s user 0,03s system 100% cpu 9,641 total
perfcc331 -w -i 10 2,82s user 0,04s system 100% cpu 2,857 total
perfc341 -w -i 10 9,42s user 0,08s system 100% cpu 9,474 total
perfcc341 -w -i 10 2,63s user 0,06s system 100% cpu 2,686 total
Since the times for the c version are rather constant, I only included one of them (gcc 3.4.1) in the diagram above.
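To put a number on "a lot faster", the user times can be turned into speedup ratios. The sketch below does that for the c++ binaries built with gcc 3.2.2 and 3.4.1. The figures are copied from the measurements above (decimal comma written as a point). The `Row` struct, the `speedup` helper and the task labels are made up for this illustration, and mapping `-s`, `-m` and `-w` to the sum, median and wc tasks is an assumption based on the flag letters and the order of the task list.

```cpp
// User time in seconds for ten iterations of one task (c++ version).
struct Row {
    const char* task;      // label assumed from the -s / -m / -w flags
    double gcc322_user_s;  // perfcc322
    double gcc341_user_s;  // perfcc341
};

const Row rows[] = {
    {"sum",    35.23, 5.12},
    {"median", 36.82, 6.34},
    {"wc",      2.81, 2.63},
};

// How many times faster the gcc 3.4.1 build ran compared to gcc 3.2.2.
double speedup(const Row& r) {
    return r.gcc322_user_s / r.gcc341_user_s;
}
```

With these figures the ratios come out at roughly 6.9 for the sum task, 5.8 for the median task and 1.07 for the wc task. So the gain is in the formatted extraction path, while character-by-character `get` was already fast in 3.2.2.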