Iostream performance in gcc 3.4


I ran a small performance test comparing gcc 3.4 with earlier versions. I think it is safe to say that standard C++ iostream performance is greatly improved in gcc 3.4.

Update: I implemented the tests in C as well, for comparison.

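The release notes for gcc 3.4 mention better performance of the standard C++ library, specifically iostream (streambufs and locale). So I thought I'd do a little performance test and see for myself. Yes, the code from 3.4 is a lot faster: in the measurements below, the two integer-reading tests run roughly six to seven times faster with 3.4.1 than with 3.2.2.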

Time values (seconds) for different tasks of the same C++ program (and a C version) compiled with different versions of gcc.

The test

I used gcc versions 2.96 (yes, the infamous Red Hat version), 3.2.2, 3.3.1, and 3.4.1, as provided on the Linux system at KTH. Back when gcc 3.x was new, there were a lot of complaints that the standard IO was much slower than in 2.x, to which the general response was "Yes, it's slower, but it is correct." By now, it is both correct (even more so) and faster than in 2.x!

I ran three sample tasks. Each task was run ten times with 1,000,000 pseudorandom integers as input. The tasks were:

sum
  • Reads integers using the extraction operator, adding each to a sum.

median
  • Reads integers using the extraction operator, appending each to a vector of integers, then sorts the vector and looks in the middle to find the median. Templated to work with any type that can be extracted and compared, but used only with int.

wc
  • Reads characters using the get(char) method, looking at each character to count the characters, words, and lines. Similar to the standard Unix wc command.

Details

Here's the source of my test program, if you want it: perfcc.cc contains the actual C++ tests, perfc.c is the C version, and gendata.cc is what I used to generate input.

And here are my exact measurements:

    perfc296  -s -i 10   5,50s user 0,06s system 100% cpu  5,558 total
    perfcc296 -s -i 10   9,26s user 0,12s system 100% cpu  9,375 total
    perfc322  -s -i 10   5,51s user 0,09s system 100% cpu  5,600 total
    perfcc322 -s -i 10  35,23s user 0,08s system  99% cpu 35,329 total
    perfc331  -s -i 10   5,23s user 0,06s system 100% cpu  5,280 total
    perfcc331 -s -i 10  32,26s user 0,08s system 100% cpu 32,334 total
    perfc341  -s -i 10   5,39s user 0,03s system 100% cpu  5,412 total
    perfcc341 -s -i 10   5,12s user 0,04s system 100% cpu  5,153 total
    
    perfc296  -m -i 10  10,47s user 0,22s system 100% cpu 10,641 total
    perfcc296 -m -i 10  10,74s user 0,15s system  99% cpu 10,906 total
    perfc322  -m -i 10  10,53s user 0,26s system 100% cpu 10,743 total
    perfcc322 -m -i 10  36,82s user 0,23s system  99% cpu 37,083 total
    perfc331  -m -i 10  10,15s user 0,26s system 100% cpu 10,361 total
    perfcc331 -m -i 10  34,33s user 0,30s system  99% cpu 34,659 total
    perfc341  -m -i 10  10,17s user 0,25s system 100% cpu 10,395 total
    perfcc341 -m -i 10   6,34s user 0,27s system 100% cpu  6,603 total
    
    perfc296  -w -i 10  10,04s user 0,04s system 100% cpu 10,072 total
    perfcc296 -w -i 10  10,90s user 0,07s system 100% cpu 10,965 total
    perfc322  -w -i 10  10,00s user 0,05s system 100% cpu 10,040 total
    perfcc322 -w -i 10   2,81s user 0,04s system 100% cpu  2,847 total
    perfc331  -w -i 10   9,64s user 0,03s system 100% cpu  9,641 total
    perfcc331 -w -i 10   2,82s user 0,04s system 100% cpu  2,857 total
    perfc341  -w -i 10   9,42s user 0,08s system 100% cpu  9,474 total
    perfcc341 -w -i 10   2,63s user 0,06s system 100% cpu  2,686 total
        

Since the times for the C version are rather constant, I only included one of them (gcc 3.4.1) in the diagram above.

Rasmus Kaj rasmus@krats.se
