Okay, here is a short description of what I went through to get what I
consider to be optimal compiles on the Alpha/MIPS/Sparc machines.
Alpha,
Each compile was tested for speed on early versions of the 21064 and
21164 chips.
Compiles tested:
cc {-xO4,-xO5} { ,-tune host,-tune ev5}
gcc {-O2,-O3}
Winner: gcc -O3
MIPS,
Compiles were tested on R10000 and R5000 machines.
Compiles tested:
cc {-O2,-O3} {-mips,-mips2,-mips3,-mips4} {-o32,-n32,-64}
gcc {-O2,-O3}
Winner: cc -O3 -mips -o32
Sparc,
Compiles were tested on an ELC, SLC, Sparc 10, Sparc 20,
Sparc 20 (Ross HyperSPARC), and UltraSPARC.
Compiles tested:
cc {-xO3,-xO4,-xO5} {a slew of -xtarget= and -xchip=} {-fast} {-xautopar}
gcc {-O2,-O3} {-msupersparc,-mv7,-mv8}
Winner: gcc -O3 -msupersparc
------
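For anyone who wants to rerun the comparison, the brace notation in the
lists above can be expanded mechanically. The sketch below assumes each
{...} group means "pick exactly one of these", with an empty entry meaning
"no flag from this group"; that reading is my assumption, and the helper
name expand is made up for illustration. The flags are copied verbatim
from the lists.

```python
from itertools import product


def expand(compiler, *flag_groups):
    """Return one command line per combination, one choice from each group."""
    commands = []
    for combo in product(*flag_groups):
        # An empty string stands for "no flag from this group".
        parts = [compiler] + [flag for flag in combo if flag]
        commands.append(" ".join(parts))
    return commands


# Alpha: cc {-xO4,-xO5} { ,-tune host,-tune ev5}
alpha_cc = expand("cc", ["-xO4", "-xO5"], ["", "-tune host", "-tune ev5"])

# MIPS: cc {-O2,-O3} {-mips,-mips2,-mips3,-mips4} {-o32,-n32,-64}
mips_cc = expand(
    "cc",
    ["-O2", "-O3"],
    ["-mips", "-mips2", "-mips3", "-mips4"],
    ["-o32", "-n32", "-64"],
)

# Sparc: gcc {-O2,-O3} {-msupersparc,-mv7,-mv8}
sparc_gcc = expand("gcc", ["-O2", "-O3"], ["-msupersparc", "-mv7", "-mv8"])

if __name__ == "__main__":
    for line in alpha_cc + mips_cc + sparc_gcc:
        print(line)
```

Under that reading the Alpha cc list is 6 builds, the MIPS cc list is 24,
and the Sparc gcc list is 6. The Sparc cc line is left out because "a slew
of" targets is not something I can enumerate from the list.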
The weird thing here is that I would have expected some variation when
compiling for one architecture and running the result on other ones.
But I don't remember a single case where improving performance on one
platform made another platform worse, i.e. the -msupersparc option
didn't degrade performance on a non-SuperSPARC platform.
Also, having looked at both the code and the assembly output, the stuff
from the compilers looks tight. Maybe a bit of a wiggle here and there, but
very tight. These RISC architectures are very well tuned for compilers,
unlike the x86 platform. I'm still looking at improving the other
platforms, but the most likely improvement will be to make use of the
64-bittedness of the 64-bit platforms, none of which is being done right now.
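
To make the 64-bit point concrete, here is a toy sketch of why wider words
help: the same buffer takes half as many word operations when handled 8
bytes at a time instead of 4. XOR is only a stand-in operation and
xor_words is a made-up name; this is not the actual code being compiled.

```python
def xor_words(a, b, word_bytes):
    """XOR two equal-length byte strings one word at a time.

    Returns (result_bytes, number_of_word_operations).
    """
    if len(a) != len(b) or len(a) % word_bytes != 0:
        raise ValueError("buffers must match and be a multiple of the word size")
    out = bytearray()
    ops = 0
    for i in range(0, len(a), word_bytes):
        # Load one word from each buffer, combine, store one word.
        x = int.from_bytes(a[i:i + word_bytes], "little")
        y = int.from_bytes(b[i:i + word_bytes], "little")
        out += (x ^ y).to_bytes(word_bytes, "little")
        ops += 1
    return bytes(out), ops


buf_a = bytes(range(16))
buf_b = bytes([0xFF] * 16)

result32, ops32 = xor_words(buf_a, buf_b, 4)  # 32-bit words
result64, ops64 = xor_words(buf_a, buf_b, 8)  # 64-bit words
```

Both calls give the same bytes, but the 64-bit version does it in 2 word
operations instead of 4. Whether that turns into a real speedup depends on
the inner loop, which is why it is still on the to-do list.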
Enjoy.
--
Guy Albertelli II albertel@pilot.msu.edu | "And God rested, chuckling at
http://www.cis.ohio-state.edu/~albertel | His own little play on words"
--------------------------------------------------------------------------
Does my quiet self-pity get to me? Yes? Or should I move up
to incessant nagging?