I am running the spiffy new UltraSPARC clients (deschall-sunos5-ultra)
which use the bitslicing and split 64-bit register techniques. I
understand the method and reasoning behind this technique, and why the
high half may be less stable due to lame 32-bit register saving in
Solaris. However, I have noticed some odd behavior in the clients that use
this technique:
Processor 1A -- 2^30 complementary pairs of keys starting with 150D525401010101
Processor 1B -- 2^30 complementary pairs of keys starting with 150D525701010101
<snip>
Processor 1A -- Elapsed time: 1777.4 seconds (1208k keys/sec)
Processor 1B -- Elapsed time: 1777.4 seconds (1208k keys/sec)
<snip>
Processor 1A -- 2^30 complementary pairs of keys starting with 150B451F01010101
Processor 1B -- 2^31 complementary pairs of keys starting with 150B452001010101
<snip>
Processor 1A -- Elapsed time: 2796.8 seconds (768k keys/sec)
Processor 1B -- Elapsed time: 2796.8 seconds (1536k keys/sec)
The two halves seem to work in parallel for a while, then side B starts
taking larger blocks (2^31 pairs) while side A stays at 2^30 pairs per
block. After this, side A seems to take over some of the work that side B
gets credit for, since the overall keyrate stays roughly the same (about
2416k keys/sec before, 2304k after). Several machines show this behavior.
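As a sanity check on the numbers in the log, here is a small Python sketch (not part of the client) that recomputes the reported keyrates from the block sizes and elapsed times. It assumes each complementary pair is counted as two keys, which appears to be the reading that matches the logged figures. The function name `keys_per_sec` is made up for illustration.

```python
def keys_per_sec(log2_pairs, seconds):
    """Keys tested per second for a block of 2**log2_pairs complementary pairs.

    Assumption: each complementary pair counts as two keys.
    """
    return 2 * (2 ** log2_pairs) / seconds

# Block sizes and elapsed times copied from the log excerpt.
first_a = keys_per_sec(30, 1777.4)   # Processor 1A, first pass
first_b = keys_per_sec(30, 1777.4)   # Processor 1B, first pass
second_a = keys_per_sec(30, 2796.8)  # Processor 1A, second pass
second_b = keys_per_sec(31, 2796.8)  # Processor 1B, second pass

for label, rate in [("1A first", first_a), ("1B first", first_b),
                    ("1A second", second_a), ("1B second", second_b)]:
    print(f"{label}: {rate / 1000:.0f}k keys/sec")

# Combined throughput of both halves, before and after the block sizes diverge.
total_first = first_a + first_b
total_second = second_a + second_b
print(f"combined: {total_first / 1000:.0f}k -> {total_second / 1000:.0f}k keys/sec")
```

The per-side figures match the log, and the combined rate drops by only about 5% in the second pass. That fits both halves finishing at the same moment, with the reported per-side rates simply reflecting the block size each side was credited with.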
Is this caused by a context switch triggering a register save and
corrupting the A side? If the two sides get differently sized blocks,
will B give A more to do once A has searched its assigned block?
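To make the corruption question concrete, here is a rough Python model of what a split 64-bit register might look like. This is only a sketch under assumptions, not the DESCHALL code: the toy `gate` function, the `save_restore_32bit` model of a context switch, and the test values are all hypothetical, and no claim is made about which half corresponds to side A or side B.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def pack(low, high):
    """Place two independent 32-bit lanes into one 64-bit register value."""
    return ((high & MASK32) << 32) | (low & MASK32)

def unpack(reg):
    """Split a 64-bit register value back into its (low, high) 32-bit lanes."""
    return reg & MASK32, (reg >> 32) & MASK32

def gate(a, b, c, mask=MASK64):
    """Toy bitsliced gate, (a XOR b) AND NOT c, applied to every bit position."""
    return ((a ^ b) & ~c) & mask

def save_restore_32bit(reg):
    """Hypothetical context switch that preserves only the low 32 bits."""
    return reg & MASK32

# Arbitrary inputs for each half (illustrative values only).
a_lo, b_lo, c_lo = 0xDEADBEEF, 0x12345678, 0x0000FFFF
a_hi, b_hi, c_hi = 0x0F0F0F0F, 0xCAFEBABE, 0x00FF00FF
a, b, c = pack(a_lo, a_hi), pack(b_lo, b_hi), pack(c_lo, c_hi)

# Bitwise logic never carries between bit positions, so one 64-bit
# operation gives the same answer as two separate 32-bit operations.
lo, hi = unpack(gate(a, b, c))
print(lo == gate(a_lo, b_lo, c_lo, MASK32), hi == gate(a_hi, b_hi, c_hi, MASK32))

# If a register loses its upper 32 bits mid-computation, only the
# high lane's result changes; the low lane is unaffected.
bad_lo, bad_hi = unpack(gate(save_restore_32bit(a), b, c))
print(bad_lo == lo, bad_hi == hi)
```

In this model a truncated register save damages only the high lane, which would fit the claim that the high half is the less stable one. Whether the real client detects such an event, and how it redistributes work afterwards, is exactly what I am asking about.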
I'd like a bit more technical insight into how this register-splitting
technique works. Is this a trade secret, Darrell? If so, any tidbits of
info would be appreciated. Excellent work.
Just curious...
Andy Brown