Computer Flake Outs (was: Re: On Overclocking - READ THIS!)

Rodney R. Korte (rrk102@psu.edu)
Tue, 06 May 97 23:58:47 -0400


On Tue, 6 May 1997 22:57:55 -0400, Nelson Minar wrote:

[...]

>The big question is whether those errors will be significant to the
>computation. It is worth thinking about what random failures mean to a
>huge computation like deschall. Other parts of computers flake out:
>memories fail (especially if you don't have parity or ECC), a register
>could get hit by a cosmic ray, etc. Anyone care to estimate the number
>of random bit errors that will occur over 2000 years of PPro 200
>computation?

Now this is the most interesting thing that's come up on this list
in a while, IMO! I think the point about memory errors is the most
interesting. I don't care to start a whole non-parity vs. parity
vs. ECC memory debate, but there used to be a very interesting
article at http://www.ee.ucla.edu/~rulnick/parity.html (it's not
there anymore, and I don't know where it went) called "Parity
Questions Answered" that gave some numbers with regard to memory
errors.

It claimed something like one one-bit error every 4-6 months for
the average Pentium-class computer with 32MB, or something similar
(I wish I could find that document again; perhaps someone can confirm
these numbers). The chance of a multiple-bit error was several
orders of magnitude lower.

Let's make a really rough estimate based on this:

The memory footprint of DESCHAL5 on my machine is about 1/2 MB,
or 1/64 of 32MB, so I could expect an error in the memory where
the DESCHAL code resides about once every 21 to 32 years (64 times
4-6 months). Admittedly, my probability skills are next to nil,
and this may prove it, but supposing that there are currently
2000 Intel machines searching key blocks, there should be a
one-bit memory error within the DESCHAL code on one of these
machines every four to six days or so, correct?
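
For anyone who wants to check the arithmetic, here is a rough sketch of
the scaling in Python. It assumes errors are spread uniformly and
independently over memory, so the rate scales with footprint size and
with the number of machines; that uniformity is my assumption, not
something the article claimed. The function names are made up for
illustration, and the inputs are just the rough figures quoted above.

```python
# Back-of-envelope scaling of the quoted one-bit memory error rate.
# Assumption: errors are uniform and independent across memory, so the
# rate scales linearly with footprint size and with machine count.

MONTHS_PER_YEAR = 12
DAYS_PER_YEAR = 365.25


def footprint_error_interval_years(months_per_error, total_mb, footprint_mb):
    """Years between errors that land inside a footprint of footprint_mb."""
    scale = total_mb / footprint_mb  # e.g. 32 MB / 0.5 MB = 64
    return months_per_error * scale / MONTHS_PER_YEAR


def fleet_error_interval_days(interval_years, machines):
    """Days between such errors somewhere in a fleet of identical machines."""
    return interval_years * DAYS_PER_YEAR / machines


def expected_errors(machine_years, interval_years):
    """Expected number of such errors over a total of machine_years."""
    return machine_years / interval_years


if __name__ == "__main__":
    for months in (4, 6):  # both ends of the quoted 4-6 month range
        years = footprint_error_interval_years(months, total_mb=32, footprint_mb=0.5)
        days = fleet_error_interval_days(years, machines=2000)
        total = expected_errors(2000, years)
        print(f"{months} months/error: one hit per machine every {years:.1f} years, "
              f"one across 2000 machines every {days:.1f} days, "
              f"about {total:.0f} hits in 2000 machine-years")
```

The last figure is a stab at the "2000 years of PPro 200 computation"
question, counting only memory errors inside the DESCHALL footprint.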

Fortunately, I have parity memory, so a one-bit error will be
caught and won't cause a problem for the DESCHAL effort.
However, I'd guess that a large percentage of the Intel machines
running DESCHAL have non-parity memory, and wouldn't be able to
catch such errors.

Well, this all seems scary, but the chance of having a critical
memory error in a client that is crunching the key block with
the one and only true key, and that client reporting back "Key
not found" must be *extremely* tiny.
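
To put a very rough number on "extremely tiny", here is a sketch that
treats memory errors as Poisson arrivals. Everything in it is an
assumption on my part: the 5-month midpoint of the quoted 4-6 month
range, the linear scaling down to a 1/2 MB footprint, and especially
the one-hour figure for how long a client spends on a key block, which
is purely hypothetical. Since not every flipped bit would turn a hit
into "Key not found", the result is at best an upper bound under those
assumptions.

```python
import math

HOURS_PER_YEAR = 365.25 * 24


def p_error_during(window_hours, years_between_errors):
    """P(at least one error in the window), assuming Poisson arrivals."""
    expected = window_hours / (years_between_errors * HOURS_PER_YEAR)
    # 1 - exp(-x), computed with expm1 to stay accurate for tiny x
    return -math.expm1(-expected)


# Midpoint of the quoted range (5 months per error in 32 MB), scaled
# linearly to a 0.5 MB footprint. Both steps are assumptions.
years_between_errors = 5 * (32 / 0.5) / 12

# Hypothetical: the client holding the true key spends 1 hour on that block.
p = p_error_during(window_hours=1.0, years_between_errors=years_between_errors)

if __name__ == "__main__":
    print(f"chance of a footprint error during that hour: {p:.2e}")
```

That is the chance of any one-bit error hitting the footprint during
the critical hour, before even asking whether the flipped bit matters.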

Probably smaller than the chance of a malicious attack. ;-)

Rod

--
Rodney R. Korte                   OS/2. Operate at a higher level.
korte@sabine.acs.psu.edu    ---> MIME, PGP (finger for key) welcome.
http://sharkbait.arl.psu.edu/

Crack DES NOW! http://www.frii.com/~rcv/deschall.htm