
[Solved]: Choosing error rates for probabilistic algorithms

Problem Detail: 

Probabilistic algorithms often have a parameter that allows one to tune the error rate, typically by running the algorithm repeatedly. This often gives an error rate of something like $2^{-k}$ for $k$ iterations. This is a fine situation to be in, because $2^{-k}$ can be made as close to $0$ as you like without having to make $k$ very big at all. At this point, the theoretician sits back contentedly, his or her job done.

In practical terms, though, are there any guidelines as to which value of $k$ one should choose? Obviously, there's no universal answer, since any particular situation involves a trade-off between the importance of avoiding errors and the cost of doing more iterations.

For example, choosing $k=10$ gives an error rate of about one in a thousand, which seems rather high for most purposes; choosing $k=60$ means the expected number of errors would still be less than one even if you'd run the algorithm once a second since the big bang.
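For concreteness, the smallest sufficient $k$ for a target error rate $\varepsilon$ is $k = \lceil \log_2(1/\varepsilon) \rceil$. A minimal Python sketch of that arithmetic (illustrative only; the function name is my own):

```python
import math

def iterations_for_error_rate(target_error: float) -> int:
    """Smallest k such that 2**(-k) <= target_error."""
    return math.ceil(-math.log2(target_error))

print(iterations_for_error_rate(1e-3))         # 10: about one error in a thousand
print(iterations_for_error_rate(1 / 4.35e17))  # 59: below one expected error even
                                               # at one run per second since the
                                               # big bang (~4.35e17 seconds)
```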

Asked By: David Richerby

Answered By: Juho

As you say, this is application- or situation-dependent in general. However, a guideline I have encountered is "make the error probability smaller than the probability of a hardware failure". If I remember correctly, this is at least mentioned in the Mitzenmacher-Upfal book.

Say you wanted to replace a deterministic algorithm with a probabilistic one, and still be confident (enough) that it works. How often, then, do hardware failures happen?

Nightingale, Douceur, and Orgovan [1] analyzed hardware failure rates on a million consumer PCs. For instance, bus errors, microcode bugs, and parity errors in CPU caches issue a machine-check exception (MCE), indicating a detected violation of an internal invariant. Roughly, a CPU running for at least 5 days has a 1 in 330 chance of crashing due to an MCE (see Figure 2). They also observed that laptops are more reliable than desktop machines, and that underclocking helps as well.

In short, one guideline could be this: know the hardware your algorithm runs on, and estimate how often that hardware makes critical mistakes (or take an educated guess based on, e.g., [1]).
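To make that guideline concrete, here is a hedged back-of-the-envelope sketch (my own, not from [1]): treat MCE crashes as a Poisson process calibrated to the 1-in-330-over-5-days figure above, then pick the smallest $k$ whose error probability falls below the hardware-failure probability for a run of a given length. The exponential model and the helper name are assumptions for illustration.

```python
import math

# Assumption: MCE crashes behave like a Poisson process, calibrated to the
# "1 in 330 chance over 5 days" figure from [1]. This is a deliberate
# simplification for a rough estimate, not the paper's own model.
P_MCE_5_DAYS = 1 / 330
FIVE_DAYS_S = 5 * 24 * 3600
CRASH_RATE = -math.log(1 - P_MCE_5_DAYS) / FIVE_DAYS_S  # crashes per second

def iterations_to_beat_hardware(runtime_seconds: float) -> int:
    """Smallest k with 2**(-k) below the probability of a hardware
    crash during a run of the given length (hypothetical helper)."""
    p_hw = 1 - math.exp(-CRASH_RATE * runtime_seconds)
    return math.ceil(-math.log2(p_hw))

print(iterations_to_beat_hardware(1.0))     # 28: a one-second run
print(iterations_to_beat_hardware(3600.0))  # 16: a one-hour run
```

Note the direction of the effect: the longer the run, the likelier a hardware fault, so fewer iterations are needed before the algorithm's own error probability stops being the dominant risk.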


[1] Nightingale, Edmund B., John R. Douceur, and Vince Orgovan. "Cycles, cells and platters: an empirical analysis of hardware failures on a million consumer PCs." Proceedings of the sixth conference on Computer systems. ACM, 2011.


Question Source: http://cs.stackexchange.com/questions/29091
