in reply to Srand versus rand in thread Pi calculator
A trick I have mentioned before and may again.
If you need a large supply of random-looking data, one of
the best approaches is to grab a large file produced out
of dynamic data (/dev/mem is often a good bet, though there
are plenty of choices), compress it, and then encrypt
it with a good algorithm. Samples from the resulting file
are, for all intents and purposes, random.
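A minimal sketch of the compress-then-encrypt recipe, in Python (an assumption; the thread names no language). The function name `random_looking_bytes`, the input bytes, and the key are all hypothetical. Because the Python standard library ships no block cipher, a SHAKE-256 keystream XORed over the compressed data stands in here for "a good algorithm"; a vetted cipher from a real crypto library would be the better choice in practice.

```python
import hashlib
import zlib


def random_looking_bytes(raw: bytes, key: bytes) -> bytes:
    """Compress `raw`, then mask it with a keyed SHAKE-256 keystream."""
    # Compression squeezes out most of the obvious structure first.
    packed = zlib.compress(raw, 9)
    # Stand-in for encryption: derive a keystream as long as the data ...
    keystream = hashlib.shake_256(key).digest(len(packed))
    # ... and XOR it in, byte by byte.
    return bytes(p ^ k for p, k in zip(packed, keystream))


# Hypothetical input; in the post this would be a dump of dynamic data.
sample = random_looking_bytes(b"some dynamic data " * 1000, b"hypothetical key")
print(len(sample), sample[:8].hex())
```

The same input and key always give the same output, so the result is only as unpredictable as the source data and the key.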
Random source
by gryng (Hermit) on Feb 16, 2001 at 18:35 UTC
In fact, this is a good source of random data. However, because it is a good source, it could easily be bad for Monte-Carlo searching. As I've been suggesting in other posts, Monte-Carlo searching benefits from uniformly distributed numbers.
But truly random numbers are not statistically guaranteed to be uniformly distributed (no, the law of averages does not work that way :) ), and so they cause Monte-Carlo searching to converge more slowly (but do not keep it from converging).
This is why pseudo-random numbers can, theoretically, be better than truly random numbers: they are often crafted to be statistically uniformly distributed. In practice, however, pseudo-random numbers are often not perfectly statistically uniformly distributed (what a mouthful), and so can easily lead to a mis-convergence.
Enter quasi-random numbers. These are uniformly distributed and have a bias towards non-repetition. This means that you still get guaranteed convergence, and you get it faster (since there would be no clumps in your set).
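For reference, a minimal sketch of the kind of Monte-Carlo Pi estimate this thread is about, written in Python as an assumption. The name `monte_carlo_pi` and the seed are hypothetical; the generator is Python's ordinary pseudo-random one. It throws points at the unit square and counts the fraction landing inside the quarter circle.

```python
import math
import random


def monte_carlo_pi(n: int, rng: random.Random) -> float:
    """Estimate pi from the fraction of n random points inside the quarter circle."""
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n


rng = random.Random(2001)  # hypothetical seed, fixed only for repeatability
for n in (100, 1_000, 10_000, 100_000):
    estimate = monte_carlo_pi(n, rng)
    # Print the sample size, the estimate, and its absolute error.
    print(n, estimate, abs(estimate - math.pi))
```

Clumps and gaps in the sampled points show up directly as error in the count, which is why the distribution of the numbers matters so much here.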
Ok, time to go.
Ciao,
Gryn
Whether what you say about Monte-Carlo holds depends upon what
you are trying to do. In this case, absolutely. If you are using a Monte Carlo algorithm to calculate a probability, you are certainly better off trying to use
statistically uniformly distributed numbers than really
random numbers, for exactly the reason you say (speed of
convergence). Of course, for the same problem you are even better off turning it into an integration problem and then applying standard numerical integration techniques. (Of course we have better means of calculating Pi, but I digress.)
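As a rough illustration of the integration route, here is a sketch in Python (an assumption, as is the name `midpoint_pi`). It treats Pi as four times the area under sqrt(1 - x^2) on [0, 1] and applies the plain midpoint rule, one of the simplest standard quadrature methods.

```python
import math


def midpoint_pi(n: int) -> float:
    """Approximate pi as 4 * integral of sqrt(1 - x^2) over [0, 1] (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoint of the i-th strip
        total += math.sqrt(1.0 - x * x)
    return 4.0 * h * total


print(midpoint_pi(10_000), math.pi)
```

No random numbers are involved at all, so there is no sampling noise to average away.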
However, if you are going to run a large number of complex scenarios which involve multiple random decisions, and particularly if you will then compute summary statistics on those runs, then speed of convergence or no, it is probably safer to use random data for your random decisions.
On a related note, I remember having seen some research showing that chaotic systems can be surprisingly good at detecting pseudorandom input. So again, if you are doing a Monte Carlo simulation of how a chaotic system will react, you are not guaranteed accurate results from pseudorandom numbers.
So to summarize, for simple problems you are right that the right pseudorandom sequence tends to converge more rapidly. But using good random data can prevent a variety of causes of spurious results.
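One way to keep that choice open is to pass the generator into the simulation. The sketch below is in Python (an assumption), with a hypothetical toy scenario `count_heads`. `random.SystemRandom` draws from the operating system's entropy source and is used here only as the nearest standard-library stand-in for "good random data".

```python
import random


def count_heads(rng: random.Random, flips: int = 1000) -> int:
    """Run one toy scenario: count heads in `flips` coin tosses drawn from `rng`."""
    return sum(1 for _ in range(flips) if rng.random() < 0.5)


pseudo = random.Random(2001)    # seeded pseudo-random generator: repeatable
system = random.SystemRandom()  # OS entropy source: not repeatable, ignores seeds
print(count_heads(pseudo), count_heads(system))
```

Running the same scenarios with both sources and comparing the summary statistics is a cheap check for generator-induced artifacts.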
Oh, definitely, tilly.
I thought we were talking about Monte-Carlo integration (sorry, I did
say Monte-Carlo searching). But yes, for -some- Monte-Carlo searches
uniform distribution and a bias towards non-repetition would be bad things!
I think it is important to point out, like you did, that the real crux
of the matter is to understand what kind of random numbers you want and
why you want them.
Let's assume we're sticking to uniform distributions of some type and
do a quick summary of the ones we've discussed so far (for anyone who has
actually followed this discussion this far! lol :) ):
One: Truly random numbers. In this case we are talking about a true
random source that should be uniform, but we do not get any
guarantees about it. This is almost always an all-around safe bet if you
can't decide. Also, in some very sensitive conditions, this is the only
bet, e.g. the chaotic systems tilly mentions above; for an example see
http://www.cs.ualberta.ca/~darse/rsbpc.html. For Monte-Carlo
integration, however, these converge, but generally only at a
1/sqrt(N) rate.
Two: Pseudo-random numbers. These are normally meant to be uniformly
distributed (with statistical guarantees), but in practice one finds
otherwise. These numbers should generally not be used for security
unless you know what you are doing, because pseudo-random
numbers are predictable if you know or can guess the seed and the general
algorithm. For general purposes, though, these are the best, because they
are fast and provide what many programs need. For Monte-Carlo integration
they should converge, but because of bad implementations they often
won't.
Three: Quasi-random numbers. These are sequences that are guaranteed
to be statistically uniform, and also have a strong bias against repeating
themselves. This means that as you pick more numbers they become closer
and closer together, but in a uniform way. An example is the Hamilton
sequence mentioned in the posts above. These are excellent for
Monte-Carlo integration because they lead to a convergence rate close to
1/N and are guaranteed to converge. These numbers tend to be very
predictable, so they should probably not be used in security, for the same
reasons as pseudo-random numbers.
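To make the quasi-random case concrete, here is a sketch in Python (an assumption). It uses a Halton sequence with bases 2 and 3, a well-known low-discrepancy sequence that may or may not be the one meant by "Hamilton sequence" above. The names `radical_inverse` and `halton_pi` are hypothetical.

```python
import math


def radical_inverse(index: int, base: int) -> float:
    """Mirror the base-`base` digits of `index` around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_pi(n: int) -> float:
    """Estimate pi from the first n two-dimensional Halton points (bases 2 and 3)."""
    inside = 0
    for i in range(1, n + 1):
        x = radical_inverse(i, 2)
        y = radical_inverse(i, 3)
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n


for n in (100, 1_000, 10_000):
    estimate = halton_pi(n)
    # Print the sample size, the estimate, and its absolute error.
    print(n, estimate, abs(estimate - math.pi))
```

In base 2 the sequence starts 1/2, 1/4, 3/4, 1/8, ..., so each new point falls into one of the larger remaining gaps. The whole sequence is fully determined by its index, which is why it is useless for security.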
Welp, back to work :)
Ciao,
Gryn