in reply to Re: Boolean math: Fill in the blanks.
This falls out of (and is an extension to) the code in Re: A series of random number and others (10MB / 8seconds).
There the problem was to fairly pick 50% of a set. By setting the bits of a bit vector, where each bit == 1 choice, to randomly generated bit patterns, I very quickly achieve a close approximation to a 50% pick. Then it is just a case of randomly inverting bits until the required 50% ratio is achieved.
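A minimal sketch of that 50% pick, in Python rather than the original code from the linked post. The function name `pick_half` and the use of a Python integer as the bit vector are my assumptions for illustration only.

```python
import random

def pick_half(n, rng=random):
    """Return an n-bit integer with exactly n // 2 bits set (n > 0 assumed)."""
    target = n // 2
    bits = rng.getrandbits(n)          # every bit is set with probability 1/2
    count = bin(bits).count("1")       # population count of the bit vector
    while count != target:
        mask = 1 << rng.randrange(n)   # pick one bit position at random
        is_set = bool(bits & mask)
        # Flip only when it moves the count toward the target:
        # clear a set bit if we have too many, set a clear bit if too few.
        if (count > target) == is_set:
            bits ^= mask
            count += -1 if is_set else 1
    return bits
```

Because the random fill already lands very close to 50%, the correction loop usually has only a small number of bits to fix.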
That works great for 50%, but as the desired ratio moves away from 50% (or from 0% or 100%), the number of bits requiring correction increases and the chance of picking an appropriate bit falls, so the number of iterations required to reach the required ratio increases exponentially.
Then I remembered that ANDing and ORing two random values produces results with 25% and 75% of their bits set, respectively. I have used this to produce random fill patterns for graphics, where it gives a more naturalistic shading effect with less aliasing than a fixed fill pattern.
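The 25% and 75% figures are easy to check empirically. This is only a quick sketch; the helper `density` and the variable names are hypothetical, not from the original code.

```python
import random

def density(bits, n):
    """Fraction of the n bits in `bits` that are set."""
    return bin(bits).count("1") / n

n = 1_000_000
rng = random.Random(1)                 # fixed seed so the run is repeatable
a = rng.getrandbits(n)
b = rng.getrandbits(n)

and_density = density(a & b, n)        # bit set only if set in both: 1/2 * 1/2
or_density = density(a | b, n)         # bit clear only if clear in both: 1 - 1/4

print(f"AND: {and_density:.4f}  OR: {or_density:.4f}")
```

The printed values should sit close to 0.25 and 0.75, since each result bit depends on two independent fair bits.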
That allows me to initialise the bit-vector to 0%, 25%, 50%, 75% or 100%, whichever is closest to the required ratio, very quickly. So the number of iterations needed to achieve the correction falls from a maximum of 1/4 of the total range to 1/8th. And the more you can subdivide the range, the closer you can get with the fast initialisation and the less correction is required. With 128 subdivisions needing just 7 terms, it is possible to get within less than 0.4% through the initialisation alone.
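One way to reach any of the k/128 subdivisions with 7 random terms is to walk the binary digits of k from least to most significant, ORing in a fresh random value for a 1 digit and ANDing for a 0 digit. This is a sketch of that idea under my own assumptions (the name `init_density` and its parameters are hypothetical), and not necessarily the expressions arrived at in this thread.

```python
import random

def init_density(n, k, terms=7, rng=random):
    """Return an n-bit integer whose bits are each set with probability k / 2**terms."""
    if not 0 <= k <= 2 ** terms:
        raise ValueError("k must be in the range 0 .. 2**terms")
    if k == 2 ** terms:
        return (1 << n) - 1            # 100%: all bits set
    bits = 0                           # start at density 0
    for i in range(terms):             # least significant digit of k first
        r = rng.getrandbits(n)         # a fresh, independent 50% pattern
        if (k >> i) & 1:
            bits |= r                  # density d becomes (1 + d) / 2
        else:
            bits &= r                  # density d becomes d / 2
    return bits
```

For example, `init_density(n, 37)` aims for 37/128, roughly 28.9%. The nearest subdivision is never more than 1/256 (about 0.39%) from the required ratio, which is where the "less than 0.4%" figure comes from.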
So, to answer the question, they are acceptable because they are only a starting point. If the required ratio were 99.99999% (in a large set), it's quicker to start with 100% set and unset bits randomly until I achieve the desired ratio than to start at 75% and go the other way. Same thing at the other end: if you're trying to pick 3 from a billion, initialise to a billion 0s and then set 3 random bits.
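Putting the pieces together, here is a rough sketch that starts from the nearest of the five quick starting points and then corrects toward the exact count. The name `pick_exact` and the choice of five starting points are assumptions for illustration; a finer subdivision would slot in the same way.

```python
import random

def pick_exact(n, m, rng=random):
    """Return an n-bit integer with exactly m bits set (0 <= m <= n, n > 0)."""
    if n <= 0 or not 0 <= m <= n:
        raise ValueError("need n > 0 and 0 <= m <= n")
    quarter = round(4 * m / n)         # nearest of 0%, 25%, 50%, 75%, 100%
    if quarter == 0:
        bits = 0                       # start empty, then set bits
    elif quarter == 4:
        bits = (1 << n) - 1            # start full, then clear bits
    else:
        a = rng.getrandbits(n)
        if quarter == 2:
            bits = a                   # about 50%
        else:
            b = rng.getrandbits(n)
            bits = (a & b) if quarter == 1 else (a | b)   # about 25% or 75%
    count = bin(bits).count("1")
    while count != m:
        mask = 1 << rng.randrange(n)
        is_set = bool(bits & mask)
        # Flip only bits that move the count toward m.
        if (count > m) == is_set:
            bits ^= mask
            count += -1 if is_set else 1
    return bits
```

With `m = 3` the function starts from all zeros and sets three random bits; with `m = n - 1` it starts from all ones and clears a single bit.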
It'd be a stupid way to pick 3 from a billion, but it makes for completeness. The algorithm really comes into its own when you start selecting 10s of millions from billions.