Re^8: Math help: Finding Prime Numbers

To me, it's fairly apparent that Anonymous Monk was thinking of getting a list of prime numbers, putting them in a flat text file, and reading that file into memory whenever the list of primes is needed.

How I store the primes would depend on what queries one would want to perform. For instance, if I wanted to answer queries "give me the Nth prime", I'd store the primes as k bytes each (k depends on the number of primes I store), the Nth prime is then stored on in the k bytes starting from offset (k-1)*N. That would, after opening the file, require a single read().

If I want to know whether a certain number is prime (or to give me a list of all primes between two numbers), I'd spend less than half a bit on each number. For instance, I might spend a byte to encode the primeness of a sequence of 30 numbers (a number N can only be prime if N % 30 is one of 1, 7, 11, 13, 17, 19, 23 or 29 (the exceptions are 2, 3 and 5)). So, I only need to store the primeness of all numbers of the form 30k+1, 30k+7, 30k+11, 30k+13, 30k+17, 30k+19, 30k+23 and 30k+29. Ergo, to determine whether a number N is prime, use the following steps:

If N equals 2, 3 or 5, N is prime.
If N mod 30 isn't 1, 7, 11, 13, 17, 19, 23 or 29, the number isn't prime.
Otherwise, look up the corresponding bit in the floor(N/30)th byte to see whether this number is prime.

This algorithm also needs a single read(). To get a list, just read the appropriate bytes and look at the individual bits. Using this encoding, 1 Gb of disk space (which in modern computer can easily be read into memory), I can encode the primeness of all numbers up to 32,212,254,720.

There are more compact encodings of course.

Comment on Re^8: Math help: Finding Prime Numbers