Re^2: How do I get a unique Perl Interpreter ID?
by wrog (Friar) on Dec 01, 2011 at 10:09 UTC
Are you just trying to generate unique values as a guid or seed for a randomizer?
Mainly guids.
I would think it would be sufficient to just use microtime and rand().
microtime can get screwed by a really fast machine.
As I understand it, mod_perl clones all of its interpreters in one swell foop. If one is doing things the way they recommend, putting all of the nasty module loading and initialization into the master process so that the clone operation has almost nothing left to do beyond creating a bunch of entries in a page table with copy-on-write tags, it's not hard to imagine it eventually being possible for multiple interpreters to get cloned in the same microsecond.
And if the seeding for rand() is likewise done in the master process, so that all interpreters are proceeding from the same seed, that's also going to be a lose.
As it happens, I do need to seed a random number generator as well. /dev/urandom mostly takes care of that; however, if you pull too much from there, you reduce the entropy available to other processes, making them less secure. So if we can get our distinct interpreter IDs from a different source, so much the better. There's also the small matter that /dev/urandom is somewhat broken on older versions of Linux, in which case having extra known-to-be-different junk to throw into the pot will be better than nothing.
rand(), by the way, is completely inadequate for generating random numbers, at least if you want to be secure about it (far too predictable...).
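The idea of taking a little from /dev/urandom and throwing "known-to-be-different junk" into the pot could be sketched roughly as follows. This is only an illustration under assumptions: `make_seed` is a hypothetical helper name, SHA-256 (from the core Digest::SHA module) is just one possible way to mix the inputs, and the distinct values passed in are placeholders for whatever per-interpreter ID turns out to be available.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);
use Time::HiRes qw(gettimeofday);

# Hypothetical helper (not from the thread): build a 32-byte seed from a
# small read of /dev/urandom plus values expected to differ per interpreter.
sub make_seed {
    my @distinct = @_;    # placeholders, e.g. a pid or some interpreter ID

    my $random = '';
    if (open my $fh, '<:raw', '/dev/urandom') {
        # Read only 16 bytes so that we pull as little as possible.
        my $got = read($fh, $random, 16);
        $random = '' unless defined $got;
        close $fh;
    }

    my ($sec, $usec) = gettimeofday();

    # Hash everything together. The distinct values only add to the mix;
    # they are not a substitute for real entropy.
    return sha256(join("\0", $random, $sec, $usec, @distinct));
}

my $seed = make_seed($$, 'interpreter-id-goes-here');
print unpack('H*', $seed), "\n";
```

If /dev/urandom cannot be opened, the sketch falls back to the time and the distinct values alone, which gives distinctness at best and not unpredictability.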
| [reply] [d/l] [select] |
Why do you think some interpreter ID (if such a thing did exist) would be fundamentally better from a randomness/predictability perspective than time+rand, or something similar?
Do you actually need cryptographically secure randomness, or simply distinctness (as one might have inferred from your OP)? A simple counter (like a sequence in a DB) would produce distinct values, although they're essentially 100% predictable. And as time is like a counter - automatically incremented externally to your program - it doesn't seem like a bad choice in case distinctness is all that you need (and if you're worried about being returned the same microsecond, just wait until it has advanced...)
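The "just wait until it has advanced" suggestion might look something like this minimal sketch. `next_stamp` is a hypothetical name, and the guarantee holds only within a single interpreter: two interpreters running this same code could still observe the same microsecond.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

my $last = 0;    # last timestamp handed out by this interpreter

# Hypothetical helper: return a microsecond timestamp that is strictly
# greater than any value previously returned in this interpreter.
sub next_stamp {
    my $now;
    do {
        my ($sec, $usec) = gettimeofday();
        $now = $sec * 1_000_000 + $usec;
    } while ($now <= $last);    # spin until the clock has moved on
    return $last = $now;
}

print next_stamp(), "\n";
print next_stamp(), "\n";
```

Note that the loop keeps spinning if the system clock is stepped backwards, so a real implementation would probably want a fallback for that case.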
| [reply] |
no, you had it right the first time:
The interpreter ID is for distinctness, not randomness/unpredictability. (again, I'm using a counter; totally predictable)
The point is to have each process/thread independently generating a sequence of keys that are then used to write into a shared cache, and life will be a lot easier if I can just know up front that each process/thread has its own namespace and they're just not going to be clobbering each other, in spite of the fact that they are all running the exact same code.
... if I have to be checking the cache first in order to make sure there isn't a value already written there at that key I'm about to use, and also having to do some kind of locking to make sure nobody else is slipping in a write to that key between the check and when I do my own write, etc... that's going to massively slow things down, not to mention being much harder to get right.
Where randomness is needed is not for the keys but for the values that are being written, which are indeed supposed to be secret/unpredictable/etc... and in particular it's essential that the values being produced on one thread provide no clue as to what values might be being produced on another thread, and so we again need distinctness in the seeding...
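To make the namespace-plus-counter scheme concrete, here is a rough sketch. `make_key_generator` and the namespace strings are hypothetical; how to obtain a namespace that is truly distinct per interpreter is exactly the open question of this thread and is simply taken as given here.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that produces keys of the form
# "<namespace>:<counter>". Keys from generators with different namespaces
# can never collide, so no check-then-write or locking is needed.
sub make_key_generator {
    my ($namespace) = @_;
    my $counter = 0;
    return sub { return join(':', $namespace, $counter++) };
}

# Assumption: 'interp-A' and 'interp-B' stand in for real interpreter IDs.
my $next_key_a = make_key_generator('interp-A');
my $next_key_b = make_key_generator('interp-B');

print $next_key_a->(), "\n";    # first key in namespace A
print $next_key_a->(), "\n";    # second key in namespace A
print $next_key_b->(), "\n";    # first key in namespace B
```

The keys themselves stay completely predictable, which is fine for this purpose: only the values written under them need to be secret.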
| [reply] |
clone operation has almost nothing left to do beyond creating a bunch of entries in a page table with copy-on-write tags, it's not hard to imagine it eventually being possible for multiple interpreters to get cloned in the same microsecond.
Have you actually tried timing it? Only the kernel has access to the page table, and even if perl could mark pages as copy-on-write it wouldn't help, since it has to change addresses, so there is no COW when cloning interpreters. Perl has to duly copy all data from the original interpreter to the new one, and the more modules are loaded, the more data it has to copy and the more time it takes.
| [reply] |
Actually, now that I think about it some more, the real issue here is that once you have concurrency, you can just as easily have cloning happening in parallel, at which point, it doesn't actually matter how long it takes to perform a single thread-clone.
That is, as long as there is the possibility that two cloning operations in parallel threads can finish at the same time, we're toast if we're depending solely on the clock to distinguish the resulting interpreters.
| [reply] |
Sounds like you're overthinking the problem. Add in the pid or whatever if you're that worried - pids can be reused, but you're not going to get the same pid and the same microtime unless the process takes less than 1 microsecond to run, which I highly doubt. Short of that, figure a way to sample ambient sound. Or you could always hit yourself in the face until your IQ lowers to a point where you aren't worrying about this any more.
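A pid-plus-microtime ID along these lines could be sketched as below. `pid_time_id` is a hypothetical name, and the sketch distinguishes processes only: interpreters running as threads inside one process generally share the same pid, so it does not by itself address the threaded case raised earlier in the thread.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper: combine the process ID with a microsecond timestamp.
sub pid_time_id {
    my ($sec, $usec) = gettimeofday();
    return sprintf('%d-%d-%06d', $$, $sec, $usec);
}

print pid_time_id(), "\n";
```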
| [reply] |