The trick mentioned earlier about using md5 hashes and cutting them up into suitable lengths seems like a nifty idea, if a bit heavy-weight.
If it would be sufficient to replace alphabetics and digits with randomly selected other alphabetics and digits (to mask personal-id info like names and credit-card numbers, but not disrupt the actual character class relations), something like this might do (not tested):
sub mask_it
{
    my ($instr) = @_;
    my $retstr = '';
    while ($instr =~ /[0-9a-z]/i) {
        # each s/// strips the matched prefix from the input, so the loop always advances
        if ($instr =~ s/^([^0-9a-z]*)([a-z]+)//i) {
            $retstr .= $1;    #pass non-alphanumerics as-is
            my $orig = $2;    #replace alphas with new ones
            $retstr .= join('', map { chr(65+int(rand(26))) } split(//, $orig));
        }
        elsif ($instr =~ s/^([^0-9a-z]*)([0-9]+)//i) {
            $retstr .= $1;    #pass non-alphanumerics as-is
            my $orig = $2;    #replace digits with new ones
            $retstr .= join('', map { chr(48+int(rand(10))) } split(//, $orig));
        }
    }
    $retstr .= $instr;    #pass on anything that's left over
    return $retstr;
}
Note that this only outputs upper-case replacements for any input letters; if you want to be more "flexible", it should be easy to add that in. (There are probably a few ways to optimize this, but this gets the basic idea across.)
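For what it's worth, here is one way the case-preserving version might look. This is only a sketch: the name mask_keep_case is made up for illustration, and it swaps the loop for three global substitutions, one per character class, so it is a different technique from the sub above rather than a patch to it.

```perl
use strict;
use warnings;

# Hypothetical variant: each character is replaced by a random
# character of the same class (lower, upper, digit); everything
# else passes through untouched.
sub mask_keep_case {
    my ($instr) = @_;
    my $retstr = $instr;
    $retstr =~ s/[a-z]/chr(97 + int(rand(26)))/ge;    # lower -> random lower
    $retstr =~ s/[A-Z]/chr(65 + int(rand(26)))/ge;    # upper -> random upper
    $retstr =~ s/[0-9]/chr(48 + int(rand(10)))/ge;    # digit -> random digit
    return $retstr;
}

# Output is random, but keeps the "shape" of the input:
print mask_keep_case("John Smith, card 4111-1111-1111-1111"), "\n";
```

Since each pass only produces characters of the class it matched, the later passes never re-mask something an earlier pass already replaced. Note that rand() is not cryptographically strong, so this is masking for test data, not real anonymization.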
update: added the "join('',...)" around each "map {...}" to make sure the string assignment would work properly. Also changed the "while" condition from /\w/ to /[0-9a-z]/i (and similarly for the "if" conditions), to make sure that underscores don't throw it into an endless loop.