rkg has asked for the wisdom of the Perl Monks concerning the following question:

Hi all --

I'm looking for a function, F(i), that takes an integer argument, i, in the range 1 .. 10^8 and returns a short string, say 8 characters long, using the 26 + 26 + 8 = 60 character set ‘A’..’Z’, ‘a’.. ‘z’, ‘2’ .. ‘9’. (I want to omit ‘0’ and ‘1’ to avoid ‘O’ and ‘l’ confusion).

F(i) must be invertible – that is, each unique i maps to a unique F(i), and each unique F(i) maps to a unique i. And I don't want psuedo-random function (eg just build up a random string then cache it to a file or a DB to ensure that string is not reused); I'm lookling for something deterministic.

The number of valid strings (10^8) is much less than the number of possible strings (60^8), so the chance of guessing a correct string is extremely low.

Do you have suggestions for doing this in perl? Do I want a MD5 hash or something? If there is a canned algorithm that is pretty close to this, I can change my requirements somewhat. I would prefer to keep the strings short to keep 'em human friendly.

Thanks, wise monks, for any tips and pointers --

rkg

Replies are listed 'Best First'.
Re: perl / math: integer obfu
by BrowserUk (Patriarch) on Jul 26, 2003 at 03:32 UTC

    Take a look at Math::BaseCalc

    use Math::BaseCalc; my $bc = Math::BaseCalc::new( digits=> [ 'A' .. 'Z', 'a' .. 'z', 2 .. 9 ] ); print $bc->to_base( 99999999 ); # Corrected Hq7un print $bc->from_base( 'Hq7un' ); # Corrected 99999999

    This is a straight translation and will only use 5 characters max. If you want to distribute the maping from 10^8 => 60^8 in some non-guessable fashion, you've a some more work to do.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

      Correcting a few typos in BrowserUk otherwise excellent post:
      use Math::BaseCalc; my $bc = Math::BaseCalc->new( digits=> [ 'A' .. 'Z', 'a' .. 'z', 2 .. 9 ] ); print $bc->to_base( 99999999 ); print $bc->from_base( 'Hq7un' );

      If you want to distribute the maping from 10^8 => 60^8 in some non-guessable fashion, you've a some more work to do.

      From the "obfu" in the title, I get the sense that this is what the OP intended; some type of bit shuffling may be sufficient to ensure that the sequence is not predictable, but I'm not sure how best to go about it...

Re: perl / math: integer obfu
by simonm (Vicar) on Jul 26, 2003 at 22:13 UTC
    As noted above, MD5 is a one-way function, and you need a two-way function for this purpose.

    After some false starts, I realized that reversing the number, so that the least significant digit was on the left, does a good job of making the encoded versions non-sequential.

    A bit more fiddling produced the below; fiddle with the three key_* constants to produce your customized local encoding pattern.

    use strict; use Math::BaseCalc; ### Change the three lines below to ensure your coding is unique my $key_bitxor = 1231456789; my $key_multip = 3; my @key_digits = ( 'N' .. 'Z', 2 .. 9, 'a' .. 'm', 'A' .. 'M', 'n' .. +'z' ); ### Coding and decoding functions my $bcalc = Math::BaseCalc->new( digits => \@key_digits ); sub numeric_obfu { my $number = shift; my $flipped = ( $number * $key_multip ) ^ $key_bitxor; $flipped =~ s/^(\d+?)(0*?)$/reverse($1).$2/e; return $flipped; } sub numeric_defu { my $result = shift; $result =~ s/(\d+?)(0*?)$/reverse($1).$2/e; $result = ( $result ^ $key_bitxor ) / $key_multip; return $result; } ### Print sample cases my @tests = ( 0..10, 998..1004, 9999995..10000004, 99999990..99999999 +); foreach my $sample ( @tests ) { my $obfu = numeric_obfu( $sample ); my $code = $bcalc->to_base( $obfu ); my $dcod = $bcalc->from_base( $code ); my $defu = numeric_defu( $dcod ); print "Converted $sample : $obfu : $code : $dcod : $defu : " . ( $defu == $sample ? 'OK' : $dcod) . "\n"; }

    Sample output:

    Converted 0 : 9876541321 : ZIRJIO : 9876541321 : 0 : OK Converted 1 : 9765413210 : Zmj3Mq : 9765413210 : 1 : OK Converted 2 : 7876541321 : XULhVH : 7876541321 : 2 : OK Converted 3 : 6976541321 : Vy7oVH : 6976541321 : 3 : OK Converted 4 : 3976541321 : STpuoH : 3976541321 : 4 : OK Converted 5 : 4976541321 : TczlBa : 4976541321 : 5 : OK Converted 6 : 5776541321 : UeJ3oH : 5776541321 : 6 : OK Converted 7 : 8676541321 : YWiXbO : 8676541321 : 7 : OK Converted 8 : 1876541321 : PdnHhH : 1876541321 : 8 : OK Converted 9 : 2876541321 : QHx84a : 2876541321 : 9 : OK Converted 10 : 9776541321 : ZAaLva : 9776541321 : 10 : OK Converted 998 : 1364541321 : OL684a : 1364541321 : 998 : OK Converted 999 : 4264541321 : SiQ3oH : 4264541321 : 999 : OK Converted 1000 : 7364541321 : Wh4Sva : 7364541321 : 1000 : OK Converted 1001 : 8364541321 : XLdJIO : 8364541321 : 1001 : OK Converted 1002 : 5364541321 : TtvqbO : 5364541321 : 1002 : OK Converted 1003 : 6764541321 : VHx84a : 6764541321 : 1003 : OK Converted 1004 : 3764541321 : Rqheva : 3764541321 : 1004 : OK Converted 9999995 : 4174491210 : SbT9tj : 4174491210 : 9999995 : OK Converted 9999996 : 7317449121 : WdDRLa : 7317449121 : 9999996 : OK Converted 9999997 : 8317449121 : XHMIlO : 8317449121 : 9999997 : OK Converted 9999998 : 1517449121 : OxS2EH : 1517449121 : 9999998 : OK Converted 9999999 : 4417449121 : SGrWZO : 4417449121 : 9999999 : OK Converted 10000000 : 9817449121 : ZDkWZO : 9817449121 : 10000000 : OK Converted 10000001 : 9174491210 : YnuiMq : 9174491210 : 10000001 : OK Converted 10000002 : 7817449121 : XQYtEH : 7817449121 : 10000002 : OK Converted 10000003 : 6917449121 : VtL2EH : 6917449121 : 10000003 : OK Converted 10000004 : 3917449121 : SP597H : 3917449121 : 10000004 : OK Converted 99999990 : 5582525841 : UXLODa : 5582525841 : 99999990 : OK Converted 99999991 : 8482525841 : XujxXH : 8482525841 : 99999991 : OK Converted 99999992 : 1682525841 : PWph6a : 1682525841 : 99999992 : OK Converted 99999993 : 2682525841 : QfzTRO : 2682525841 : 99999993 : OK Converted 99999994 : 9582525841 : Z8clKO : 9582525841 : 99999994 : OK Converted 99999995 : 6382525841 : VZhJqH : 6382525841 : 99999995 : OK Converted 99999996 : 3382525841 : R9zqjH : 3382525841 : 99999996 : OK Converted 99999997 : 4382525841 : SEWh6a : 4382525841 : 99999997 : OK Converted 99999998 : 7482525841 : WDa8dO : 7482525841 : 99999998 : OK Converted 99999999 : 4825258410 : TZ8Wmj : 4825258410 : 99999999 : OK

    Obviously, that's just obfuscated, not really encrypted, and if someone assembled a list of codes in incremental numeric sequence they'd be able to figure things out, but it's pretty opaque to your typical end-user.

Re: perl / math: integer obfu
by rkg (Hermit) on Jul 26, 2003 at 11:47 UTC
    Me again. Yes, the obfu is important here. I suppose I could use the straight base 60 approach (above) here and tack on a few digits of checksum to make it less guessable, but again, I have the sense MD5 or something similar might solve my problem here a little more securely, but I don't quite understand it...
    rkg
      The problem with using MD5 in this case is that MD5 is a one-way function. That is to say that it is easy to calculate the MD5 value of a given input, but it is hard to obtain the original input given the MD5 value.

      However, if you have a user input a value, you can store the MD5 value. Then, to validate that the user enters the same value next time, you calculate the MD5 value of the new input and check it against what was stored originally. This scheme is used to validate passowrds in every Unix system that I know (though MD5 isn't always the algorithm used). Also,

      Yes, the obfu is important here
      Are you hoping to implement "security through obscurity"? Some would liken that to locking a safe with a piece of tape. While it keeps the casual observer from figuring it out, someone who is determined enough will break it.

      thor