in reply to ascii to binary

use List::Util qw( sum );
my $id = (sum map ord, map /./sg, $word) % 65536;

Ref: ord, List::Util

Update: The above can lose precision on very long strings, because the sum grows without bound before the modulus is applied. Fix:

my $id = 0;
foreach (map ord, map /./sg, $word) {
    $id = ($id + $_) % 65536;
}

Update: What you are doing is called hashing. The following uses a better hashing function. Of course, that means it'll return a different number than yours.

use Digest::MD5 qw( md5 );
my $id = unpack('n', substr(md5($word), -2));

Ref: Digest::MD5, unpack

Re^2: ascii to binary
by Marsel (Sexton) on Dec 04, 2006 at 16:39 UTC
    Thanks a lot!

    To answer the previous reply, my words look like:

    - 1237_at, 23493_s_at, ...
    - gsm12832, gsm23948, ...
    - or just float values.


    The idea was to derive a hash (thanks for putting a name to what I wanted to do!) of these 3 lists, and then have an ID like EA3D4B which would be unique and independent of the order in which the words were given in each list.

    I think this will work perfectly, thanks again.

    marcel

      Note that an MD5 sum is not independent of the order of the words. You could however split, sort, then join to get the words in the same order each time.
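A minimal sketch of the split, sort, join approach, assuming the words in each list are whitespace-separated and contain no NUL bytes (NUL is used as the join separator here). The helper name `list_id` and the 6-hex-digit length are illustrative choices, not something given in the thread.

```perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );

# Hypothetical helper: derive an ID that does not depend on word order.
sub list_id {
    my ($list) = @_;
    my @words  = sort split ' ', $list;          # same order every time
    my $digest = md5_hex(join "\0", @words);     # "\0" keeps word boundaries distinct
    return uc substr($digest, 0, 6);             # first 24 bits as 6 hex digits
}

print list_id('1237_at 23493_s_at'), "\n";
print list_id('23493_s_at 1237_at'), "\n";      # same ID as the line above
```

Joining on a separator that cannot occur in a word matters: without it, the lists ("ab", "c") and ("a", "bc") would be hashed as the same string.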

      Your first technique would more properly be called a checksum and is independent of character order.
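To illustrate the difference, here is a small sketch comparing the additive checksum with MD5 on two strings that contain the same characters in a different order. The sub name `checksum16` is made up for this example; its body is the per-step modulo loop from the fix above.

```perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );

# Additive 16-bit checksum, reduced modulo 65536 at each step.
sub checksum16 {
    my ($word) = @_;
    my $id = 0;
    $id = ($id + ord($_)) % 65536 for split //, $word;
    return $id;
}

# Same characters, different order: the checksums match ...
print checksum16('abc'), ' ', checksum16('cba'), "\n";

# ... but the MD5 digests do not.
print md5_hex('abc'), "\n", md5_hex('cba'), "\n";
```

Note that the checksum is blind to character order inside a word as well as to word order, so "1237_at" and "7321_at" also collide.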

      Note too that 16 bits allows only 65,536 distinct values, far fewer than the 128 bits of a full MD5 hash. Depending on how many strings you are working with, there may be a fairly high chance that different strings end up with identical checksums when only 16 bits are used.
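As a rough way to put numbers on that, the sketch below uses the standard birthday approximation, 1 - exp(-n(n-1)/(2N)), for n strings and N possible values. It assumes the values are spread uniformly, which is reasonable for truncated MD5 output but optimistic for a sum of character codes, whose results cluster in a narrow range. The sub name `collision_chance` is made up for this example.

```perl
use strict;
use warnings;

# Approximate chance that at least two of $n strings share a value
# when values are spread uniformly over $buckets possibilities.
sub collision_chance {
    my ($n, $buckets) = @_;
    return 1 - exp(-$n * ($n - 1) / (2 * $buckets));
}

# Chance of at least one collision among 100, 300 and 1000 strings at 16 bits.
printf "%5d strings: %.1f%%\n", $_, 100 * collision_chance($_, 2**16)
    for 100, 300, 1000;
```

Under this approximation the chance of at least one collision passes 50% at roughly 300 strings for a 16-bit ID.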


      DWIM is Perl's answer to Gödel