Converting ascii to numbers

toonski has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Converting ascii to numbers (unpack) by tye (Sage) on Feb 15, 2004 at 05:42 UTC
`$a = join ' ', unpack 'C*', $a;` - tye
Re: Re: Converting ascii to numbers (unpack) by davido (Cardinal) on Feb 15, 2004 at 06:12 UTC
tye has got it. Though I haven't benchmarked it, unpack is a very efficient method. Here's a dissection: `$a = join ' ', unpack 'C*', $a;` That is roughly the same thing as: `@temparray = unpack 'C*', $a; $a = join ' ', @temparray;`

The first part, the unpack, uses the template 'C*', which reads like this: 'C' takes one byte and converts it to an unsigned char value (base 10). Note that, per perldoc -f pack, 'C' only works with byte-width characters. For Unicode you would probably use U, but that doesn't appear to be an issue in your case. The asterisk in the unpack template just means to repeat that 'C' template for as long as there are more bytes to unpack into unsigned char values. So the result is a list of unsigned char values (which happen to be the ASCII values) corresponding to the characters (the bytes) in the original string.

The next part, the join, concatenates the list of unsigned char values into one long string, with each value separated from the next by a single space character (presumably so you have some prayer of knowing where one unsigned char value ends and the next one starts in the string). In tye's example, the @temparray is avoided by allowing unpack to spill its list of unsigned char values directly into the parameter list of join. Dave
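As a quick illustration of the 'C*' template described above, here is a minimal sketch. The sub name `to_codes` and the sample string are made up for this example; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the space-separated byte values of a string.
sub to_codes {
    my ($str) = @_;
    # 'C*' unpacks every byte as an unsigned char value;
    # join then glues the resulting list together with spaces.
    return join ' ', unpack 'C*', $str;
}

print to_codes("Perl"), "\n";   # prints "80 101 114 108"
```

The list returned by unpack goes straight into join, so no temporary array is needed.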
Re: Re: Re: Converting ascii to numbers (unpack) by Anonymous Monk on Feb 15, 2004 at 08:23 UTC
Convert::ASCII::String implements the approach described above.
Re: Re: Re: Converting ascii to numbers (unpack) by ysth (Canon) on Feb 16, 2004 at 04:08 UTC
There are a number of assumptions involved in benchmarking this, but my attempt shows s/// to be twice as fast: `use Benchmark 'cmpthese'; use strict; use warnings; my $big; $big .= join '',map chr, 0..255 for 0..255; print length($big), " characters.\n"; sub subst { my $tmp; ($tmp=$big) =~ s/(.)/ord($1).' '/seg; $tmp } sub unpac { my $tmp; $tmp = join ' ', unpack 'C*', $big; $tmp } print length(subst()), " characters in ascii numbers.\n"; print "whoops!\n" if subst() ne unpac().' '; cmpthese( -10, { subst => \&subst, unpac => \&unpac });`
Re: Converting ascii to numbers by blokhead (Monsignor) on Feb 15, 2004 at 05:21 UTC
There are probably a zillion ways to do this. The simplest to me seems to be an s///e substitution: `$x =~ s/(.)/ord $1/egs;` If you ever want to get the data back, though, this is a bad encoding. For instance, do you decode "64" as `chr(6).chr(4)` or just `chr(64)`? Maybe you should pad the ASCII values out to 3 digits (though this probably won't work with some wide Unicode characters): `$x =~ s/(.)/sprintf "%03d", ord $1/egs;` Then to get the characters back (note the /e, so that chr is evaluated as code): `$x =~ s/(\d{3})/chr $1/eg;` blokhead
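A round trip through the padded encoding might look like the sketch below. The sub names `encode_padded` and `decode_padded` are hypothetical, and the sketch assumes every character has a code below 1000 so that three digits are enough.

```perl
use strict;
use warnings;

# Hypothetical helper: encode each character as a zero-padded 3-digit code.
sub encode_padded {
    my ($s) = @_;
    $s =~ s/(.)/sprintf "%03d", ord $1/egs;   # /s so that . matches newlines too
    return $s;
}

# Hypothetical helper: turn each group of 3 digits back into a character.
sub decode_padded {
    my ($s) = @_;
    $s =~ s/(\d{3})/chr $1/eg;                # /e so that chr runs as code
    return $s;
}

my $encoded = encode_padded("Hi\n");
print "$encoded\n";                           # prints "072105010"
print "round trip ok\n" if decode_padded($encoded) eq "Hi\n";
```

Because every code occupies exactly three digits, the decoder never has to guess where one value ends and the next begins.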
Re: Converting ascii to numbers by diotalevi (Canon) on Feb 15, 2004 at 05:19 UTC
Your original code has a bug: `.` doesn't match newlines without /s. Also, are you sure you want a straight numeric translation? How would you know where one character ends and another begins? I used a different sprintf format so you can see where characters end. `s((.))(sprintf "0x%02x ", ord $1)gse`
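To make the delimited hex format concrete, here is a small sketch. The sub names `to_hex` and `from_hex` are invented for illustration, and the decoder is an addition that was not part of the reply; it assumes byte-width characters, so each code is exactly two hex digits.

```perl
use strict;
use warnings;

# Hypothetical helper: encode each character as "0xNN " (note the trailing space).
sub to_hex {
    my ($s) = @_;
    $s =~ s/(.)/sprintf "0x%02x ", ord $1/gse;
    return $s;
}

# Hypothetical helper: decode "0xNN " groups back into characters.
sub from_hex {
    my ($s) = @_;
    $s =~ s/0x([0-9a-f]{2}) /chr hex $1/ge;
    return $s;
}

print to_hex("Hi"), "\n";    # prints "0x48 0x69 "
print "round trip ok\n" if from_hex(to_hex("Hi\n")) eq "Hi\n";
```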
Re: Converting ascii to numbers by Skeeve (Parson) on Feb 15, 2004 at 11:46 UTC
In the spirit of TMTOWTDI: `$x = join ' ', map ord, split //, $x;`
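For plain byte strings the split/map version gives the same output as the unpack version from earlier in the thread. A minimal check, with the sub names `via_split` and `via_unpack` made up for this sketch:

```perl
use strict;
use warnings;

# Hypothetical helper: split into single characters, then map each through ord.
sub via_split {
    my ($x) = @_;
    return join ' ', map ord, split //, $x;
}

# Hypothetical helper: the unpack approach, for comparison.
sub via_unpack {
    my ($x) = @_;
    return join ' ', unpack 'C*', $x;
}

print via_split("abc"), "\n";    # prints "97 98 99"
print "same result\n" if via_split("abc") eq via_unpack("abc");
```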
Re: Converting ascii to numbers by Abigail-II (Bishop) on Feb 15, 2004 at 16:42 UTC
"Is there a way to get around this using a single regex or without resorting to pack() or am I better off just running through each character in a for loop with substr and converting it that way?"

`pack()` is by far the fastest method of doing so, and a regex is most likely to be the second fastest method. But you want to dismiss both, and it's not clear to me why. `substr()` might be a "best of the rest", but does it really matter? Abigail
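For completeness, the substr loop mentioned in the question could be sketched roughly as follows. The sub name `via_substr` is hypothetical, and nothing here is claimed about its speed relative to the other approaches.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the string one character at a time with substr.
sub via_substr {
    my ($str) = @_;
    my @codes;
    for my $i (0 .. length($str) - 1) {
        push @codes, ord substr($str, $i, 1);
    }
    return join ' ', @codes;
}

print via_substr("Perl"), "\n";   # prints "80 101 114 108"
```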