Re: problems packing 8 bit data
by BrowserUk (Patriarch) on Aug 15, 2003 at 17:25 UTC
P:\test>perl58 -le"$_ = pack 'C*', 0x7f .. 0x81; print length, qq[ : '$_' ]"
3 : '⌂Çü'
How are you verifying the contents of the string?
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.
Re: problems packing 8 bit data
by Anonymous Monk on Aug 15, 2003 at 17:40 UTC
I am verifying it via:
perl -e 'print pack("C*", 0x7e..0x81);' | od -tx1
And it's not a separator thing: with the extra character (0x7e) added, you don't get a 0xc2 between every byte, only preceding the bytes 0x80 and above.
And yes, I'm currently using perl 5.8.0 on RH 9.0
I'm willing to bet that the utf-ifying occurs when you print it, not when you pack it. Try assigning to a variable and then printing the length.
Why print would utf-ify the output will come down to which PerlIO 'layers' (e.g. :raw, :utf8, etc.) are in use on STDOUT, which I know nothing about beyond that they exist. Take a look at perliol.
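For what it's worth, here is a minimal sketch of that test. It assumes a perl with in-memory filehandles (5.8 or later), and the helper name bytes_written is made up for illustration. It prints the same packed string through a :raw layer and a :utf8 layer and hex-dumps what each layer actually wrote, so the difference shows up without involving STDOUT or od.

```perl
use strict;
use warnings;

# Pack three values into a three-character byte string.
my $packed = pack 'C*', 0x7f .. 0x81;

# Hypothetical helper: print $string through $layer into an in-memory
# buffer and return the bytes that were written as a hex string.
sub bytes_written {
    my ($layer, $string) = @_;
    my $buffer = '';
    open my $fh, ">$layer", \$buffer or die "open failed: $!";
    print {$fh} $string;
    close $fh or die "close failed: $!";
    return join ' ', map { sprintf '%02x', ord } split //, $buffer;
}

my $raw_hex  = bytes_written(':raw',  $packed);
my $utf8_hex = bytes_written(':utf8', $packed);

print 'length: ', length($packed), "\n";   # the string itself has 3 characters
print "raw:  $raw_hex\n";                  # 7f 80 81
print "utf8: $utf8_hex\n";                 # 7f c2 80 c2 81
```

If the length is 3 and only the :utf8 line shows the extra c2 bytes, pack is doing its job and the output layer is the culprit.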
After reviewing your response and the testing I've been doing in between, I have come to the conclusion that I agree with you.
The string is correct; it's the interpretation on output that needs to change. Calling binmode does alter the resulting output.
Now, if someone could just explain what translation is really going on in this particular bit-pattern example,
i.e. why a 0x80 is actually represented as 0xc2 0x80.
Re: problems packing 8 bit data
by Kageneko (Scribe) on Aug 15, 2003 at 17:18 UTC
I don't quite know how to help with your problems, but it seems to be fixed in a later snapshot of 5.8.1:
> /appl/cpc/devel/bin/perl -v
This is perl, v5.8.1 built for PA-RISC2.0-thread-multi
(with 1 registered patch, see perl -V for more detail)
> /appl/cpc/devel/bin/perl -e '$a = pack("C*", 0x7f..0x81); for ( $i = 0; $i < length($a); ++$i ) { print ord(substr($a, $i, 1)) . " " }; print "\n"'
127 128 129
I have a vague feeling that the 'c2' you are seeing is the list separator character. You might try setting $" = '' and see what happens. Also try $\ and $/.
No, c2 80 is the UTF-8 encoding of the value 0x80.
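To spell out where the c2 comes from: UTF-8 stores code points 0x80 through 0x7ff in two bytes, a lead byte of the form 110xxxxx holding the top five bits and a continuation byte of the form 10xxxxxx holding the low six. The sketch below does that arithmetic by hand; the sub name utf8_bytes is made up for illustration, and it deliberately covers only code points below 0x800.

```perl
use strict;
use warnings;

# Encode one code point below 0x800 as UTF-8 by hand.
sub utf8_bytes {
    my ($cp) = @_;
    return ($cp) if $cp < 0x80;    # ASCII range: one byte, unchanged
    die "sketch only covers code points below 0x800\n" if $cp >= 0x800;
    return (
        0xC0 | ($cp >> 6),         # lead byte 110xxxxx: top five bits
        0x80 | ($cp & 0x3F),       # continuation byte 10xxxxxx: low six bits
    );
}

my %hex;
for my $cp (0x7f, 0x80, 0x81) {
    $hex{$cp} = join ' ', map { sprintf '%02x', $_ } utf8_bytes($cp);
    printf "0x%02x -> %s\n", $cp, $hex{$cp};
}
```

For 0x80 the top bits are 00010 and the low six bits are all zero, giving 11000010 10000000, which is c2 80. 0x7f stays a single byte, which is why nothing is inserted before it.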
Re: problems packing 8 bit data
by Willard B. Trophy (Hermit) on Aug 25, 2003 at 15:26 UTC
This seems to depend on $ENV{LANG}. On a Red Hat 9 box, with LANG set to en_CA.UTF-8, I get the extra UTF-8 characters. Setting LANG to en_US works as you'd want, though.
-- bowling trophy thieves, die!
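If changing LANG is not an option, a script can make itself independent of the locale. This is only a sketch: it assumes PerlIO::get_layers is available (it is documented in the PerlIO manpage of later perls and may be missing from 5.8.0 itself). It calls binmode on STDOUT and then reports which layers are left.

```perl
use strict;
use warnings;

# Put STDOUT into byte mode, whatever the locale would have selected.
binmode STDOUT;

# PerlIO::get_layers lists the layers on a handle; "utf8" shows up in
# the list when the handle is encoding its output as UTF-8.
my @layers   = PerlIO::get_layers(*STDOUT);
my $has_utf8 = grep { $_ eq 'utf8' } @layers;

my $lang = defined $ENV{LANG} ? $ENV{LANG} : '(unset)';
print "LANG=$lang layers=@layers utf8=", ($has_utf8 ? 'yes' : 'no'), "\n";
```

With STDOUT in byte mode, a following print pack("C*", 0x7e..0x81) piped through od -tx1 should show four bytes and no c2, whether LANG names a UTF-8 locale or not.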