MikeM16 has asked for the wisdom of the Perl Monks concerning the following question:

This should be simple but I'm running into a problem. What I have is this:
open FILE, ">output.dat";
$data = "\xff\x01\x67";
print FILE $data;
close FILE;
Now when I open that file in a hex editor, I expect to see exactly what I typed into $data: ff 01 67. Instead I get:
$ od -x output.dat
0000000 bfc3 6701
0000004
I'm running this with Perl 5.8.0 on Red Hat 8. Any help would be appreciated; this is driving me crazy.

Replies are listed 'Best First'.
Re: Hex string output
by BrowserUk (Patriarch) on Oct 08, 2003 at 04:33 UTC

    I think you're being bitten by Unicode conversion. You could try using :raw on the open:

    open FILE, '>:raw', 'output.dat' or die $!;

    or using binmode on FILE before writing to it.
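The binmode suggestion above can be sketched roughly as follows. This is a minimal example, not the poster's code: it assumes a Perl with the core File::Temp module and writes to a temporary file (a stand-in for output.dat) so that nothing in the current directory is overwritten.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $data = "\xff\x01\x67";

# Assumption: a temporary file stands in for output.dat from the question.
my ($fh, $file) = tempfile(UNLINK => 1);

binmode $fh;                 # drop any implicit layers; bytes go out as-is
print {$fh} $data;
close $fh or die "Cannot close $file: $!";

# Read the file back without any layers and show its bytes as hex.
open my $in, '<:raw', $file or die "Cannot open $file: $!";
my $bytes = do { local $/; <$in> };
close $in;

print unpack('H*', $bytes), "\n";   # prints ff0167
```

The important part is that binmode is called before the first print; calling it afterwards would leave the already-written bytes encoded.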


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

      I'm never sure about binmode; I've never had to use it. The same goes for utf8, but that's where your problem lies. Go check out perldoc perldelta and see what it says about utf8.

      $ perl -e '$f="\xff\x01\x67";print $f' | od -t x1
      0000000 ff 01 67
      0000003
      $ perl -e '$f="\xff\x01\x67";use Encode; $f=encode("utf8", decode("iso-8859-1",$f));print $f' | od -t x1
      0000000 c3 bf 01 67
      0000004
      binmode FILE

      solved my problem... thanks for the help, everyone. It was indeed that nasty business with the utf8/RedHat 9/Perl 5.8.0
Re: Hex string output
by davido (Cardinal) on Oct 08, 2003 at 04:19 UTC
    You probably need to unpack it if you want it rolled out to the stringified hex representation:

    use strict;
    use warnings;
    open FILE, ">output.dat";
    my $data = "\xff\x01\x67";
    print FILE unpack("H*",$data);
    close FILE;

    Update: Gulp... whoops, you said you're using a hex editor, which presumably "unpacks" it to the stringified hex representation for you. Perhaps the problem is that you're saving it little-endian and reading it big-endian, or vice versa, in which case pack and unpack will still be helpful for converting back and forth.

    UPDATE2: I was unable to replicate the problem you describe with the following code snippet:

    use strict;
    use warnings;
    open FO, ">test.out";
    my $data = "\xff\x01\x67";
    print FO $data,"\n";
    close FO;
    open FI, "<test.out";
    while ( <FI> ) {
        chomp;
        print unpack("H*",$_), "\n";
    }
    close FI;

    So my suspicion is one of the following: your hex editor is misconfigured, or there is more being saved to the output file than illustrated in your example code snippet, or there's some sort of filesystem layer acting here based on a locale definition.


    Dave


    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein
Re: Hex string output
by Abigail-II (Bishop) on Oct 08, 2003 at 10:04 UTC
    Known problem, which should be fixed in Perl 5.8.1. From perldelta:
    For example, if you had "en_US.UTF-8" as your locale, your STDIN and STDOUT were automatically "UTF-8", in other words an implicit binmode(..., ":utf8") was made. This meant that trying to print, say, chr(0xff), ended up printing the bytes 0xc3 0xbf. Hardly what you had in mind unless you were aware of this feature of Perl 5.8.0. The problem is that the vast majority of people weren't: for example in RedHat releases 8 and 9 the default locale setting is UTF-8, so all RedHat users got UTF-8 filehandles, whether they wanted it or not. The pain was intensified by the Unicode implementation of Perl 5.8.0 (still) having nasty bugs, especially related to the use of s/// and tr///. (Bugs that have been fixed in 5.8.1)

    Therefore a decision was made to backtrack the feature and change it from implicit silent default to explicit conscious option. The new Perl command line option "-C" and its counterpart environment variable PERL_UNICODE can now be used to control how Perl and Unicode interact at interfaces like I/O and for example the command line arguments. See "-C" in perlrun and "PERL_UNICODE" in perlrun for more information.
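To see what that implicit binmode(..., ":utf8") does to the string from the question, here is a small sketch that writes the same data through a :raw handle and through a :utf8 handle. It uses in-memory filehandles (my choice for the illustration, so the example does not depend on locale or touch the disk); the variable names are hypothetical.

```perl
use strict;
use warnings;

my $data = "\xff\x01\x67";

# In-memory "files": whatever is printed ends up in these scalars as bytes.
my ($raw, $utf8) = ('', '');

open my $raw_fh, '>:raw', \$raw or die "open failed: $!";
print {$raw_fh} $data;
close $raw_fh;

# Explicitly request the layer that Perl 5.8.0 applied silently
# under a UTF-8 locale.
open my $utf8_fh, '>:utf8', \$utf8 or die "open failed: $!";
print {$utf8_fh} $data;
close $utf8_fh;

print "raw:  ", unpack('H*', $raw),  "\n";   # ff0167
print "utf8: ", unpack('H*', $utf8), "\n";   # c3bf0167
```

The character 0xff becomes the two bytes c3 bf under the :utf8 layer, while 0x01 and 0x67 pass through unchanged, which accounts for the four-byte file the original poster saw.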

    Abigail

Re: Hex string output
by DrHyde (Prior) on Oct 08, 2003 at 08:47 UTC
    If you're getting bitten by endianness issues (Perl will spit out the bytes in your desired order, but your hex editor might be trying to be too clever for its own good) then you may find my module Data::Hexdumper useful, which lets you specify endianness and word size yourself.
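Endianness does play a part in how the od -x output in the question looks, though not in what was written. As a rough illustration using only the core unpack function (not Data::Hexdumper), and assuming the file really contains the four bytes c3 bf 01 67 and that od -x groups them into 16-bit words in the byte order of a little-endian machine such as x86:

```perl
use strict;
use warnings;

# Assumption: these are the four bytes actually stored in the file.
my $bytes = "\xc3\xbf\x01\x67";

# The same bytes viewed one at a time, and as 16-bit words in each byte order.
my $byte_view = join ' ', map { sprintf '%02x', $_ } unpack 'C*', $bytes;
my $le_words  = join ' ', map { sprintf '%04x', $_ } unpack 'v*', $bytes;
my $be_words  = join ' ', map { sprintf '%04x', $_ } unpack 'n*', $bytes;

print "bytes:         $byte_view\n";   # c3 bf 01 67
print "little-endian: $le_words\n";    # bfc3 6701
print "big-endian:    $be_words\n";    # c3bf 0167
```

The little-endian word view matches the "bfc3 6701" shown by od -x, so the byte swapping is only a display effect; the extra byte comes from the UTF-8 conversion. A byte-oriented dump such as od -t x1 avoids the confusion.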