in reply to Re: Files
in thread Files, unpack and text values

I can see clearly what the problem with my unpack template is now. It's not unpack at all that's giving me the trouble, it's just that I don't fully understand what to expect from a binary file! As you pointed out, I only have one element in my template, but how many should I have? I can't just guess a number I want and force the result back, or can I?

I've done some experimenting with unpack to try to obtain the same results I got from opening the file in a hex editor (UltraEdit). The file contents (according to the editor) should be something like this:
47 49 46 38 39 61 0f 00 10 00 a2 ff 00 84 86 84 df e0 df c6 c7 c6 c0 c0 c0 00 00 00 00 00 00 00 00 00 00 00 00 21 f9 04 01 00 00 03 00 2c 00 00 00 00 0f 00 10 00 40 03 40 48 ba ac f3 60 80 49 a9 88 24 ca 0a ae 78 d9 b3 75 a4 16 4a 92 47 0e 01 6a 8d de 10 72 1c 48 df c0 29 ee 52 8b 0a c0 e0 67 17 12 1a 81 32 cd 67 29 b4 ed 8c c4 c8 91 97 a4 5a 09 b2 86 56 91 00 00 3b
and my script is giving me this output with a "H1024" template (line breaks I inserted myself just for the sake of comparing them both at the 16th position):
4749463839610f001000a2ff00848684
dfe0dfc6c7c6c0c0c000000000000000
000000000021f90401000003002c0000
00000f00100040034048baacf3608049
a98824ca0a ae78d9b375a4164a92470e016a8dde10
721c48dfc029ee528b0a c0e0671712
My unpack result seems to be munching the last line and inserting line breaks in odd places, none of which I quite grasp... By the way, you can find the image I'm using as an example here. I have tried using the first block of hex values with pack to see if I could get an image back out, but that didn't help much either. So I have reached a dead end with the unpack method so far, and I refuse to believe that this cannot be achieved with Perl.

Newly revised questions:

Thank you again for your patience, attention, and coping with my stubbornness!

RE: Why, what, where to unpack?! (was: Files, unpack and text values)
by chromatic (Archbishop) on Jun 26, 2000 at 07:38 UTC
    Answers to questions:
    • If you don't know what to expect in return, unpack's not the right function for the job.
    • No. A file (whether binary or text) is just a stream of bytes. Text files are said to be stored line-by-line because, in certain places, they have specific characters. $/ is also known as the Input Record Separator because it contains that special character -- usually a newline. The I/O routines read a chunk of bytes from a file and split it up based on the presence of whatever's in $/. That happens to default to \n. In a binary file, there are no lines.
    • It's a nice power of two, and it fits nicely across the screen. No real technical reason of which I'm aware.
    • Nope.
    • Low-level I/O handling routine documentation, probably.
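
The point about $/ can be sketched with a small experiment. This is only an illustration, not code from the thread: the eight-byte sample string and the in-memory filehandle are assumptions made so the example runs without foo.gif. It reads the same bytes once with the default record separator and once with $/ undefined.

```perl
use strict;
use warnings;

# Hypothetical sample: eight bytes, two of which happen to be 0x0a.
my $data = "\x47\x49\x0a\x46\x38\x0a\x39\x61";

# Read "line by line" through an in-memory filehandle.
open my $fh, '<', \$data or die "Couldn't open in-memory handle: $!\n";
binmode $fh;
my @records = <$fh>;    # $/ defaults to "\n", so a record ends after each 0x0a
close $fh;

# Read again in one piece, with $/ undefined only inside the do block.
open $fh, '<', \$data or die "Couldn't open in-memory handle: $!\n";
binmode $fh;
my $whole = do { local $/; <$fh> };
close $fh;

print scalar(@records), " records: ",
    join(' | ', map { unpack('H*', $_) } @records), "\n";
print "slurped: ", unpack('H*', $whole), "\n";
```

With the default $/ the bytes come back as three separate records, each ending right after a 0a byte, which is the same pattern as the odd breaks after 0a in the output quoted in the question. With $/ undefined the data arrives as a single string.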
RE: Why, what, where to unpack?! (was: Files, unpack and text values)
by mdillon (Priest) on Jun 26, 2000 at 06:34 UTC
    i believe that a single element consisting of 'H*' is the template you are interested in.

    i'm not sure if it is correct to loop on a binary file handle like that, but it certainly doesn't make much sense to me: "for each line in this binary file" is a contradiction in terms, since binary files don't have any "lines".

    i don't know why your hex editor breaks strings at that interval, but it seems a little narrow for you to store it that way. i would use 36 bytes per line with no spaces, myself.

    the following code works fine for me. however, i would probably use MIME::Base64 to encode the data if it were my choice. plain hex is just a little too fat for my taste.

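    open IMG, "foo.gif" or die "Couldn't open image: $!\n";
    binmode IMG;    # avoid newline translation on platforms that do it
    undef $/;       # no record separator: the next read returns the whole file
    $image = <IMG>;
    print unpack("H*", $image);
    close IMG;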
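
To put a rough number on the "too fat" remark about plain hex, here is a minimal sketch comparing the two encodings. It assumes the core MIME::Base64 module is available, and it uses the first 12 bytes of the GIF quoted in the question as a stand-in for the slurped $image, so it is not a measurement of the real file.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Stand-in for the slurped image: the first 12 bytes of the GIF from the question.
my $image = pack('H*', '4749463839610f001000a2ff');

my $hex = unpack('H*', $image);
my $b64 = encode_base64($image, '');    # '' as the second argument suppresses line breaks

printf "raw: %d bytes, hex: %d chars, base64: %d chars\n",
    length($image), length($hex), length($b64);

# Both encodings are reversible.
print "hex round trip ok\n"    if pack('H*', $hex) eq $image;
print "base64 round trip ok\n" if decode_base64($b64) eq $image;
```

Hex always needs two characters per byte, while Base64 needs four characters per three bytes, so for these 12 bytes the encoded lengths are 24 and 16 characters respectively.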
      You are absolutely right! That did the trick... This is what I came up with to get the output looking somewhat like my hex editor's:
      open IMG, "foo.gif" or die "Couldn't open image: $!\n";
      binmode IMG;    # avoid newline translation on platforms that do it
      undef $/;       # slurp the whole file in one read
      $image = <IMG>;
      $hex = unpack("H*", $image);
      close IMG;
      # Peel off 32 hex digits (16 bytes) at a time and space out each byte.
      while ($txt = substr($hex, 0, 32, '')) {
          $txt =~ s/(..)/$1 /g;
          print $txt . "\n";
      }
      Now all I have to do is figure out where those extra bytes went (the results are still not exactly the same), and who's goofing up on this one. I bet you it's the editor...

      #!/home/bbq/bin/perl
      # Trust no1!
        the slurp/unpack method does not lose any bytes. i checked my code's round-trip integrity with cmp(1) like so:
        [mike@prometheus ~]$ perl -e 'undef $/; open IMG, "foo.gif"; print pack("H*", unpack("H*", <IMG>));' | cmp foo.gif -
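
The same round-trip check can also be sketched in Perl alone, without cmp(1) or a file on disk. This is an assumption-laden variant rather than the check run above: instead of foo.gif it uses a synthetic string containing every possible byte value, so 0x00 and 0x0a are covered too.

```perl
use strict;
use warnings;

# All 256 possible byte values, in order.
my $bytes = join '', map { chr($_) } 0 .. 255;

my $hex       = unpack('H*', $bytes);    # two hex digits per byte
my $roundtrip = pack('H*', $hex);        # and back to raw bytes

my $length_ok = length($hex) == 2 * length($bytes);
my $same      = $roundtrip eq $bytes;

print $length_ok ? "hex length ok\n" : "hex length mismatch\n";
print $same      ? "round trip ok\n" : "round trip differs\n";
```

If both checks pass, any bytes that go missing are being lost while reading the file (for example by reading it line by line or without binmode on some platforms), not in pack or unpack.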