Why, what, where to unpack?! (was: Files, unpack and text values)

I can see clearly what the problem with my unpack template is now. It's not unpack at all that's giving me the trouble, it's just that I don't fully understand what to expect from a binary file! As you pointed out, I only have one element in my template, but how many should I have? I can't just guess a number I want and force the result back, or can I?

I've done some experimenting with unpack to try and obtain the same results I got from opening the file in a hex editor (UltraEdit). The file contents (according to the editor) should be something like this:

47 49 46 38 39 61 0f 00 10 00 a2 ff 00 84 86 84
df e0 df c6 c7 c6 c0 c0 c0 00 00 00 00 00 00 00
00 00 00 00 00 21 f9 04 01 00 00 03 00 2c 00 00
00 00 0f 00 10 00 40 03 40 48 ba ac f3 60 80 49
a9 88 24 ca 0a ae 78 d9 b3 75 a4 16 4a 92 47 0e
01 6a 8d de 10 72 1c 48 df c0 29 ee 52 8b 0a c0
e0 67 17 12 1a 81 32 cd 67 29 b4 ed 8c c4 c8 91
97 a4 5a 09 b2 86 56 91 00 00 3b

and my script is giving me this output with a "H1024" template (line breaks I inserted myself just for the sake of comparing them both at the 16th position):

4749463839610f001000a2ff00848684\n
dfe0dfc6c7c6c0c0c000000000000000\n
000000000021f90401000003002c0000\n
00000f00100040034048baacf3608049\n
a98824ca0a 
ae78d9b375a4164a92470e016a8dde10\n
721c48dfc029ee528b0a 
c0e0671712

My unpack result seems to be munching the last line and inserting line breaks in odd places, none of which I quite grasp... By the way, you can find the image I'm using as an example here. I have tried using the 1st block of hexes with pack to see if I could get an image output, but that didn't help much either. So I have reached a dead end with the unpack method so far, and I refuse to believe that this cannot be achieved with Perl. Newly revised questions:
How can one offer unpack a template without knowing what to expect in return?

Is it correct to run a loop on the file handle in this case?

Why does my hex editor break the strings at every 16th position?

Should I run a break on the 16th position with substr, or similar?

Is there any documentation out there that covers this?
Thank you again for your patience, attention, and coping with my stubbornness!


Replies are listed 'Best First'.
RE: Why, what, where to unpack?! (was: Files, unpack and text values) by chromatic (Archbishop) on Jun 26, 2000 at 07:38 UTC
Answers to questions:
1. If you don't know what to expect in return, unpack's not the right function for the job.
2. No. A file (whether binary or text) is just a stream of bytes. Text files are said to be stored line-by-line because, in certain places, they have specific characters. $/ is also known as the Input Record Separator because it contains that special character -- usually a newline. The I/O routines read a chunk of bytes from a file and split it up based on the presence of whatever's in $/. That happens to default to \n. In a binary file, there are no lines.
3. It's a nice power of two, and it fits nicely across the screen. No real technical reason of which I'm aware.
4. Nope.
5. Low-level I/O handling routine documentation, probably.
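To illustrate the first answer: unpack becomes useful once you do know the layout. A minimal sketch, assuming the standard GIF header layout (a 6-byte signature followed by a 7-byte logical screen descriptor) and using the first 13 bytes from the hex dump in the question. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# First 13 bytes of the GIF, copied from the hex dump in the question.
my $header = pack('H*', '4749463839610f001000a2ff00');

# A6 = 6-byte ASCII signature
# v  = unsigned 16-bit little-endian integer (width, then height)
# C  = unsigned 8-bit integer (packed flags, background index, aspect ratio)
my ($signature, $width, $height, $flags, $bg_index, $aspect)
    = unpack('A6 v v C C C', $header);

printf "%s: %d x %d pixels, flags 0x%02x\n",
    $signature, $width, $height, $flags;
# prints "GIF89a: 15 x 16 pixels, flags 0xa2"
```

The template has one element per field, which is why the format has to be known before the template can be written.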
RE: Why, what, where to unpack?! (was: Files, unpack and text values) by mdillon (Priest) on Jun 26, 2000 at 06:34 UTC
i believe that a single element consisting of 'H*' is the template you are interested in. i'm not sure if it is correct to loop on a binary file handle like that, but it certainly doesn't make much sense to me: "for each line in this binary file" is a contradiction in terms, since binary files don't have any "lines". i don't know why your hex editor breaks strings at that interval, but it seems a little narrow for you to store it that way. i would use 36 bytes per line with no spaces, myself. the following code works fine for me. however, i would probably use MIME::Base64 to encode the data if it were my choice. plain hex is just a little too fat for my taste. `open IMG, "foo.gif" or die "Couldn't open image: $!\n"; undef $/; $image = <IMG>; print unpack("H*", $image); close IMG;`
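For a rough idea of how "fat" plain hex is next to MIME::Base64, here is a small sketch that encodes the same bytes both ways and compares lengths. It uses the first 12 bytes of the hex dump from the question as sample data; the variable names are only illustrative.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# First 12 bytes of the GIF, copied from the hex dump in the question.
my $bytes = pack('H*', '4749463839610f001000a2ff');

my $hex    = unpack('H*', $bytes);        # two characters per byte
my $base64 = encode_base64($bytes, '');   # '' suppresses the default line breaks

printf "raw: %d bytes, hex: %d chars, base64: %d chars\n",
    length($bytes), length($hex), length($base64);
# prints "raw: 12 bytes, hex: 24 chars, base64: 16 chars"
```

Hex doubles the size, while Base64 adds about a third, at the cost of being harder to compare by eye with a hex editor.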
RE: RE: Why, what, where to unpack?! (was: Files, unpack and text values) by BBQ (Curate) on Jun 26, 2000 at 07:13 UTC
You are absolutely right! That did the trick... This is what I came up with to get it looking somewhat like my hex editor's output: `open IMG, "foo.gif" or die "Couldn't open image: $!\n"; undef $/; $image = <IMG>; $hex = unpack("H*", $image); close IMG; while ($txt = substr($hex,0,32,'')) { $txt =~ s/(..)/$1 /g; print $txt."\n"; }` Now all I have to do is figure out where those extra bytes went (the results are still not exactly the same), and who's goofing up on this one. I bet you it's the editor... #!/home/bbq/bin/perl # Trust no1!
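One possible explanation for the missing bytes, not confirmed anywhere in this thread: in the original output the odd line breaks fall right after each 0a byte (a newline), and the output stops just before a 1a byte (Ctrl-Z). That pattern would fit a file handle read in text mode on a platform that distinguishes text from binary files. The sketch below assumes that is the cause and reads the file as raw bytes; `hex_lines` and `slurp_raw` are hypothetical helper names.

```perl
use strict;
use warnings;

# Format a byte string as hex-editor style lines, 16 bytes per line.
sub hex_lines {
    my ($data) = @_;
    my @lines;
    for (my $pos = 0; $pos < length($data); $pos += 16) {
        my $chunk = substr($data, $pos, 16);
        push @lines, join(' ', map { sprintf('%02x', $_) } unpack('C*', $chunk));
    }
    return @lines;
}

# Read a whole file as raw bytes. The ':raw' layer (like calling binmode
# on the handle) asks Perl not to translate line endings.
sub slurp_raw {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Couldn't open $path: $!\n";
    local $/;    # slurp mode, restored when the sub returns
    my $data = <$fh>;
    close($fh);
    return $data;
}

# In-memory sample taken from the hex dump in the question; it contains
# both a 0a and a 1a byte.
my $sample = pack('H*', '8b0ac0e06717121a8132cd6729b4ed8cc4c8');
print "$_\n" for hex_lines($sample);
# prints:
# 8b 0a c0 e0 67 17 12 1a 81 32 cd 67 29 b4 ed 8c
# c4 c8
```

For a real file the two helpers combine as `print "$_\n" for hex_lines(slurp_raw("foo.gif"));`.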
RE: RE: RE: Why, what, where to unpack?! (was: Files, unpack and text values) by mdillon (Priest) on Jun 26, 2000 at 07:18 UTC
the slurp/unpack method does not lose any bytes. i checked my code's round-trip integrity with cmp(1) like so: `[mike@prometheus ~]$ perl -e 'undef $/; open IMG, "foo.gif"; print pack("H*", unpack("H*", <IMG>));' | cmp foo.gif -`
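Where cmp(1) isn't available, the same round-trip check can be sketched in Perl alone. This example uses an in-memory string holding every possible byte value instead of foo.gif, so it involves no file I/O at all; the variable names are illustrative.

```perl
use strict;
use warnings;

# Every possible byte value, including "\n" (0x0a) and Ctrl-Z (0x1a).
my $original  = pack('C*', 0 .. 255);
my $hex       = unpack('H*', $original);
my $roundtrip = pack('H*', $hex);

print length($hex), " hex digits for ", length($original), " bytes\n";
print $roundtrip eq $original ? "round trip OK\n" : "round trip FAILED\n";
# prints:
# 512 hex digits for 256 bytes
# round trip OK
```

If this passes but a file still comes out short, the bytes are being lost while reading the file, not in pack or unpack.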