Confession of a Perl Hacker

My name is Dan, I am a perl hacker, and I have a problem.

I hack away at perl for sometimes 10, 12, even 14 hours a day, partly on the job, partly on my own time. A lot of my time revolves around perl. That's not my problem, though; I'll get to that in a sec.

Just for the record, I love perl. Not like, love it.

I own almost all the O'Reilly books, OO Perl, have a TPJ subscription (for now anyway), Advanced Perl - even Mastering Algorithms with Perl, and yes, I've read it.. well, most of it anyway. I try to soak up as much knowledge as I can.

I've built web-based interfaces to databases, created database abstraction layers, done OO with multiple inheritance just for fun, built and designed enterprise-level business systems. On the job I even lead a team of 7 other perl programmers and sys-admins.

I've recently submitted my first CPAN module, something I've been pushing towards for a while now.

That's what makes it so hard to admit my problem. The longer I go, the harder it is. Now.. this is really embarrassing.. I've got a confession to make:

I don't know how to use pack or unpack. Not one bit.

It's not that I've never tried, mind you. I have. I've looked in every book I own, and some that I don't, searching for good explanations of pack or unpack. Most of them are copies of the perldoc perlfunc. They all have the same table with two columns: templates and short descriptions. I'm not sure whether these explanations are incomplete because the authors don't understand it well enough to explain it, or because it's so readily apparent that everyone must already know how to use it, so why bother explaining it. I hope it's not the latter.

Is it just me? Does anyone else have this problem too?

What I need in order to learn this is a good pointer to a tutorial, but I've found none, anywhere. Failing that, if anyone has some good real-world examples, I'm all ears.

For example, I've heard that you can use unpack to split strings. I can do this with regexes, split, or substr, so I understand the concept. The best way for me to learn would be to see the most common way, using other perl-ish methods, then the equivalent using pack/unpack. Something like:

"You can split strings using regexes:"

#splits string with regex

"Or using split:"

#splits string with split

"Or even using unpack:"

#magical unpack string split

"Why you would want to use it is because it's faster/cooler/shorter/better"

My goal is to learn pack/unpack over the next few weeks, enough to help write a small mini-faq here. It's one of those gaps in my perl knowledge that is bugging me, and I suspect I'm not alone.

Would any monks be kind enough to post their examples and explanations of using pack/unpack, so that similarly "pack impaired" monks can become enlightened as well?

Dan
"pack impaired Perl Hacker"


Replies are listed 'Best First'.
Re: Confession of a Perl Hacker by tadman (Prior) on Jan 22, 2001 at 20:04 UTC
As clemburg pointed out, the widespread use of text-based data-transmission standards (HTML, XML and the like) over the "old-school" binary formats means that pack() is showing up in fewer and fewer programs and modules. That people can live a long and fulfilling life without even touching it is perhaps a sign of progress.

pack() allows you to build "packed binary" scalars. In other words, the pack() template specifies how you want things organized in memory, byte by byte. You aren't really sure how things are organized within Perl if you have created an array, but if you pack() this array, you will know exactly where things stand.

Why would you care? It depends on your application: if the input and output data comes in a precisely defined "binary" format, you will likely be using pack() and unpack() to interface. For example, binary files like GIF and JPEG have headers that are stored in binary, not ASCII, and they need to be "decoded" to be understood by Perl.

These file formats were created by C programs, and C programs work in a different way than Perl does. Where Perl programmers work with scalars (i.e. strings), arrays and hashes, C programmers work with some basic variable types and struct definitions. A "struct" is really just zero or more variables crammed together, end-to-end, into a manageable package that can be allocated, deleted, copied, passed from function to function, and what have you without worrying too much about the internals. Most C programs make use of "struct" like Perl programs make use of arrays and hashes, as convenient ways to store data.

Here's an example that illustrates the difference:

Perl:

```perl
my (%record) = ();
$record{'id'} = 419;
$record{'time'} = time();
$record{'name'} = "Quentin";
```

C:

```c
struct record {
    int id;
    time_t time;
    char name[8];
} a_record;

a_record.id = 419;
a_record.time = time(NULL);
strcpy (a_record.name, "Quentin");
```

In the Perl example, you could put anything into the hash %record without concern for type, or even the key that you are inserting it into. In C, though, you have to specify what "keys" you can use, and more specifically, what type of data each is prepared to accept. "id" can only be an "int", and "name" can only hold 8 bytes (i.e. a "string" of seven characters plus its terminator). C is pretty strict about that stuff, and if you step outside the lines, either the compiler freaks out, or your program crashes or behaves strangely.

Here's where pack() and unpack() come into play. Let's say you had to read data from a file that was created by a C program that used the "record" struct, and you want to modify some of this stuff and put it back right where it came from. Here's how you might go about doing that:

```perl
my (%record) = ();
my ($packed_record);
my ($packed_record_size) = 4+4+8;

# Open the file in binary mode and read a single record out of it.
open (FILE, "<", $data_file) or die "Can't read $data_file: $!";
binmode (FILE);
read (FILE, $packed_record, $packed_record_size);
close (FILE);

# Unpack the record to decode it
($record{'id'},$record{'time'},$record{'name'}) = unpack ("lla8", $packed_record);

# Make a change
$record{'time'} = time;

# Pack the record again and write it back to the file
$packed_record = pack ("lla8", $record{'id'},$record{'time'},$record{'name'});
open (FILE, ">", $data_file) or die "Can't write $data_file: $!";
binmode (FILE);
print FILE $packed_record;
close (FILE);
```

The first parameter of the pack() and unpack() calls is dictated by the format of the struct. In this case, the first two variables are treated as 32-bit "long int" values (as 'time_t' is an alias for one, and 'int' is the same size as 'long' on most 32-bit compilers). The reason for using 'a' instead of 'A' is that C strings are NUL-terminated, and 'a' pads with NUL bytes where 'A' would pad with spaces.

In other words, the string "Quentin" is actually represented in memory as follows:

`'Q' 'u' 'e' 'n' 't' 'i' 'n' \x00`

The last byte is used by the C library to figure out where the string is supposed to stop. Perl uses another method, so you don't have to fuss about ASCII 0 bytes in your strings, thankfully.

Basically, if you need to use pack() and unpack(), you will have to figure out the format of what you're reading, which is usually described in a C context, and more often than not, in the form of ".h" header files or RFCs which show you how the bytes are organized and should be decoded.

The documentation on pack() and unpack() is likely so terse because the utility and application of these functions is pretty clear to most 'C'-type programmers who have used 'struct'. You're right, though, that it must be improved to be intelligible to your average modern Perl programmer.
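To see the record layout without touching a file, here is a minimal round-trip sketch. It assumes the struct has no padding and that both integers are 32 bits wide, which will not hold on every platform; the values and the use of "Z8" on the way back are illustrative choices, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical record matching the struct above: two 32-bit ints and char name[8].
my %record = (id => 419, time => 980193840, name => "Quentin");

# "l l a8": two signed 32-bit integers in native byte order, then an
# 8-byte string padded with NUL bytes.
my $packed = pack("l l a8", @record{qw(id time name)});

printf "packed length: %d bytes\n", length($packed);   # 16 bytes: 4 + 4 + 8

# Round trip. "Z8" stops at the first NUL, so the name comes back without
# the trailing "\0" that unpacking with "a8" would keep.
my ($id, $time, $name) = unpack("l l Z8", $packed);
print "id=$id time=$time name=$name\n";
```

Unpacking with "a8" instead returns all eight bytes, trailing NUL included, which is worth remembering before comparing the name against an ordinary Perl string.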
Re: Confession of a Perl Hacker by clemburg (Curate) on Jan 22, 2001 at 17:05 UTC
First: you are not alone ... see this and this remark by our fellow monk Dominus (search for the word "pack" to find the remark).

Second, I liked the explanation on pages 220-223 of Effective Perl Programming. The rest is about finding good examples.

Third, I think that the declining popularity of `pack()` and `unpack()` might be a consequence of the increasing emphasis on text-based protocols, e.g., XML and friends. You just don't need `pack()` and `unpack()` that often anymore (at least, that's my experience).

Christian Lemburg
Brainbench MVP for Perl
http://www.brainbench.com
(adamsj) Re: Confession of a Perl Hacker by adamsj (Hermit) on Jan 23, 2001 at 01:06 UTC
So what you're saying is, you hack Perl, but pack makes you Herl?
Re: Confession of a Perl Hacker by mwp (Hermit) on Jan 22, 2001 at 17:17 UTC
I too have been hacking Perl for quite some time (four years or so, although almost never as my full-time job) and have trouble with pack. I stared at this for a few brief moments of total confusion before firing off a /msg to The Schwartz and asking for a hint.*

Essentially what unpack is doing in that example is defining spacing and data types for each token in the string and returning a list of formatted tokens, which is summarily joined. Crazy stuff.

I've also found that pack/unpack are used far less than they were "back in the day." Good luck. I look forward to reading this FAQ when you're through.

* merlyn cruelly referred me to the perldocs for pack/unpack. In hindsight, I don't mind so much, because (I think) I actually learned something. :)
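The idea of a template that defines spacing for each token can be sketched by splitting one fixed-width line three ways. The field widths and sample data here are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical fixed-width line: 10-char name, 3-char age, 12-char city.
my $line = sprintf "%-10s%-3s%-12s", "Quentin", "42", "Halifax";

# With substr, trimming the space padding by hand:
my @by_substr = (substr($line, 0, 10), substr($line, 10, 3), substr($line, 13, 12));
s/\s+\z// for @by_substr;

# With a regex, again trimming by hand:
my @by_regex = $line =~ /\A(.{10})(.{3})(.{12})/;
s/\s+\z// for @by_regex;

# With unpack: "A" takes that many bytes and strips trailing spaces itself.
my @by_unpack = unpack("A10 A3 A12", $line);

print join("|", @by_unpack), "\n";   # Quentin|42|Halifax
```

All three produce the same list; the unpack version just states the widths once and leaves the trimming to the "A" format.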
Re: Confession of a Perl Hacker by doug (Pilgrim) on Jan 22, 2001 at 23:40 UTC
Hmm, I don't have any magical pointers to give you, but it seems that your problem is more one of getting your mind wrapped around the problem than of understanding the syntax.

One way of thinking about pack/unpack is to think about copying data from Perl's format (large, but quickly accessed) to C's format (dense, but slower (for Perl at least)). Even if C isn't your goal, this generalizes to say that pack/unpack are good ways to convert to/from standard representations.

I used pack/unpack to communicate across TCP/IP with a rather crappy server that served structs in C. My perl had to prepare data that would mimic the data structures used by that C compiler. pack() was the way to do it, and unpack() the way to understand what it replied. A quick example would be

```c
typedef struct my_data {
    char c;
    int  i;
} MY_DATA;
```

If you wanted to write $letter and $integer into something that looked like that, you would use

```perl
# 'a' = one character, 'xxx' = three pad bytes, 'N' = 32-bit int in network order
my $packet = pack 'a xxx N', $letter, $integer;
```

Now you can just write $packet to your socket and it gets to the other side OK. When the server responds, simply use

```perl
my ($letter, $integer) = unpack 'a xxx N', $resp;
```

This shows the basics. Hopefully this helps. The problem you might be having is that some people use pack/unpack to do magic. If you want to play around with bits/bytes in an un-perl-like fashion this is how to do it (well, vec() helps too). Try to understand what they are doing before delving into how they are doing it (trite, I know).

- doug

PS: All those stupid 'x' pack/unpack fields are because most compilers add pad bytes to keep alignment regular. In every unix compiler I've looked at, sizeof(MY_DATA) would be 8, although only 5 bytes are actually needed.

PS #2: I'm doing this from memory, and I'm starting to get senile....
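The padding can be checked by dumping the packed bytes as hex. This sketch assumes the layout described above (one char, three pad bytes, a 32-bit big-endian integer) and uses 'a' for the letter, since 'a' takes a one-character string whereas 'c' expects a numeric character code; the sample values are arbitrary.

```perl
use strict;
use warnings;

# Assumed layout: 1-byte char, 3 pad bytes, 32-bit int in network (big-endian) order.
my $template = 'a x3 N';
my $packet   = pack($template, 'Q', 419);

# 'Q' is 0x51, the pad bytes are zero, and 419 is 0x000001a3.
printf "%d bytes: %s\n", length($packet), unpack('H*', $packet);
# prints "8 bytes: 51000000000001a3"

# Reading it back skips the same three pad bytes.
my ($letter, $integer) = unpack($template, $packet);
print "letter=$letter integer=$integer\n";
```

Whether a real server expects big-endian ('N') or its own native order depends on how the C side writes the struct, so the template has to be matched to that server rather than taken from this sketch.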
Re: Confession of a Perl Hacker by Beatnik (Parson) on Jan 22, 2001 at 19:15 UTC
The Cookbook uses the following example (listed on page 4):

```perl
# get a 5-byte string, skip 3, grab 2 8-byte strings, then the rest
($leading, $s1, $s2, $trailing) = unpack("A5 x3 A8 A8 A*", $data);
```

`x3` means "ignore" 3 bytes (jump forwards), while `Xm` means jump m bytes back. `A5` means "get" 5 ASCII (space-padded) bytes.

Another example would be to pack/unpack to/from binary or hexadecimal.

```perl
$string = "My uncle John is Jamaica";
$binary = unpack("B*", $string);
```

and

```perl
$string = pack("B*", $binary);
```

Similar for hexadecimal...

```perl
$string = "My uncle John is Jamaica";
$hexadecimal = unpack("H*", $string);
$string = pack("H*", $hexadecimal);
```

Anyway, those were just the simple examples =)

Greetz
Beatnik
... Quidquid perl dictum sit, altum viditur.
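A short sketch of the binary and hexadecimal conversions, using a shorter sample string so the output is easy to check by hand. The "*" repeat count is what makes the template cover the whole string; a bare "B" or "H" returns a single bit or a single hex digit.

```perl
use strict;
use warnings;

my $string = "Perl";

my $bits = unpack("B*", $string);   # 8 bits per byte, high bit first
my $hex  = unpack("H*", $string);   # 2 hex digits per byte, high nybble first

print "$bits\n";   # 01010000011001010111001001101100
print "$hex\n";    # 5065726c

# Without the "*", only the first bit or first hex digit comes back.
my $one_bit   = unpack("B", $string);   # "0"
my $one_digit = unpack("H", $string);   # "5"

# pack with the same templates reverses the conversion.
my $from_bits = pack("B*", $bits);
my $from_hex  = pack("H*", $hex);
print "round trip ok\n" if $from_bits eq $string and $from_hex eq $string;
```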

