As clemburg pointed out, the widespread use of text-based
data-transmission standards (HTML, XML and the like) over
the "old-school" binary formats means that the pack() function
is showing up in fewer and fewer programs and modules. That
people can live a long and fulfilling life without ever
touching it is perhaps a sign of progress.
pack() allows you to build "packed binary" scalars. In other
words, the pack() template specifies how you want things
organized in memory, in a byte-by-byte manner. When you
create an array, you don't really know how Perl lays it out
internally, but if you pack() that array, you know exactly
where every byte stands. Why would you care? It depends on
your application: if your input or output data comes in a
precisely defined "binary" format, you will likely be using
pack() and unpack() to interface with it.
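
To make "byte-by-byte" concrete, here is a minimal sketch that packs a small array and then inspects the result. The values and the "n3" template (three 16-bit big-endian unsigned integers) are picked purely for illustration.

```perl
use strict;
use warnings;

# Three numbers whose internal Perl representation we know nothing about.
my @values = (1, 2, 258);

# "n3": three unsigned 16-bit integers in big-endian ("network") order.
my $packed = pack("n3", @values);

# The packed scalar is exactly 3 * 2 = 6 bytes long.
my $size = length $packed;

# "H*" dumps every byte as two hex digits, high nybble first.
my $hex = unpack("H*", $packed);

# unpack() with the same template recovers the original numbers.
my @back = unpack("n3", $packed);

print "$size bytes: $hex -> @back\n";
```

Each value occupies exactly two bytes, in the order given, which is the guarantee the template buys you.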
For example, binary file formats like GIF and JPEG have
headers that are stored in binary, not ASCII, and they need
to be "decoded" to be understood by Perl. These file formats
were designed with C programs in mind, and C programs work
in a different way than Perl does.
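
As a rough sketch of what "decoding" a header looks like, the following unpacks the first ten bytes of a GIF: a 3-byte signature, a 3-byte version, and the width and height as 16-bit little-endian integers. The sample header is built in memory here as an assumption so the snippet runs on its own; with a real file you would read() those ten bytes from a filehandle in binmode instead.

```perl
use strict;
use warnings;

# Stand-in for the first 10 bytes of a GIF file (assumed 320x200 image).
my $header = pack("a3 a3 v v", "GIF", "89a", 320, 200);

# a3 = 3 raw bytes, v = unsigned 16-bit little-endian integer.
my ($signature, $version, $width, $height)
    = unpack("a3 a3 v v", $header);

die "Not a GIF\n" unless $signature eq "GIF";
print "GIF version $version, ${width}x${height}\n";
```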
Where Perl programmers work with scalars (e.g. strings),
arrays and hashes, C programmers work with some basic
variable types and struct definitions.
A "struct" is really just zero or more variables crammed
together, end-to-end, into a manageable package that can be
allocated, deleted, copied, passed from function to function,
and what have you without worrying too much about the
internals. Most C programs use structs the way Perl programs
use arrays and hashes: as convenient ways to store data.
Here's an example that illustrates the difference:
Perl:
my (%record) = ();
$record{'id'} = 419;
$record{'time'} = time();
$record{'name'} = "Quentin";
C:
#include <string.h>
#include <time.h>
struct record
{
    int id;
    time_t time;
    char name[8];
} a_record;
a_record.id = 419;
a_record.time = time(NULL);
strcpy (a_record.name, "Quentin");
In the Perl example, you could put anything into the hash
%record without concern for type, or even the key that you
are inserting it into. In C, though, you have to specify
what "keys" you can use, and more specifically, what type
of data each is prepared to accept. "id" can only be an
"int", and "name" can only hold 8 bytes (i.e. a "string"
of up to 7 characters plus its terminating zero byte). C is
pretty strict about that stuff, and if you step outside the
lines, either the compiler freaks out, or your program
crashes or behaves strangely.
Here's where pack() and unpack() come into play. Let's say
you had to read data from a file that was created by a C
program that used the "record" struct, and you want to
modify some of this stuff and put it back right where it
came from. Here's how you might go about doing that:
my (%record) = ();
my ($packed_record);
my ($packed_record_size) = 4+4+8;
# Open the file in binary mode and read a single record out of it.
open (FILE, "<", $data_file) or die "Cannot open $data_file: $!";
binmode (FILE);
read (FILE, $packed_record, $packed_record_size);
close (FILE);
# Unpack the record to decode it
($record{'id'},$record{'time'},$record{'name'})
= unpack ("lla8", $packed_record);
# Make a change
$record{'time'} = time;
$packed_record = pack ("lla8", $record{'id'},$record{'time'},$record{'name'});
# Reopen with "+<" (">" would truncate the file) and overwrite
# the first record in place.
open (FILE, "+<", $data_file) or die "Cannot open $data_file: $!";
binmode (FILE);
print FILE $packed_record;
close (FILE);
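
If the file holds several records, the same idea extends to updating just one of them by seeking to its offset. What follows is only a sketch under stated assumptions: the scratch file, the two sample records, the record index and the fixed timestamp are all made up so that the snippet is self-contained, and the "l l a8" layout assumes the 16-byte record described above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumed layout: 32-bit id, 32-bit time, 8-byte NUL-padded name.
my $template    = "l l a8";
my $record_size = length pack($template, 0, 0, "");

# Build a scratch file holding two records.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} pack($template, 419, 0, "Quentin");
print {$fh} pack($template, 420, 0, "Rachel");
close $fh or die "Cannot close $filename: $!";

# Reopen read/write ("+<" does not truncate) and update record 1 only.
my $index  = 1;
my $offset = $index * $record_size;
open my $io, "+<", $filename or die "Cannot open $filename: $!";
binmode $io;
seek $io, $offset, 0 or die "Cannot seek: $!";
read($io, my $buffer, $record_size) == $record_size
    or die "Short read\n";
my ($id, $time, $name) = unpack($template, $buffer);
$time = 1_000_000_000;               # the change we want to make
seek $io, $offset, 0 or die "Cannot seek: $!";
print {$io} pack($template, $id, $time, $name);
close $io or die "Cannot close $filename: $!";

# Read both records back to confirm that only the second one changed.
open my $check, "<", $filename or die "Cannot open $filename: $!";
binmode $check;
read($check, my $contents, 2 * $record_size);
close $check;
my @fields = unpack("$template $template", $contents);
```

Seeking back before the write matters: the read() has already moved the file position past the record being replaced.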
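The first parameter of the pack() and unpack() calls is
dictated by the format of the struct. In this case, the
first two variables are treated as type "long int", which is
what 'l' stands for: a signed 32-bit value ('time_t' is a
typedef for 'long', and 'int' is the same size as 'long', on
most 32-bit compilers). On a 64-bit platform 'time_t' is
usually 8 bytes wide, so the template and the record size
would have to change to match. The reason for using 'a'
instead of 'A' is that pack() pads 'a' fields with zero
bytes and 'A' fields with spaces, and C strings are
terminated by a zero byte. In other words, the string
"Quentin" is actually represented in memory as follows: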
'Q' 'u' 'e' 'n' 't' 'i' 'n' \x00
The last byte is used by the C library to figure out when
the string is supposed to stop. Perl uses another method,
so you don't have to fuss about ASCII 0 bytes in your
strings, thankfully.
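
The difference between the string template characters is easy to check for yourself. This small sketch, which assumes a reasonably current Perl 5, packs the name both ways and then unpacks the NUL-padded form with 'a', 'A' and 'Z':

```perl
use strict;
use warnings;

# Packing: 'a' pads with NUL bytes, 'A' pads with spaces.
my $nul_padded   = pack("a8", "Quentin");   # "Quentin\0"
my $space_padded = pack("A8", "Quentin");   # "Quentin "

# Unpacking the C-style field three different ways:
my $raw      = unpack("a8", $nul_padded);   # no stripping, still 8 bytes
my $stripped = unpack("A8", $nul_padded);   # trailing NULs/spaces removed
my $clean    = unpack("Z8", $nul_padded);   # cut at the first NUL

printf "a: %d bytes, A: %d bytes, Z: %d bytes\n",
    length $raw, length $stripped, length $clean;
```

So a name read back with "a8" still carries its trailing zero byte. That is harmless when the value is immediately re-packed with the same template, but 'Z' is usually the more convenient choice when you want to print or compare the string.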
Basically, if you need to use pack() and unpack(), you will
have to figure out the format of what you're reading,
which is usually described in a C context, and more often
than not, in the form of ".h" header files or RFCs which
show you how the bytes are organized and should be decoded.
The documentation on pack() and unpack() is likely so terse
because the utility and application of these functions is
pretty clear to most 'C'-type programmers who have used
'struct'. Still, it could certainly be improved to be
intelligible to your average modern Perl programmer.