My name is Dan, I am a perl hacker, and I have a problem.
I hack away at perl for sometimes 10, 12, even 14 hours a day, partly
on the job, partly on my own time. A lot of my time revolves around
perl. Though, that's not my problem, I'll get to that in a sec.
Just for the record, I love perl. Not like, love it.
I own almost all the O'Reilly books, OO Perl, have a TPJ subscription
(for now anyway), Advanced Perl - even Mastering Algorithms with Perl, and
yes, I've read it.. well, most of it anyway. I try to soak up
as much knowledge as I can.
I've built web-based interfaces to databases, created
database abstraction layers, done OO with multiple inheritance
just for fun, built and designed enterprise-level
business systems. On the job I even lead a team of 7 other
perl programmers and sys-admins.
I've recently submitted my first CPAN module,
something I've been pushing towards for a while now.
That's what makes it so hard to admit my problem. The
longer I go, the harder it is. Now.. this is really
embarrassing.. I've got a confession to make:
I don't know how to use pack or unpack. Not one bit.
It's not that I've never tried, mind you. I have. I've looked
in every book I own, and some that I don't, searching for good
explanations of pack or unpack. Most of them are copies
of the perldoc perlfunc. They all have the same
table with two columns: templates and short descriptions. I'm not
sure whether these explanations are incomplete because the authors
don't understand it well enough to explain it, or because it's so
readily apparent that everyone must already know how to use it, so why
bother explaining it. I hope it's not the latter.
Is it just me? Does anyone else have this problem too?
What I need in order to learn this is a pointer to a good tutorial, but
I've found none, anywhere. Failing that, if anyone has some
good real-world examples, I would be all ears.
For example, I've heard that you can use unpack to split strings.
I can do this with regexes, split, or substr, so I understand the
concept. The best way for me to learn would be to see the most
common way, using other perl-ish methods, then the equivalent
using pack/unpack. Something like:
"You can split strings using regexes:"
#splits string with regex
"Or using split:"
#splits string with split
"Or even using unpack:"
#magical unpack string split
"Why you would want to use it is because it's
faster/cooler/shorter/better"
My goal is to learn pack/unpack over the next few weeks, enough
to help write a small mini-faq here. It's one of those gaps in
my perl knowledge that is bugging me, and I suspect I'm not alone.
Would any monks be kind enough to post their examples and
explanations of using pack/unpack, so that similarly
"pack impaired" monks can become enlightened as well?
Dan
"pack impaired Perl Hacker"
Re: Confession of a Perl Hacker
by tadman (Prior) on Jan 22, 2001 at 20:04 UTC
As clemburg pointed out, the widespread use of text-based
data-transmission standards (HTML, XML and the like) over
the "old-school" binary formats means that the pack() statement
is showing up in fewer and fewer programs and modules. That
people can live a long and fulfilling life without even
touching it is perhaps a sign of progress.
pack() allows you to build "packed binary" scalars. In other
words, the pack() template specifies how you want things
organized in memory, in a byte-by-byte manner. You aren't
really sure how things are organized within Perl if you
have created an array, but if you pack() this array, you
will know exactly where things stand. Why would you care?
It depends on your application, and if the input and output
data comes in a precisely defined "binary" format, you will
likely be using pack() and unpack() to interface.
For example, binary files like GIF and JPEG have headers that
are stored in binary, not ASCII, and they need to be
"decoded" to be understood by Perl. These file formats were
created by C programs, and C programs work in a different
way than Perl does.
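To make that concrete, here is a minimal sketch of decoding the start of a GIF header. A GIF file begins with a 6-byte signature followed by the image width and height as 16-bit little-endian integers. The header bytes are built by hand here (the 320x200 size is made up) so the snippet runs without a real file.

```perl
use strict;
use warnings;

# Build the first 10 bytes of a GIF file by hand so the example is
# self-contained; a real program would read them from the file instead.
my $header = "GIF89a" . pack("v v", 320, 200);

# a6 = 6-byte signature, v = unsigned 16-bit little-endian integer
my ($signature, $width, $height) = unpack("a6 v v", $header);

print "$signature: ${width}x${height}\n";   # GIF89a: 320x200
```

The template simply mirrors the on-disk layout, field by field.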
Where Perl programmers work with scalars (i.e. strings),
arrays and hashes, C programmers work with some basic
variable types and struct definitions.
A "struct" is really just zero or more variables crammed
together, end-to-end, into a manageable package that can be allocated, deleted,
copied, passed from function to function, and what have you
without worrying too much about the internals. Most C
programs make use of "struct" like Perl programs make use
of arrays and hashes, as convenient ways to store data.
Here's an example that illustrates the difference:
Perl:
my (%record) = ();
$record{'id'} = 419;
$record{'time'} = time();
$record{'name'} = "Quentin";
C:
struct record
{
int id;
time_t time;
char name[8];
} a_record;
a_record.id = 419;
a_record.time = time(NULL);
strcpy (a_record.name, "Quentin");
In the Perl example, you could put anything into the hash
%record without concern for type, or even the key that you
are inserting it into. In C, though, you have to specify
what "keys" you can use, and more specifically, what type
of data each is prepared to accept. "id" can only be an
"int", and "name" can only hold 8 bytes (i.e. a "string" of at
most 7 characters plus its terminating zero byte). C is pretty strict about that stuff, and if
you step outside the lines, either the compiler freaks out,
or your program crashes or behaves strangely.
Here's where pack() and unpack() come into play. Let's say
you had to read data from a file that was created by a C
program that used the "record" struct, and you want to
modify some of this stuff and put it back right where it
came from. Here's how you might go about doing that:
my (%record) = ();
my ($packed_record);
my ($packed_record_size) = 4+4+8;   # two 32-bit longs plus char[8]
# Open the file and read a single record out of it.
open (FILE, "<", $data_file) or die "Can't read $data_file: $!";
binmode (FILE);
read (FILE, $packed_record, $packed_record_size);
close (FILE);
# Unpack the record to decode it
($record{'id'},$record{'time'},$record{'name'})
= unpack ("lla8", $packed_record);
# Make a change
$record{'time'} = time;
$packed_record = pack ("lla8", $record{'id'},$record{'time'},$record{'name'});
# Write the modified record back to the file
open (FILE, ">", $data_file) or die "Can't write $data_file: $!";
binmode (FILE);
print FILE $packed_record;
close (FILE);
The first parameter of the pack() and unpack() calls is
dictated by the format of the struct. In this case, the
first two variables are of type "long int" (as 'time_t'
is an alias, and 'int' is the same size as 'long' by default on
most 32-bit compilers), so each is packed with 'l', a signed
32-bit integer. On a platform where 'int' or 'time_t' has a
different size, the template has to change to match. The reason
for using 'a' instead of 'A' is that C strings are NUL-terminated,
and 'a' pads with NUL bytes where 'A' pads with spaces. In
other words, the string "Quentin" is actually represented in
memory as follows:
'Q' 'u' 'e' 'n' 't' 'i' 'n' \x00
The last byte is used by the C library to figure out when
the string is supposed to stop. Perl uses another method,
so you don't have to fuss about ASCII 0 bytes in your
strings, thankfully.
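A small sketch of the difference between the string formats may help here. It assumes a reasonably recent perl, since it also uses 'Z', which is not part of the example above but is handy when reading NUL-terminated strings back.

```perl
use strict;
use warnings;

my $nul_padded   = pack("a8", "Quentin");     # "Quentin" plus one \0 byte
my $space_padded = pack("A8", "Quentin");     # "Quentin" plus one space

my ($raw)     = unpack("a8", $nul_padded);    # keeps the trailing \0
my ($trimmed) = unpack("Z8", $nul_padded);    # stops at the first \0
my ($spaces)  = unpack("A8", $space_padded);  # strips trailing spaces

printf "%d %d %d\n", length($raw), length($trimmed), length($spaces);   # 8 7 7
```

So when unpacking with 'a8', the name still carries its zero byte, which is what you want if you plan to pack it straight back.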
Basically, if you need to use pack() and unpack(), you will
have to figure out the format of what you're reading,
which is usually described in a C context, and more often
than not, in the form of ".h" header files or RFCs which
show you how the bytes are organized and should be decoded.
The documentation on pack() and unpack() is likely so terse
because the utility and application of these functions are
pretty clear to most C programmers who have used 'struct'.
Certainly, though, it must be improved
to be intelligible to your average modern Perl programmer.
Re: Confession of a Perl Hacker
by clemburg (Curate) on Jan 22, 2001 at 17:05 UTC
First: you are not alone ... see this
and this remark by our fellow monk Dominus (search for word "pack" to find remark).
Second, I liked the explanation of pages 220-223 of Effective Perl Programming. The rest is about finding good examples.
Third, I think that the declining popularity of pack() and unpack() might
be a consequence of the increasing emphasis on text-based protocols, e.g., XML and friends.
You just don't need pack() and unpack() that often anymore (at least, that's my experience).
Christian Lemburg
Brainbench MVP for Perl
http://www.brainbench.com
(adamsj) Re: Confession of a Perl Hacker
by adamsj (Hermit) on Jan 23, 2001 at 01:06 UTC
So what you're saying is, you hack Perl, but pack makes you Herl?
Re: Confession of a Perl Hacker
by mwp (Hermit) on Jan 22, 2001 at 17:17 UTC
I too have been hacking Perl for quite some time (four
years or so, although almost never as my full-time job)
and have trouble with pack. I stared at this
for a few, brief moments of total confusion before firing
off a /msg to The Schwartz and asking for a hint.*
Essentially what unpack is doing in that example is
defining spacing and data types for each token in the
string and returning a list of formatted tokens, which is
summarily joined. Crazy stuff.
I've also found that pack/unpack are used
far less than they were "back in the day."
Good luck. I look forward to reading this FAQ when
you're through.
* merlyn cruelly deferred me to the
perldocs for pack/unpack. In hindsight, I don't mind
so much, because (I think) I actually learned
something. :)
Re: Confession of a Perl Hacker
by doug (Pilgrim) on Jan 22, 2001 at 23:40 UTC
Hmm, I don't have any magical pointers to give you, but it
seems that your problem is more of getting your mind
wrapped around the problem than understanding the syntax.
One way of thinking about pack/unpack is to think about
copying data from Perl's format (large, but quickly
accessed) to C's format (dense, but slower (for Perl at
least)). Even if C isn't your goal, this generalizes to say
that pack/unpack are good ways to convert to/from standard
representations.
I used pack/unpack to communicate across TCP/IP with a
rather crappy server that served structs in C. My perl had
to prepare data that would mimic the data structures used in
that C compiler. pack() was the way to do it, and unpack() was the way to
understand what it replied.
A quick example would be
typedef struct my_data
{
char c;
int i;
} MY_DATA;
If you wanted to write $letter and $integer into something
that looked like that, you would use
my $packet = pack 'a xxx N', $letter, $integer;   # 'a' takes a one-character string; 'c' would expect a number
Now you can just write $packet to your socket and it gets to the other side OK. When the server responds, simply use
my ($letter, $integer) = unpack 'a xxx N', $resp;
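A round trip through that layout can be sketched without a socket. The sample values are made up, and the sketch uses 'a' for the char field, which takes a one-character string ('c' would expect a numeric value). 'N' is a 32-bit big-endian integer, so it assumes the server sends its int in network byte order.

```perl
use strict;
use warnings;

my ($letter, $integer) = ('Q', 419);        # made-up sample values

# a = 1 byte of string data, xxx = 3 NUL pad bytes, N = 32-bit big-endian
my $packet = pack('a xxx N', $letter, $integer);

print length($packet), "\n";                # 8
print unpack('H*', $packet), "\n";          # 51000000000001a3

my ($got_letter, $got_integer) = unpack('a xxx N', $packet);
print "$got_letter $got_integer\n";         # Q 419
```

The hex dump shows the letter, the three pad bytes, and then the four bytes of the integer.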
This shows the basics. Hopefully this helps. The problem
you might be having is that some people use pack/unpack to
do magic. If you want to play around with bits/bytes in an
un-perl-like fashion this is how to do it (well, vec() helps
too). Try to understand what they are doing before delving
into how they are doing it (trite, I know).
- doug
PS: All those stupid 'x' pack/unpack fields are because most
compilers add pad bytes to keep alignment regular. In every
unix compiler I've looked at, sizeof(MY_DATA) would be 8,
although only 5 bytes are actually needed.
PS #2: I'm doing this from memory, and I'm starting to get
senile....
Re: Confession of a Perl Hacker
by Beatnik (Parson) on Jan 22, 2001 at 19:15 UTC
The Cookbook uses the following example (listed on page 4)
# get a 5-byte string, skip 3, grab 2 8-byte strings, then the rest
($leading, $s1, $s2, $trailing) = unpack("A5 x3 A8 A8 A*",$data);
x3 means to "ignore" 3 bytes (jump forwards), while Xm means to jump m bytes back.
A5 means to "get" 5 ASCII (space-padded) bytes.
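For comparison, the same kind of fixed-width split can be sketched three ways: with a regex, with substr, and with unpack. The record below is made up for illustration and follows the same layout as the template above.

```perl
use strict;
use warnings;

# Made-up fixed-width record: 5-byte id, 3 filler bytes,
# two 8-byte space-padded names, then free text.
my $data = "00419" . "---" . "Quentin " . "Smith   " . "the rest";

# 1. With a regex
my @by_regex = $data =~ /^(.{5}).{3}(.{8})(.{8})(.*)$/s;

# 2. With substr
my @by_substr = (substr($data, 0, 5), substr($data, 8, 8),
                 substr($data, 16, 8), substr($data, 24));

# The regex and substr versions keep the padding, so trim it by hand
s/\s+$// for @by_regex, @by_substr;

# 3. With unpack: "A" strips the trailing spaces for us
my @by_unpack = unpack("A5 x3 A8 A8 A*", $data);

print join("|", @by_unpack), "\n";   # 00419|Quentin|Smith|the rest
```

All three produce the same list; the unpack version just states the layout once, in the template, instead of spreading offsets through the code.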
Another example would be to pack/unpack to/from binary or hexadecimal.
$string = "My uncle John is Jamaica";
$binary=unpack("B*",$string);
and
$string=pack("B*",$binary);
Similar for Hexadecimal...
$string = "My uncle John is Jamaica";
$hexadecimal=unpack("H*",$string);
$string=pack("H*",$hexadecimal);
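With a shorter string the output is easy to check by hand. This is only a sketch; the two-character input is chosen so the bit and hex strings stay readable.

```perl
use strict;
use warnings;

my $string = "Hi";                       # 'H' is 0x48, 'i' is 0x69

my $binary = unpack("B*", $string);      # bits, most significant bit first
my $hex    = unpack("H*", $string);      # hex digits, high nybble first

print "$binary\n";                       # 0100100001101001
print "$hex\n";                          # 4869

# pack with the same template reverses the conversion
print pack("B*", $binary), " ", pack("H*", $hex), "\n";   # Hi Hi
```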
Anyway, those were just the simple examples =)
Greetz
Beatnik
... Quidquid perl dictum sit, altum viditur.