Imagine my surprise when I saw a reference to
my module in Seekers. I read my CPAN mail
every day, but I have been busy at work, so I missed
this whole thread.
Thank you for the bug report and yes, in light of
Dave Cross' "Data Munging with Perl", a highly
useful book, this module should be
re-written to use pack/unpack.
++'s and credit due
- bitwise for using Devel::Leak to track down the error
- runrig for tracking down the inefficiency and the
source of the bug
- tilly for being my devil's advocate
But if fixed data formats are your need, then I recommend becoming friends with Perl's built-in utilities for that, tools like pack and unpack. Also davorg has a book about this kind of data manipulation which I have not read, but a lot of people seem to like.
The purpose of this module is to decouple the
description of the
information to be parsed from the actual
process of parsing it. pack/unpack handle the actual process of parsing. If the description and the process are bound together, it becomes harder to use an externally supplied parse description at will.
For example, what if you wanted data entry operators to enter a huge collection of field names and field widths? It is much easier for them to enter these sans Perl syntax.
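To make that concrete, here is a minimal sketch, not the module's actual interface. The field names, widths, and the parse_record helper are all made up for illustration. The layout lives in plain text that an operator could type, and the unpack template is built from it:

```perl
use strict;
use warnings;

# Hypothetical layout, as a data entry operator might type it:
# one "field_name width" pair per line, no Perl syntax required.
my $layout = <<'END';
artist 10
title  12
price  6
END

# Turn the plain-text description into field names and an unpack template.
my @names;
my $template = '';
for my $line (split /\n/, $layout) {
    next unless $line =~ /\S/;            # skip blank lines
    my ($name, $width) = split ' ', $line;
    push @names, $name;
    $template .= "A$width";               # 'A': text field, trailing spaces stripped on unpack
}

# Parse one fixed-length record into a hash keyed by field name.
sub parse_record {
    my ($record) = @_;
    my %row;
    @row{@names} = unpack $template, $record;
    return \%row;
}

# Build a sample record with pack so the widths are guaranteed to line up.
my $record = pack $template, 'Coltrane', 'Blue Train', '11.99';
my $row    = parse_record($record);
print "$row->{artist}: $row->{title} costs $row->{price}\n";
```

pack/unpack still do the parsing, but the description they work from can be edited without touching any Perl.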
Also, certain industry vendors do use fixed-length data.
Valley Media, the fulfillment house for amazon.com,
cdnow.com, and several other major .coms, only
receives (for this see Text::FixedLength) and
transmits (for this see Parse::FixedLength)
fixed-length data. Their major competitor, global fulfillment, was using XML, but all of their high-techery did not save them from going out of business.
So, to summarize:
- fixed-length data may not be desirable (who likes
counting whitespace fields in a big file?), but it is probably here to stay, just like VAX computers.
- When processing data with Perl, I emphasize the following steps:
  - input process: create a representation of what is to be parsed in a form both readable and enterable by non-technical types
  - munge process: create something which takes the data to be parsed and this representation of what is to be parsed and does the parsing
  - output process: feed this general data (usually a Perl nested data structure) into something which generates output files or SQL statements for data re-storage
- In short, this is very similar to the edict in "Data Munging", decreed in the table of contents of Cross' book: decouple the input, munging, and output processes. So I strongly invite you to refute this process of data munging and to make the case for a superior means of data processing in Perl.
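The three steps above can be sketched roughly as follows. This is only an illustration under my own assumptions: the "name:width" description format, the stock table, and the munge and to_sql subs are hypothetical, and real code would more likely use DBI placeholders than hand-built SQL strings.

```perl
use strict;
use warnings;

# Input process: a plain-text description (hypothetical fields)
# that needs no knowledge of Perl to edit.
my $description = 'sku:6 title:12 qty:3';
my @layout;
for my $pair (split ' ', $description) {
    my ($name, $width) = split /:/, $pair;
    push @layout, { name => $name, width => $width };
}

# Munge process: apply the description to raw records, yielding plain hashes.
sub munge {
    my ($layout, @records) = @_;
    my @names;
    my $template = '';
    for my $field (@$layout) {
        push @names, $field->{name};
        $template .= "A$field->{width}";
    }
    my @rows;
    for my $record (@records) {
        my %row;
        @row{@names} = unpack $template, $record;
        push @rows, \%row;
    }
    return @rows;
}

# Output process: sees only rows and column names, never the record format.
sub to_sql {
    my ($table, $layout, @rows) = @_;
    my @names = map { $_->{name} } @$layout;
    my @statements;
    for my $row (@rows) {
        my @values;
        for my $name (@names) {
            (my $value = $row->{$name}) =~ s/'/''/g;   # double any single quotes
            push @values, "'$value'";
        }
        push @statements, sprintf 'INSERT INTO %s (%s) VALUES (%s);',
            $table, join(', ', @names), join(', ', @values);
    }
    return @statements;
}

my @records = ( sprintf('%-6s%-12s%-3s', 'AB1234', 'Blue Train', '2') );
my @rows    = munge(\@layout, @records);
print "$_\n" for to_sql('stock', \@layout, @rows);
```

Swapping to_sql for something that writes a report, or swapping the description for a different vendor's layout, leaves the other two stages untouched, which is the whole point of keeping them apart.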