PerlMonks  
Line-oriented input processing is easy in Perl, thanks to readline and its angle bracket syntax, as in $line = <STDIN>. However, these popular constructs offer no control over the size of what is returned, as has been lamented on several occasions. Such control would be beneficial wherever data from unreliable sources has to be handled. File::GetLineMaxLength, available on CPAN and created by our fellow monk robmueller, addresses this issue. This review discusses version 1.00.

The module is not yet listed in the Catalogue, but my guess at its DLSIP status would be adpOp.

Development stage: a - Alpha testing

The module works only in very special cases and has some serious implementation flaws as well as documentation issues. I'll elaborate below.

Language used: p - Pure Perl

Support level: d - Developer

Interface style: O - Object oriented

Public License: p - Standard-Perl

Usage

Example:
    use File::GetLineMaxLength;

    local $/ = "\n";
    my $LIMIT = 1024;
    my $f = File::GetLineMaxLength->new(\*STDIN);
    my ($line, $tooLong);
    while (length($line = $f->getline($LIMIT, $tooLong))) {
        die "line too long" if $tooLong;
        # process $line
    }

File::GetLineMaxLength adds a layer of buffering to an already opened file connection. To read lines, you have to create an object that will hold the state of everything, and repeatedly call the object's getline method.

The constructor new takes an existing filehandle and an optional buffer size. The input record terminator is taken from $/ at the time of object creation -- I would rather have expected either an explicit parameter or the behaviour of IO::Handle, which uses $/ dynamically at the time of each getline call.
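For comparison, the dynamic behaviour of IO::Handle mentioned above can be seen with core Perl alone. This is a small sketch that uses an in-memory filehandle as a stand-in for real input; the sample data and variable names are made up for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

# In-memory filehandle standing in for real input (illustration only).
my $data = "a\nb;c\nd;";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my @records;
{
    # First call: newline is the record terminator.
    local $/ = "\n";
    push @records, $fh->getline;
}
{
    # Later calls on the same handle: the terminator is now ";",
    # and getline picks up the change at call time.
    local $/ = ";";
    push @records, $fh->getline;
    push @records, $fh->getline;
}
close $fh;

# Show each record with its newlines made visible.
for my $record (@records) {
    (my $visible = $record) =~ s/\n/\\n/g;
    print "[$visible]\n";
}
```

The second and third records end at ";" even though the handle was opened while $/ was still a newline. An object that copies $/ once in its constructor would not notice such a change.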

The getline method takes a non-negative integer size limit and optionally a variable. This variable will be set by the method to 1 in case the size limit is hit and 0 otherwise. A limit of zero means no limit. The return value is a string and always defined. After reading to the end of file it is (supposed to be) empty. There is no distinction between normal end of file and error conditions. For the size limitation, line terminator characters are not counted.

If the number of input characters before a line terminator exceeds the given limit, the returned string will have exactly limit characters and no line terminator, the overflow flag will be set, and the next getline call will continue where the previous one broke off.
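The semantics described above can be sketched in core Perl without the module. The following is not the module's code and not its interface: make_bounded_reader is a hypothetical helper that returns the line and the overflow flag as a list, takes the terminator as an explicit parameter, and dies on read errors instead of hiding them. It does follow the rules stated above: the terminator is not counted against the limit, a limit of zero means no limit, and an overlong line is handed out in pieces.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's implementation: returns a closure
# that reads terminator-delimited lines from $fh, at most $limit characters
# (terminator not counted) per call.
sub make_bounded_reader {
    my ($fh, $sep, $chunk) = @_;
    $sep = "\n" unless defined $sep && length $sep;
    $chunk ||= 4096;
    my $buf = '';
    my $eof = 0;

    return sub {
        my ($limit) = @_;    # 0 means "no limit"
        while (1) {
            my $pos = index($buf, $sep);

            # A complete line within the limit: return it with terminator.
            if ($pos >= 0 && (!$limit || $pos <= $limit)) {
                return (substr($buf, 0, $pos + length($sep), ''), 0);
            }

            # Enough data buffered to know the limit is exceeded:
            # return exactly $limit characters and raise the flag.
            if ($limit && ($pos > $limit
                    || length($buf) >= $limit + length($sep)
                    || ($eof && length($buf) > $limit))) {
                return (substr($buf, 0, $limit, ''), 1);
            }

            # End of input: hand out whatever is left, possibly nothing.
            if ($eof) {
                return (substr($buf, 0, length($buf), ''), 0);
            }

            # Otherwise append another chunk to the buffer.
            my $n = read($fh, $buf, $chunk, length($buf));
            die "read error: $!" unless defined $n;
            $eof = 1 if $n == 0;
        }
    };
}

# Usage: the last line is "0" and has no terminator.
my $input = "short\n" . ("x" x 10) . "\n0";
open my $in, '<', \$input or die "cannot open in-memory file: $!";
my $next = make_bounded_reader($in, "\n", 4);

my @seen;
while (1) {
    my ($line, $too_long) = $next->(8);
    last unless length $line;    # test the length, not the truth value
    push @seen, [$line, $too_long];
}
close $in;
```

The loop tests length rather than truth because a final line consisting of "0" without a terminator is a false value in Perl but is still valid data. The sketch also terminates when the input lacks a trailing terminator, because the end-of-file branch returns the remaining buffer instead of waiting for a separator.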

Open Issues

  • The getline method goes into an endless loop if the input file does not end with an end of line character sequence.
  • The input record separator must be a nonempty string. Other values of $/ reserved for paragraph mode, file slurp mode or fixed length record mode are neither supported nor detected.
  • The buffering strategy makes the module unsuitable for interactive input or scenarios calling for alternating uses of getline and other file operations.
  • The POD documentation contains a wrong usage example (a while loop terminating on the boolean value rather than the size of the result, and wrong filehandle syntax), although it has a nice reference to PerlMonks.
  • Not amusingly, the README file is unaltered h2xs output with mismatched 0.01 version number, going all "blah blah" on us.
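Regarding the special values of $/ listed above, a constructor could at least detect them before accepting the separator. The following guard is a hypothetical sketch, not part of File::GetLineMaxLength; the function name and the mode labels are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical guard: classify a value of $/ so that a line reader
# can refuse the modes it does not support.
sub separator_mode {
    my ($sep) = @_;
    return 'slurp'     unless defined $sep;    # undef: read the whole file
    return 'fixed'     if ref $sep;            # reference: fixed-length records
    return 'paragraph' if $sep eq '';          # empty string: paragraph mode
    return 'string';                           # ordinary terminator string
}

# Try the guard on one value per mode.
for my $sep ("\n", "\r\n", '', undef, \1024) {
    my $mode = separator_mode($sep);
    print $mode eq 'string'
        ? "supported separator\n"
        : "unsupported mode: $mode\n";
}
```

A constructor could call such a check on $/ and die with a clear message for anything other than an ordinary string, rather than silently misbehaving.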

Conclusion

File::GetLineMaxLength populates an important and often underestimated niche: tools aiding in interface robustness. It could become useful if it were developed a bit further.

Update: Changed wording about documentation issues. Why has this section no preview button?


In reply to File::GetLineMaxLength by martin
