I don't think I'm more experienced, but...

Your way of coming up with a "token" string amounts to trying different strings of unlikely characters repeated 6 times (e.g. "~~~~~~", "``````", etc.). If the underlying assumption is that the module will always be used to munge text data, why not use odd-ball control characters instead -- e.g. a string like "\x7f\x1f\x7f\x1f" is quite unlikely to show up in any human-readable text file, and it ought to serve your needs just as well as any "visible" character string. Even null bytes might do the trick.
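To make the control-character idea concrete, here is a minimal sketch. The helper name pick_token and the candidate list are my own inventions for illustration, not anything from your module; the only point is that each candidate is checked against the text before it is used.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module under discussion): return the
# first candidate token that does not occur anywhere in $text, or an empty
# return if every candidate collides.
sub pick_token {
    my ($text) = @_;
    my @candidates = (
        "\x7f\x1f\x7f\x1f",    # DEL / unit-separator pairs
        "\x1e\x1f\x1e\x1f",    # record-separator / unit-separator pairs
        "\0\0\0\0",            # null bytes, as a last resort
    );
    for my $token (@candidates) {
        return $token if index($text, $token) < 0;
    }
    return;
}

my $token = pick_token(qq{name,"Smith, John",42\n});
printf "token bytes: %s\n",
    join ' ', map { sprintf '%02x', ord($_) } split //, $token;
```

The check with index() still needs the whole text in hand, so by itself this only covers the non-streaming case.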

In any case, it seems like stream mode will complicate things for you quite a lot. The caller would need to pass a file handle, right? You'd have to figure out when you've read up to a record boundary (without reading the whole file). At first guess, that might involve reading fixed-length buffers and parsing them to know whether a buffer ends with a partial record, a partial field, or even a partial character (if handling UTF-8 data).
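Here is a rough sketch of what I mean by reading fixed-length buffers. Everything in it is an assumption on my part: the name make_reader, the newline record separator, double quotes as the only quoting character, and the tiny chunk size (chosen just so that buffers end in mid-record). It works on bytes and makes no attempt to handle a buffer that ends in mid-character.

```perl
use strict;
use warnings;

# Hypothetical reader: returns a closure that hands back one complete record
# per call, reading $chunk bytes at a time from $fh. A newline ends a record
# only when it falls outside double quotes.
sub make_reader {
    my ($fh, $chunk) = @_;
    my $buf = '';
    my $eof = 0;
    return sub {
        while (1) {
            # Rescan the buffer from the start, tracking quote state.
            my $in_quote = 0;
            for my $i (0 .. length($buf) - 1) {
                my $c = substr($buf, $i, 1);
                if ($c eq '"') {
                    $in_quote = !$in_quote;
                }
                elsif ($c eq "\n" && !$in_quote) {
                    # 4-arg substr removes the record from the buffer
                    return substr($buf, 0, $i + 1, '');
                }
            }
            if ($eof) {
                return unless length $buf;
                my $rest = $buf;    # unterminated final record
                $buf = '';
                return $rest;
            }
            # No complete record yet: pull in another fixed-length chunk.
            my $n = read($fh, my $more, $chunk);
            die "read failed: $!" unless defined $n;
            $eof = 1 if $n == 0;
            $buf .= $more;
        }
    };
}

my $data = qq{a,"x\ny",1\nb,c,2\n};
open my $fh, '<', \$data or die "open: $!";
my $next = make_reader($fh, 4);
while (defined(my $rec = $next->())) {
    printf "record of %d bytes\n", length $rec;
}
```

Rescanning the buffer from the top on every pass is wasteful; a real version would remember the scan position and quote state between reads. It is only meant to show how much bookkeeping stream mode drags in.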

(Or maybe you could set $/ based on the user-supplied "split" value -- except that the latter can be a regex, which won't work for $/. And in any case, you still need to parse to know when a read buffer ends in mid-record because the given $/ string happened to fall within a quoted field.)
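A small demonstration of the $/ problem, under the same assumptions as before (newline separator, double-quoted fields, sample data I made up). Reading with a literal $/ splits inside the quoted field, and the chunks have to be glued back together by counting quotes:

```perl
use strict;
use warnings;

my $data = qq{a,"x\ny",1\nb,c,2\n};
open my $fh, '<', \$data or die "open: $!";

my @chunks;
{
    # $/ is always a literal string; it is never treated as a pattern.
    local $/ = "\n";
    @chunks = <$fh>;
}
printf "%d chunks read\n", scalar @chunks;

# Re-join chunks until the number of double quotes seen so far is even,
# i.e. until we are no longer inside a quoted field.
my @records;
my $pending = '';
for my $chunk (@chunks) {
    $pending .= $chunk;
    my $quotes = ($pending =~ tr/"//);
    next if $quotes % 2;
    push @records, $pending;
    $pending = '';
}
printf "%d logical records\n", scalar @records;
```

With this sample the read yields three chunks for two logical records, so $/ alone doesn't spare you the parsing.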


In reply to Re: Munging Streamed Data by graff
in thread Munging Streamed Data by Ovid
