I suppose this is a bit of an expert question.
perlvar states that "[...] the value of "$/" is a string, not a regex.". Which is sad. And no longer strictly true.

Doing regex matching on streams is tricky at best, and to work really well in all cases it would require a different regular expression engine than Perl's.

So I wrote File::Stream. Not implementing a regular expression engine (insert maniac laughter here), but implementing "regexes on streams" by means of progressive buffering -> matching -> buffer expansion -> matching... (see below for some comments on inherent problems with this approach)
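To make the buffering -> matching -> buffer expansion loop concrete, here is a minimal sketch of the idea. It is not File::Stream's actual code: make_record_reader, its chunk-size argument and the in-memory test data are all made up for illustration, and it deliberately ignores the problems discussed further down.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::Stream): returns an iterator that
# yields one record per call, each ending with the text matched by $sep.
sub make_record_reader {
    my ($fh, $sep, $chunk_size) = @_;
    $chunk_size ||= 1024;
    my $buffer = '';

    return sub {
        while (1) {
            # Matching step: look for the separator in what we have so far.
            # Empty matches at offset 0 are skipped so the loop makes progress.
            if ($buffer =~ $sep and $+[0] > 0) {
                # $+[0] is the offset just past the match; cut the record
                # (separator included) off the front of the buffer.
                return substr($buffer, 0, $+[0], '');
            }

            # Buffer expansion step: append another chunk and try again.
            my $read = read($fh, $buffer, $chunk_size, length($buffer));
            die "read failed: $!" unless defined $read;

            if ($read == 0) {
                # End of input: return the remainder once, then undef.
                return length($buffer) ? substr($buffer, 0, length($buffer), '') : undef;
            }
        }
    };
}

# Usage with an in-memory filehandle and a tiny chunk size.
my $data = "a , b,c";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
my $next = make_record_reader($fh, qr/\s*,\s*/, 4);
while (defined(my $record = $next->())) {
    print "[$record]\n";
}
```

Like readline with a string in $/, this sketch leaves the separator attached to each record. Note that the buffer only grows while no match is found, which is exactly where the memory concern comes from.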

With the current implementation, you can already do things like this:

use File::Stream;
my $stream = File::Stream->new($filehandle);
$/ = qr/\s*,\s*/;
print "$_\n" while <$stream>;

It can also do quite a bit more, so consider having a look at the module's synopsis, which I won't paste here to save screen space. A few important problems, however, remain.

Most importantly, infinite regexes on streams tend to introduce infinite strings into your memory. Too bad we don't live in the ideal Turing machine world, but this can't be helped.
Furthermore, given that regexes are used on the current buffer, they may match less than they would if the next X bytes were also part of the buffer. Like the former issue, this likely cannot be fixed for good.
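The second problem is easy to reproduce without any stream at all. The following sketch matches the same separator once against a partial buffer and once against the full input. find_separator is a hypothetical helper written for this post, and treating a match that touches the end of the buffer as "tentative" is just one possible heuristic, not something File::Stream is claimed to do.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the separator text found in $buffer and a
# flag saying whether the match ran right up to the end of the buffer.
# Returns an empty list if the separator does not match at all.
sub find_separator {
    my ($buffer, $sep) = @_;
    return unless $buffer =~ $sep;
    my $matched     = substr($buffer, $-[0], $+[0] - $-[0]);
    my $touches_end = ($+[0] == length($buffer)) ? 1 : 0;
    return ($matched, $touches_end);
}

my $sep = qr/\s*,\s*/;

# The separator straddles the boundary between two chunks: "a ," then "  b".
my ($short, $tentative) = find_separator("a ,",    $sep);
my ($long)              = find_separator("a ,  b", $sep);

printf "partial buffer: %d characters matched (tentative: %s)\n",
    length($short), $tentative ? 'yes' : 'no';
printf "full input:     %d characters matched\n", length($long);
```

A match that touches the end of the buffer could have been longer had more data been available, so a reader could expand the buffer and retry in that case. That only reduces the problem: with lookahead or alternation, a match that ends well before the end of the buffer can still change once more input arrives.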

Steffen


In reply to Regexes on Streams by tsee
