Starting with tilly's idea, and attempting to generalise it, I came up with this.

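    #! perl -slw
    use strict;
    use re 'eval';

    sub Re_Stream {
        my( $re_user, $extend ) = @_;
        die "Usage: Re_Stream( regex, coderef )"
            unless defined $re_user and ref $extend eq 'CODE';

        # At the end of the buffer (\Z), call the extender: if it returns
        # true the empty "yes" branch succeeds, otherwise (?!) forces failure.
        # Everywhere else, just try the user's regex unmodified.
        return qr[
            (?: \Z (?(?{ $extend->() })|(?!)) ) | $re_user
        ]x;
    }

    my $buf = 'abcdefghijklmnopqrstuvwxyz';
    my $c = 'A';

    # Appends 100 copies of the current letter; returns false once 'Z' is used up.
    sub extend{ $buf .= ($c++) x 100; return length $c < 2 }

    my $re_stream = Re_Stream( qr[(..)(...)], \&extend );
    print $re_stream;

    my $i = 0;
    print "${ \++$i }: $1|$2" while $buf =~ m[$re_stream]g;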

The sub Re_Stream() takes a regex and a coderef. The regex can be any regex (in theory :), and the coderef should be a function that extends the stream beyond its current limit. This function should return true if it has extended the stream, and false if there is no more to come.
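
To make the extender contract concrete, here is a minimal sketch of a coderef that appends from a filehandle. The name make_extender, the chunk size, and the in-memory filehandle are all assumptions made for illustration; none of them come from the code above. It assumes "true" means "some data was appended".

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not part of the original post):
# builds an extender that appends up to $chunk bytes from $fh to the
# buffer referenced by $buf_ref.
sub make_extender {
    my( $fh, $buf_ref, $chunk ) = @_;
    $chunk ||= 4096;
    return sub {
        # The 4th argument to read() is the offset at which to store the
        # data, so passing the current length appends to the buffer.
        my $got = read( $fh, $$buf_ref, $chunk, length( $$buf_ref ) );
        return $got ? 1 : 0;    # undef (error) and 0 (EOF) both mean "no more"
    };
}

# An in-memory filehandle stands in for a real stream here.
my $data = 'abcdefghij';
open my $fh, '<', \$data or die "open: $!";

my $buf    = '';
my $extend = make_extender( $fh, \$buf, 4 );

my $calls = 0;
$calls++ while $extend->();
print "$calls reads, buffer is '$buf'\n";
```

A coderef built this way could be passed as the second argument to Re_Stream(), provided the match is run against the same buffer that the coderef appends to.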

As coded, the while loop running the regex will continue to match against the stream until the extender function returns false. I'm not sure if this is progress. The upside is that you no longer have to inspect the user's regex in order to work out where to insert the code block that extends the buffer. In fact, you don't have to modify the user's regex at all. However, there are a couple of problems with it as it stands.

  1. If the match crosses the boundary of the buffer being extended, a null match is returned.
  2. Any attempt I made to pre-truncate the string, i.e. to discard some part of the front of the string that had already been processed, seemed to "confuse" the regex.
  3. As is, it requires use re 'eval'; which may or may not be a problem.
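
For comparison, here is a rough sketch of a different technique that sidesteps the three problems by moving the refill out of the regex and into an ordinary loop. This is not the Re_Stream() approach: the name match_stream and its callback interface are hypothetical, and it assumes the extender returns true exactly when it appended data. It relies on the /c modifier to keep pos() after a failed match, trims the consumed prefix, refills, and retries. It only copes with boundary-crossing matches for patterns that fail outright on incomplete data; a greedy pattern such as \d+ could still match a token that is cut short at the end of the buffer.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original post): calls
# $on_match with the captures of each match of $re, refilling the buffer
# through $extend whenever no further match is found.
sub match_stream {
    my( $re, $buf_ref, $extend, $on_match ) = @_;
    while( 1 ) {
        # /g continues from pos(); /c leaves pos() alone when the match fails.
        while( $$buf_ref =~ /$re/gc ) {
            my @caps;
            for my $n ( 1 .. $#+ ) {
                push @caps, defined( $-[$n] )
                    ? substr( $$buf_ref, $-[$n], $+[$n] - $-[$n] )
                    : undef;
            }
            $on_match->( @caps );
        }
        # Discard everything up to the end of the last successful match.
        my $done = pos( $$buf_ref ) || 0;
        substr( $$buf_ref, 0, $done, '' );
        last unless $extend->();
        pos( $$buf_ref ) = 0;    # restart at the front of the trimmed buffer
    }
}

# Usage: three chunks whose boundaries fall in the middle of a match.
my @chunks = ( 'abcdefg', 'hijklmn', 'opq' );
my $buf    = '';
my @seen;

match_stream(
    qr/(..)(...)/,
    \$buf,
    sub { return 0 unless @chunks; $buf .= shift @chunks; return 1 },
    sub { push @seen, join '|', @_ },
);

print "$_\n" for @seen;
print "left over: '$buf'\n";
```

With these chunks the matches come out as ab|cde, fg|hij and kl|mno, with 'pq' left unmatched in the buffer. The second and third matches each straddle a chunk boundary, and no use re 'eval' is needed because the interpolated qr// object contains no code blocks.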

I've only made a half-hearted attempt at fixing these so far, but thought that I would throw it open to see if anyone else can take it further, or dismiss it as unworkable.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!


In reply to Re: Regexes on Streams (a partial solution?) by BrowserUk
in thread Regexes on Streams by tsee
