in reply to Regexes on Streams

It would be far better if Perl had internal support of some kind for this, but look at the following:
(?=\z(?{ die "end of string" })|)
This convoluted construct makes the match die if the engine reaches the end of the string at that point in the RE. You can trap this in an eval. Sprinkle these liberally through an RE and you are guaranteed to find out if the match reached the end of your current text at any point, which tells you to grab more text. You will need to tokenize the RE carefully to figure out where to insert the end-of-string tests. (And don't forget to remove them once you know that there is no more input to add!)
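
As a minimal sketch of the idea, assuming a reasonably modern Perl where a die inside a (?{ }) block propagates to the enclosing eval: the helper name first_number, its status strings, and the sample chunks are all hypothetical and only for illustration. The pattern looks for a run of digits and places the end-of-string test right after the greedy \d+, since that is the point where more input could change the match.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post).
# Returns ('match', $digits) on a trustworthy match,
#         ('more')           if the engine hit the end of the buffer,
#         ('fail')           if the pattern simply did not match.
sub first_number {
    my ($buf, $is_final) = @_;
    my $num;
    my $completed = eval {
        if ($is_final) {
            # No more input can arrive, so drop the end-of-string test.
            ($num) = $buf =~ /(\d+)/;
        }
        else {
            # Die if the run of digits stops exactly at the end of the buffer.
            ($num) = $buf =~ /(\d+)(?=\z(?{ die "end of string\n" })|)/;
        }
        1;
    };
    unless ($completed) {
        die $@ unless $@ eq "end of string\n";    # rethrow unrelated errors
        return ('more');
    }
    return defined $num ? ('match', $num) : ('fail');
}

# Feed the buffer block by block, as a stream reader might.
my @chunks = ("ab12", "34;c", "d");
my $buf    = '';
my ($status, $num);
while (@chunks) {
    $buf .= shift @chunks;
    ($status, $num) = first_number($buf, !@chunks);
    last if $status eq 'match';
}
print "$status: ", (defined $num ? $num : ''), "\n";
```

With only "ab12" in the buffer the digits run up to the end of the text, so the match dies and the caller reads another block instead of accepting "12" too early.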

I looked around for a simpler way to do this and didn't find one. It would be ideal if there were a regexp modifier to set "end of string" behaviour. But nobody seems to have implemented that...

Re: Re: Regexes on Streams
by tsee (Curate) on Oct 11, 2003 at 23:11 UTC
    Genius! Madness!
    This is such a good idea I'm jealous I didn't have it. No, wait. It's so mind-bogglingly hackish I'm glad... Whatever, it's just a very cool hack.
    I'll play with it and see how I can make the buffer extension work with it. The current implementation of the module (not on CPAN yet) features a somewhat simpler approach that requires that a match stay exactly the same before and after a buffer extension. Thus, if the user is knowledgeable enough to use regexes that match delimiters shorter than the block size they set to read per buffer extension, they're *fairly* safe.
    Anyway, I like the ${} approach better even if it's not going to work well. Just for the weirdness of it. :-)

    Steffen
      I didn't have it either. Ilya did. (Or at least had a trivial variation on it.)