in reply to Re: Re: Regex backreference problem.
in thread Regex backreference problem.

Because character classes are determined when the regex is compiled (which is not necessarily when the Perl statement that contains the regex is compiled). There is no regex 'node' for "character class that consists of these hard-coded characters plus the characters in this backreference". The only character-class regex node type is "hard-coded list of characters", built when the regex was compiled (not after it ran partway and figured out what $1 might end up being).
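
To make the point concrete, here is a minimal sketch (the test strings and variable names are only illustrative). It compares a class containing \1 with a negative lookahead followed by a single dot, which is the usual way to say "one character that is not what $1 captured":

```perl
use strict;
use warnings;

# Inside a bracketed class, \1 is not a backreference
# (Perl reads it as an octal escape for chr(1)).
my $class_version = qr/^(.)[^\1]$/;

# A lookahead is evaluated while matching, so \1 here really is
# "whatever group 1 captured".
my $lookahead_version = qr/^(.)(?!\1).$/s;

for my $str (qw(aa ab)) {
    printf "%s: class=%s lookahead=%s\n", $str,
        ($str =~ $class_version     ? 'match' : 'no match'),
        ($str =~ $lookahead_version ? 'match' : 'no match');
}
```

The class version accepts both strings because the class was fixed at compile time. Only the lookahead version rejects "aa".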

                - tye

Re: Re^3: Regex backreference problem. (compile-time)
by BrowserUk (Patriarch) on Oct 10, 2003 at 07:28 UTC

    That makes sense. Thanks.

    I still wish there was an easier way to say "Don't match this character (or this constant) here, but consume the appropriate number of characters".

    ((?!something).{length_of_something})

    Works okay when you know the length of something, but if something comes from a backreference, then you don't (always).
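
For the constant case, a rough sketch of that idiom might look like this. The constant 'foo' and the names $something, $len and $not_something are made up for illustration:

```perl
use strict;
use warnings;

my $something = 'foo';               # hypothetical constant to reject
my $len       = length $something;   # known up front because it is a constant

# "Not $something here, but still consume length($something) characters."
my $not_something = qr/((?!\Q$something\E).{$len})/s;

# $1 on success, undef where the constant sits at the start of the string.
my @got = map { /^$not_something/ ? $1 : undef } 'foobar', 'barfoo';

print defined $_ ? "consumed '$_'\n" : "rejected\n" for @got;
```

The quantifier {$len} is interpolated when the pattern is compiled, which is exactly why the same trick is not available when the length depends on a capture made during the match.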

    While I'm wishing, I'd also like it if lookbehinds didn't prejudge the issue of whether the pattern was variable length. I tried to do

    ([abc]) .* (?<!\1)(something)

    but I guess that this is the same issue. It doesn't know that \1 is fixed length (even though it has already seen the capture parens and could determine that it is), because this regex could be incorporated into another one containing a further set of captures that precedes the one seen, which would shift the goal posts, as it were.
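
One possible workaround, sketched under the assumption that the capture is always a single character (as with [abc]) and that the pattern is written with /x: move the test into a negative lookahead that guards the one character just before (something). The test strings are illustrative only.

```perl
use strict;
use warnings;

# (?!\1) . means "one character that differs from what group 1 captured",
# placed immediately before (something). This stands in for (?<!\1) when
# group 1 is exactly one character long.
my $re = qr/ ([abc]) .* (?!\1) . (something) /xs;

for my $str ('a-xsomething', 'a-asomething', 'bxasomething') {
    if ($str =~ $re) {
        print "$str: matched, \$1=$1 \$2=$2\n";
    }
    else {
        print "$str: no match\n";
    }
}
```

This always requires at least one character between the capture and (something). That loses nothing relative to the lookbehind, since with nothing in between the preceding character would be the capture itself and the lookbehind would fail anyway.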

    {Sigh} Maybe in P6, capture parens to $1, $2 etc. will be done away with in favour of a capture to named variables. You can do this now with (?{ $var = $^N }), which is useful, but it has a bad effect on performance.
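
A small sketch of the (?{ }) plus $^N idiom, with made-up variable names and input. Package variables are used here on the assumption that lexicals inside code blocks can misbehave on older perls:

```perl
use strict;
use warnings;

our ($key, $value);   # hypothetical names for the "named" captures

# $^N holds the text of the most recently closed capture group, so each
# code block copies the group just before it into a named variable.
if ('colour=red' =~ /^(\w+)(?{ $key = $^N })=(\w+)(?{ $value = $^N })$/) {
    print "$key -> $value\n";
}
```

Note that a code block runs every time the engine passes over it, including on paths that are later backtracked, so in patterns that backtrack heavily the variables may be assigned several times before the final match.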


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail