in reply to Yet another regex bug.

I would derive this from the principle that removing the comment should make no difference to what the program does. That is, your example should behave exactly like /abc+/: it should match ab followed by one or more c. If it's acting the same as /(?:abc)+/, I'd call it a bug.
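One way to check which reading a given perl follows is to probe the pattern with two strings that tell the readings apart. This is only a rough sketch: classify is a hypothetical helper of mine, not anything from perlre, and the patterns are anchored with ^ and $ so that each probe string is accepted by exactly one reading. I make no claim about what your perl prints for the commented pattern, since that is the very question here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): report which reading
# a pattern follows, judged by two probe strings.
sub classify {
    my ($pattern) = @_;

    # Compile at run time inside eval so a pattern that this perl
    # rejects is reported instead of aborting the script.
    my $re = eval { qr/$pattern/ };
    return 'does not compile' unless defined $re;

    my $repeats_c   = 'abccc'  =~ $re ? 1 : 0;   # only /^abc+$/ accepts this
    my $repeats_abc = 'abcabc' =~ $re ? 1 : 0;   # only /^(?:abc)+$/ accepts this

    return 'like /abc+/'     if  $repeats_c && !$repeats_abc;
    return 'like /(?:abc)+/' if !$repeats_c &&  $repeats_abc;
    return 'neither';
}

# The first two patterns are the reference readings; the third is the
# one under discussion.
for my $pattern ('^abc+$', '^(?:abc)+$', '^abc(?#comment)+$') {
    printf "%-22s %s\n", $pattern, classify($pattern);
}
```

If the third line reports the same reading as the first, the comment really is invisible; if it agrees with the second, the + is being applied to the whole of abc.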

Re: Re: Yet another regex bug.
by hossman (Prior) on Nov 12, 2002 at 03:11 UTC

    That implies that "removing the comment" is the translation...

    /abc(?#comment)+/  ----> /abc(?#comment)+/ /abc+/
    

    Whereas I (and evidently at least 2 other people) expect the translation to be...

    /abc(?#comment)+/  ------> /abc()+/
    

    Updated: forgot to actually make the translation I was trying to show.

      Well, your first line is the identity transformation, which isn't removing anything. ;-)

      The parentheses surrounding the ?# are part of the syntax; that is, the comment marker in a regexp begins with (?#, not ?#. (Check perlre: all the "funny" extended-pattern elements start with (?, one of the reasons being that it's a mnemonic to "question" what's coming next.[1]) Thus if you're removing the comment you should remove the parentheses as well.

      [1] I don't buy the explanation, by the way, but it's there.