in reply to [bugs?] perldoc perlre, \G and pos()

BUT what's really confusing me is that pos($str) is empty afterwards!

pos is updated by every /g search. It is either advanced on success or reset on a match failure (unless you use /c). If it didn't reset, a match somewhere in the program could affect an unrelated match elsewhere in the program.
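A minimal sketch of that behaviour, assuming scalar-context /g matches; the string and variable names are made up for illustration:

```perl
use strict;
use warnings;

my $str = "aab";

# A successful /g match in scalar context advances pos().
$str =~ /a/g;
my $after_first = pos($str);     # just past the first "a"

$str =~ /a/g;
my $after_second = pos($str);    # just past the second "a"

# A failed /g match resets pos() to undef.
$str =~ /a/g;                    # fails: no "a" left after offset 2
my $after_fail = pos($str);

# With /c, a failed match leaves pos() where it was.
pos($str) = 2;
$str =~ /a/gc;                   # fails again
my $after_fail_c = pos($str);

printf "%s %s %s %s\n",
    map { defined $_ ? $_ : 'undef' }
    $after_first, $after_second, $after_fail, $after_fail_c;
```

The last two values are the ones that matter here: without /c the failure wipes the position, with /c it is kept.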

produces [...] and an endless loop!

Position zero is the start of the string. It doesn't surprise me that it thinks it hasn't matched yet.
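For comparison, a small sketch of a plain \G pattern, where an undefined pos and pos == 0 both anchor at the start of the string. This is not the code from the original post (which is not quoted here); the string and variable names are assumptions for illustration only:

```perl
use strict;
use warnings;

my $str = "abc";

# pos() is undef on a fresh string; \G then anchors at offset 0.
my $from_undef;
$from_undef = $1 if $str =~ /\G(.)/g;

# Explicitly setting pos() to 0 anchors \G at the same place.
pos($str) = 0;
my $from_zero;
$from_zero = $1 if $str =~ /\G(.)/g;

# Any other offset moves the anchor accordingly.
pos($str) = 2;
my $from_two;
$from_two = $1 if $str =~ /\G(.)/g;

print "$from_undef $from_zero $from_two\n";
```

The odd cases discussed in this thread come from patterns that consume text before \G, such as /.\G/, not from this simple form.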

It lets you do something silly like /.\G/ assuming you know what you are doing. Expect problems if you break that trust by trying to match the character before the start of the string and nothing else.

Re^2: [bugs?] perldoc perlre, \G and pos()
by LanX (Saint) on Sep 29, 2009 at 14:50 UTC
    If you look closely at the code I posted you will see that position zero (or "uninitialized") produces inconsistent results.

    It's not that I care so much which result it produces, as long as they are consistent!

    But depending on a magically hidden memory or state is for sure a profound error.

    Cheers Rolf

    UPDATE:

    Position zero is the start of the string. It doesn't surprise me that it thinks it hasn't matched yet.

    An endless loop could only be the result of infinitely repeating matches, not of the opposite. And following the documentation I quoted, it should always return the length of the match after \G, which is clearly zero, so there is no need for an endless loop.

      Like I said, GIGO. You're trying to make it start and end at pos -1

      An endless loop could only be a result of infinitely repeating matches not of the opposite.

      I said it *thinks* it hasn't matched yet. A reasonable belief when pos == 0.

        And I said if it "*thinks* it hasn't matched yet" the while condition is false not true!

        Cheers Rolf

      It's not that I care so much which result it produces, as long as they are consistent!
      I'd much rather have bugs that produce inconsistent results than consistent results. If a bug produces consistent results, there will be code that relies on it, and it will be (politically) harder to fix the bug. If the results were inconsistent anyway, fixing the bug is very unlikely to break existing code.
        A very good - "political"- point! 8)

        Anyway I didn't say that consistency is more important than functionality!

        Let me explain: if we're talking about rare edge cases, any consistent output can be, from a higher point of view, a valid interpretation.

        An example: let's take a binary operator x which is used on two boolean values A and B, such that

         x | 0  1   (B)
        ---+------
         0 | 0  1
         1 | 1  ?
        (A)

        ? denotes the edge case of A=B=1 (in ikegami's words GI, i.e. garbage input), which produces inconsistent output, sometimes 0 and sometimes 1, which causes many people to avoid this unpredictable operator.

        Now, surprisingly, making it consistent is always a win. With ?=1 we get the OR operator, with ?=0 it's XOR!
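The table can be checked mechanically. A small sketch, where make_op is a hypothetical helper (not anything from Perl itself) that pins the edge case to a chosen value and compares the result against Perl's | and ^ operators:

```perl
use strict;
use warnings;

# Hypothetical helper: build the operator x with the edge case
# (A=1, B=1) pinned to $edge; the other three cells follow the table.
sub make_op {
    my ($edge) = @_;
    return sub {
        my ($p, $q) = @_;
        return $edge if $p && $q;
        return ($p || $q) ? 1 : 0;
    };
}

my $as_or  = make_op(1);   # ? = 1
my $as_xor = make_op(0);   # ? = 0

# Collect every input pair where the pinned operator disagrees
# with the built-in bitwise OR / XOR.
my @mismatch;
for my $p (0, 1) {
    for my $q (0, 1) {
        push @mismatch, "or($p,$q)"  if $as_or->($p, $q)  != ($p | $q);
        push @mismatch, "xor($p,$q)" if $as_xor->($p, $q) != ($p ^ $q);
    }
}

print @mismatch ? "mismatches: @mismatch\n" : "both variants agree\n";
```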

        OK, it's a simplified example and our case is more complicated, but I hope you get my point about why consistency is, technically, a win.

        In our problem ?

        Well it's the edge case of resetting pos. The result of the next match is inconsistent for an input of pos=undef and pos=0, which doesn't really make much sense.

        So consistency is the first step on the way to killing bugs; your point is more about whether improvements should be done step by step!

        Cheers Rolf
