in reply to Re: Re: Regular Exp parsing
in thread Regular Exp parsing

Zapawork: \A..\z is just as efficient as ^..$

\A..\z should be used to anchor a pure string, whereas ^..$ should be used to anchor a line. In most cases the difference is subtle enough that there is effectively none (this is why cookbook examples, and a lot of existing code, can get away with never using \A..\z). Still, it is proper to be accurate. If it is not expected or acceptable for a string to end with '\n', \z should be used instead of $.

For example:

if ($ARGV[0] =~ /^-o$/) { ... }

This will match "-o" or "-o\n". For command line arguments, "-o\n" should not be allowed. The more accurate expression is:

if ($ARGV[0] =~ /\A-o\z/) { ... }
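To make the difference concrete, here is a minimal sketch that runs both patterns against a few inputs. The helper names is_dash_o_loose and is_dash_o_strict are made up for this illustration and are not from any real code base.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
# Loose: $ also matches just before a single trailing "\n".
sub is_dash_o_loose  { my ($arg) = @_; return $arg =~ /^-o$/   ? 1 : 0 }
# Strict: \z matches only at the very end of the string.
sub is_dash_o_strict { my ($arg) = @_; return $arg =~ /\A-o\z/ ? 1 : 0 }

for my $arg ("-o", "-o\n", "-o\n\n") {
    (my $shown = $arg) =~ s/\n/\\n/g;    # make newlines visible when printing
    printf "%-10s loose=%d strict=%d\n",
        qq("$shown"), is_dash_o_loose($arg), is_dash_o_strict($arg);
}
```

Only the plain "-o" should pass the strict check. The loose check also accepts "-o\n", but not "-o\n\n", since $ tolerates at most one trailing newline.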

The reason I am so rigid about this point is that I have been hit by the difference in production code. I am now very strict about using \A..\z for strings and ^..$ for lines.

Re: Re: Re: Re: Regular Exp parsing
by Zapawork (Scribe) on Dec 13, 2002 at 21:44 UTC
    Hi Mark,

    That's great information. I didn't know that \n would not be literally matched when using $ as an anchor. I normally chomp all my strings before they get to that point, so I hadn't encountered it. Knowing this now, though, is there a reason why? Does $ assume EOL characters?

    BTW - Did you mean to put a \z in your initial example?

    Dave -- Saving the world one node at a time

      The $ thing is due to legacy behaviour, and the fact that when most people say $, they mean "end of string, or end of line, but not the end-of-line character itself." There is no question that $ is one of the most useful regexp primitive operators there is. It is just that people are very comfortable using it, so it sometimes gets used in places where it is questionable or, very rarely, in places where problems can arise.
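
      As a rough sketch of those rules, the snippet below compares $, \Z, \z and $ under /m on three strings. The helper end_anchor_matches and the sample strings are invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports which end anchors let "foo" match in $s.
sub end_anchor_matches {
    my ($s) = @_;
    my %result = (
        dollar   => ($s =~ /foo$/  ? 1 : 0),  # end of string, or before a final "\n"
        big_z    => ($s =~ /foo\Z/ ? 1 : 0),  # same positions as $ without /m
        small_z  => ($s =~ /foo\z/ ? 1 : 0),  # only the very end of the string
        dollar_m => ($s =~ /foo$/m ? 1 : 0),  # with /m: before any "\n", or at the end
    );
    return \%result;
}

for my $s ("foo", "foo\n", "foo\nbar") {
    my $r = end_anchor_matches($s);
    (my $shown = $s) =~ s/\n/\\n/g;           # make newlines visible when printing
    printf "%-12s \$=%d \\Z=%d \\z=%d \$/m=%d\n",
        qq("$shown"), @{$r}{qw(dollar big_z small_z dollar_m)};
}
```

      For "foo\n", everything except \z should match. For "foo\nbar", only $ with /m should match, because only /m lets $ anchor at an interior newline.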

      In my initial example, I used ':' instead of '\z', because the original example looked as if the year was trailed with a ':' and since I didn't know exactly what was after the ':', I figured it would be simpler to just not care, and align the regexp based on the ':'. In the original example, the ':' may have been a typo, in which case I probably would have used \z as you suggest.

      Cheers,
      mark