The line you are analyzing removes the match. As you may have already found out, s/left/right/g; matches left and replaces it with right. In your case, the right-hand side is an empty string, so the match is effectively removed.

The tricky thing here is that the lookbehind is not part of the match. So, this line:

$addr =~ s/(?<![|\s])\s{25,}[^|]+//g; # extra right-text

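finds a run of at least 25 whitespace characters (\s matches tabs and newlines as well as spaces) followed by one or more characters that are not | (this is our match), but only if that run is not immediately preceded by | or whitespace. The lookbehind only inspects the character before the run and consumes nothing, so that character stays in place. Then the matched block is removed.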

In the sample input you provided, there are 25 spaces between CITY and STATE ZIPCODE. So we have a match, and this line removes the whole match: those 25 spaces right after CITY, together with STATE ZIPCODE, up to (but not including) the | character. That is why you don't "grab the STATE ZIPCODE section": this nasty little line has already removed it. When you raise the minimum to 27 spaces, the 25-space gap no longer matches, so STATE ZIPCODE is not removed.
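
To see this for yourself, here is a minimal sketch. The record below is made up (I am only assuming your data looks roughly like pipe-delimited fields with a 25-space gap), and strip_right_text is a hypothetical helper that wraps the same substitution so the minimum count can be varied:

```perl
use strict;
use warnings;

# Hypothetical helper: the same substitution as in the original line,
# with the minimum whitespace run passed in instead of hard-coded to 25.
sub strip_right_text {
    my ($addr, $min) = @_;
    $addr =~ s/(?<![|\s])\s{${min},}[^|]+//g;
    return $addr;
}

# Made-up sample record: exactly 25 spaces between CITY and STATE ZIPCODE.
my $addr = "NAME|STREET|CITY" . (' ' x 25) . "STATE ZIPCODE|";

# Minimum of 25: the gap and STATE ZIPCODE are removed, the | is kept.
print strip_right_text($addr, 25), "\n";    # NAME|STREET|CITY|

# Minimum of 27: the 25-space gap is too short, so nothing is removed.
print strip_right_text($addr, 27), "\n";

# Lookbehind at work: this run follows a |, so it never matches.
my $after_pipe = "NAME|" . (' ' x 30) . "NOTE|";
print strip_right_text($after_pipe, 25), "\n";
```

The last case shows why the lookbehind lists both | and \s: without \s, the regex could simply skip the first space after the | and start the match one character later.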

You can try playing around with an online regex tester - http://regex101.com. It can make the way these expressions work a bit clearer. Good luck!

- Luke


In reply to Re: Negative Lookbehind question by blindluke
in thread Negative Lookbehind question by crusty_collins
