in reply to Negative Lookbehind question
The line of code you are analyzing removes the match. As you may have already found out, s/left/right/g; finds every match of left and replaces it with right. In your case, the right-hand side is an empty string, so the match is effectively removed.
The tricky thing here is that the lookbehind is not part of the match. So, this line:
$addr =~ s/(?<![|\s])\s{25,}[^|]+//g; # extra right-text
finds a run of at least 25 whitespace characters followed by one or more characters that are not | (this is our match), but only if that run is not immediately preceded by | or whitespace. The whole match is then removed.
In the sample input you provided, there are 25 spaces between CITY and STATE ZIPCODE. So we have a match, and this line removes the whole match - those 25 spaces right after CITY, together with STATE ZIPCODE, up until the | character. That is why you don't "grab the STATE ZIPCODE section" - because this nasty little line grabs it earlier, and removes it. When you raise the threshold to 27 spaces, the 25-space gap is too short to match, so STATE ZIPCODE is not removed.
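Here is a minimal sketch of that behaviour. The record below is hypothetical (I made it up to have the same shape as described: CITY, then exactly 25 spaces, then STATE ZIPCODE, then a |), and the variable names $gap, $with_25 and $with_27 are mine, not from your script:

```perl
use strict;
use warnings;

# Hypothetical record, not the actual sample from the question:
# CITY is followed by exactly 25 spaces, then STATE ZIPCODE, then |.
my $gap  = ' ' x 25;
my $addr = "NAME|STREET|CITY${gap}STATE ZIPCODE|COUNTRY";

# The original substitution, with a threshold of 25 whitespace characters.
# The lookbehind only checks the character before the run (here "Y");
# it is not part of the match, so "CITY" itself is left alone.
(my $with_25 = $addr) =~ s/(?<![|\s])\s{25,}[^|]+//g;

# The same substitution with the threshold raised to 27.
# The 25-space gap is now too short, so nothing matches.
(my $with_27 = $addr) =~ s/(?<![|\s])\s{27,}[^|]+//g;

print "25: $with_25\n";   # 25: NAME|STREET|CITY|COUNTRY
print "27: ", ($with_27 eq $addr ? "unchanged" : "changed"), "\n";   # 27: unchanged
```

With the threshold at 25, the spaces and STATE ZIPCODE disappear but CITY stays intact, which shows that the character tested by the lookbehind is never removed.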
You can try playing around with an online regex tester - http://regex101.com. It can make it clearer how these expressions work. Good luck!
- Luke