in reply to Re^3: Replacing 3 and 4 digit numbers.
in thread Replacing 3 and 4 digit numbers.

This is a log for an apartment building. Each apartment/unit number is 3 or 4 digits. When a log entry is entered, I use the code to link the 3- or 4-digit number to an informational page about the apartment that is generated by resident-info.pl. It works well, except that people sometimes update the log and enter a date in the log entry itself. When that happens, the year is tagged as a unit number when it shouldn't be. Also, there have been times when someone has included HTML tags in the log entry, perhaps with a URL containing a number that is not a unit number. The code above then picks out part of that HTML and tries to insert a new link, with bold tags around the number, into the manually entered HTML. That screws things up. I hope that's clear.

Re^5: Replacing 3 and 4 digit numbers.
by AnomalousMonk (Archbishop) on Apr 10, 2016 at 23:21 UTC

    One approach might be to extend the SKIP-FAIL trick for excluding URLs to also exclude dates:
        $text =~ s{ (?: $url | $date) (*SKIP) (*FAIL) | ($digits_3_4) }{...}xmsg

    Of course, this leaves you with the headache of trying to define a regex to match every possible format of date that a human bean might imagine. Here's a start, but please be aware that this code is untested and also that the $date regex does not nearly cover every possible permutation of day-month-year ordering or the many internal separator sequences that might be used; you will have to extend this definition as needed. (Also note that the $yr pattern is limited to the 21st century.)
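
    As a rough sketch of how the pieces might fit together (Perl 5.10 or later is assumed for the backtracking verbs): the $url, $sep and $date patterns, the link_units() helper and the "unit" query parameter below are all hypothetical stand-ins for illustration, and the date pattern covers only a few numeric day/month/year layouts.

```perl
use strict;
use warnings;
use 5.010;    # (*SKIP) and (*FAIL) need Perl 5.10 or later

# Hypothetical patterns, for illustration only; extend them to suit the real log.
my $url  = qr{ https?:// [^\s"'<>]+ }xms;
my $yr   = qr{ 20 \d\d }xms;                 # 21st-century years only
my $sep  = qr{ [-/.] }xms;                   # assumed separators
my $date = qr{
      \b \d{1,2} $sep \d{1,2} $sep $yr \b    # 4/10/2016, 10-04-2016
    | \b $yr $sep \d{1,2} $sep \d{1,2} \b    # 2016-04-10
}xms;
my $digits_3_4 = qr{ (?<! \d) \d{3,4} (?! \d) }xms;

# Hypothetical helper: link bare 3- or 4-digit numbers, but step over
# anything that looks like a URL or a date without changing it.
sub link_units {
    my ($text) = @_;
    $text =~ s{ (?: $url | $date ) (*SKIP) (*FAIL) | ($digits_3_4) }
              {<a href="resident-info.pl?unit=$1"><b>$1</b></a>}xmsg;
    return $text;
}

print link_units('Unit 204 called on 4/10/2016 about http://example.com/1234/x'), "\n";
```

    With this input, only 204 should end up wrapped in a link; the year in the date and the 1234 inside the URL are consumed by the first alternative and then skipped.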

    (This regex is 5.8.9 compatible.) The first thing you will want to do is write a Test::More script to test your $date regex against every possible date format you've ever encountered and any others you can imagine.
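
    Such a test script might start out along these lines. This is only a sketch: the $date pattern is a hypothetical one repeated here so the script stands alone, and the sample strings are invented; substitute your own regex and real entries from your log.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical date pattern; replace with the regex you actually use.
my $yr   = qr{ 20 \d\d }xms;                 # 21st-century years only
my $sep  = qr{ [-/.] }xms;                   # assumed separators
my $date = qr{
      \b \d{1,2} $sep \d{1,2} $sep $yr \b
    | \b $yr $sep \d{1,2} $sep \d{1,2} \b
}xms;

# Invented samples: strings that must be seen as dates, and strings that must not.
my @should_match = ('4/10/2016', '10-04-2016', '2016-04-10', '10.4.2016');
my @should_not   = ('204', '1234', 'Unit 2016');

plan tests => scalar(@should_match) + scalar(@should_not);

like(   $_, qr{ \A $date \z }xms, "matches date '$_'"     ) for @should_match;
unlike( $_, qr{ $date }xms,       "rejects non-date '$_'" ) for @should_not;
```

    Each new date format you run across becomes one more entry in @should_match, and each unit number that gets wrongly treated as part of a date goes into @should_not.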

    BTW: You have never said if your version of Perl is 5.10 or later, so I don't even know if the SKIP-FAIL trick is possible for you. What version of Perl are you using?


    Give a man a fish:  <%-{-{-{-<