htmanning has asked for the wisdom of the Perl Monks concerning the following question:

Monks, I asked a question a while ago about detecting numbers in a field variable. I'm now using the following. It's not perfect, but it works.
my @numbers4 = $text =~ /\b \d{4} \b/gx;
foreach $unit4 (@numbers4) {
    $text =~ s/$unit4/\<a href=\"script.pl?do=view&unit=$unit4\"\>\<b\>$unit4\<\/b\>\<\/a\>/i;
}
It doesn't parse out phone numbers, but oh well. The issue I'm having is that if someone enters the same unit number in the field multiple times, the replace line screws up the URL of the first instance, because the unit number now appears in the URL. It ends up like this:
<a href="script.pl?do=view&unit=<a href="script.pl?do=view&unit=2003"><b>2003</b>
How can I work around this?

Re: Detecting numbers continued
by AnomalousMonk (Archbishop) on Apr 24, 2015 at 01:20 UTC

    You give no example of a failing $text string in your most recent post nor in the previous ones (Pull 3-digit and 4-digit numbers from string, Recognizing numbers and creating links and Matching text in a string), but the advice offered is the same: do a single substitution. File s_4_digits_1.pl:

    use warnings;
    use strict;

    my $text = qq{foo 1234 \n 1234 bar \n1234\n 1 22 333 55555 666666};
    print qq{[[$text]] \n};

    my $digits_4 = qr{ \b \d{4} \b }xms;

    $text =~ s{ ($digits_4) }
              {<a href="script.pl?do=view&unit=$1"><b>$1</b></a>}xmsg;
    print qq{[[$text]] \n};
    Output:
    c:\@Work\Perl\monks\htmanning>perl s_4_digits_1.pl
    [[foo 1234
     1234 bar
    1234
     1 22 333 55555 666666]]
    [[foo <a href="script.pl?do=view&unit=1234"><b>1234</b></a>
     <a href="script.pl?do=view&unit=1234"><b>1234</b></a> bar
    <a href="script.pl?do=view&unit=1234"><b>1234</b></a>
     1 22 333 55555 666666]]
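For anyone wondering why the single substitution behaves differently from the loop in the question, here is a minimal sketch that runs both on the same input. The sub names link_in_loop and link_in_one_pass and the sample string are made up for illustration; the link format is the one from the question. With /g, the regex engine carries on scanning after the text it just replaced, so the inserted markup is never searched again. The loop restarts from the beginning of the string on every iteration, so the second pass finds the number inside the href written by the first pass.

```perl
use strict;
use warnings;

# Hypothetical helper: the loop-based approach from the question.
sub link_in_loop {
    my ($text) = @_;
    my @numbers4 = $text =~ /\b\d{4}\b/g;
    for my $unit4 (@numbers4) {
        # No /g here: this replaces the first occurrence of $unit4 anywhere
        # in $text, including inside an href added by an earlier iteration.
        $text =~ s{$unit4}{<a href="script.pl?do=view&unit=$unit4"><b>$unit4</b></a>};
    }
    return $text;
}

# Hypothetical helper: one substitution with /g over the whole string.
sub link_in_one_pass {
    my ($text) = @_;
    $text =~ s{\b(\d{4})\b}{<a href="script.pl?do=view&unit=$1"><b>$1</b></a>}g;
    return $text;
}

my $sample = "2003 and 2003";
print link_in_loop($sample), "\n";       # first link ends up with a nested href
print link_in_one_pass($sample), "\n";   # two clean links
```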


    Give a man a fish:  <%-(-(-(-<

      Thank you so much for this. It works! This still tags 4-digit numbers as part of a phone number, but that's okay. I'd love to hire you for several projects as I'm just a hack.

        I posted this solution in your previous thread, but maybe I was late and you missed it. It does the job while skipping phone numbers. If you don't understand any of it, just ask.

        #!/usr/bin/env perl
        use 5.010;
        use strict;
        use warnings;

        my $text = <<END;
        Apt.302. 123-4567. 4021 207-555-1531 Apt. 987
        END

        $text = " $text ";  # pad it so my regex will work if a number
                            # appears at the beginning or end of line

        $text =~ s|(          # start capturing in $1
                     [^\d-]   # any single character other than a digit or dash
                   )          # end capture of $1
                   (          # start capturing in $2
                     \d\d\d   # three digits in a row
                     \d?      # zero or one digits (for a total of 3-4)
                   )          # end capture of $2
                   (          # start capture of $3
                     [^\d-]   # any single character other than a digit or dash
                   )          # end capture of $3
                  |$1<a href="\?unit=$2"><b>$2</b></a>$3|gx;  # Replace captured
                                                               # sections with HTML added

        $text = substr($text, 1,-1);  # trim off padding
        say $text;

        Aaron B.
        Available for small or large Perl jobs and *nix system administration; see my home node.
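A possible variation on the padded regex above, offered as a sketch rather than a drop-in replacement: the same "no digit or dash on either side" rule can be written with lookaround assertions, which check the neighbouring characters without consuming them. That removes the need for the padding and the three captures. The sub name link_units is hypothetical, and the sample text is the heredoc content from the post above on a single line.

```perl
use strict;
use warnings;

# Hypothetical helper name; the href format follows the post above.
sub link_units {
    my ($text) = @_;
    $text =~ s{
        (?<![\d-])   # not preceded by a digit or dash
        (\d{3,4})    # three or four digits, captured in group 1
        (?![\d-])    # not followed by a digit or dash
    }{<a href="?unit=$1"><b>$1</b></a>}gx;
    return $text;
}

print link_units("Apt.302. 123-4567. 4021 207-555-1531 Apt. 987"), "\n";
```

One behavioural difference worth knowing: because the lookarounds consume nothing, two unit numbers separated by a single character (for example "302 4021") are both linked. In the capturing version the separator is consumed as part of the first match, so the second number has no leading character left to match against.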

        You're very welcome. I'm not in the job market right now. Have you tried Perl Jobs?

        Another alternative is this site, where you can often find copious help for free! As aaron_baugher has alluded, several approaches to the phone number exclusion problem have already been given in answers to your previous posts. To get the most out of this site, you must be willing to give useable input and output data in addition to clear descriptions of your problem and, one hopes, your efforts at a solution. I.e., you must help us to help you.


        Give a man a fish:  <%-(-(-(-<

Re: Detecting numbers continued
by Anonymous Monk on Apr 24, 2015 at 00:20 UTC
    Don't do the regex substitution in a loop. Use quotemeta and join to create a single regex from @numbers4:
      my @nums = qw/ 3 9 2 9 /;
      {
          my $re = join '|', map { "\Q$_\E" } @nums;
          my $str = 'and a 3 9 2 9 ';
          $str =~ s{($re)}{"$1"}g;
          print "$str\n";
      }
      {
          my $re = join '\s*', map { "\Q$_\E" } @nums;
          my $str = 'and a 3 9 2 9 ';
          $str =~ s{($re)}{"$1"}g;
          print "$str\n";
      }
      __END__
      and a "3" "9" "2" "9"
      and a "3 9 2 9"
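Applied to the original question, the quotemeta-and-join idea might look roughly like the sketch below. The sub name link_known_units and the sample text are assumptions for illustration; the link format is the one from the question. Duplicates are dropped before building the alternation, and the word boundaries keep a listed number from matching inside a longer run of digits.

```perl
use strict;
use warnings;

# Hypothetical helper: link every listed unit number in a single pass.
sub link_known_units {
    my ($text, @units) = @_;
    return $text unless @units;   # an empty alternation would match everywhere

    my %seen;
    my $alternation = join '|',
        map  { quotemeta($_) }    # escape any regex metacharacters
        grep { !$seen{$_}++ }     # keep the first occurrence of each number
        @units;

    $text =~ s{\b($alternation)\b}{<a href="script.pl?do=view&unit=$1"><b>$1</b></a>}g;
    return $text;
}

my $text     = "Units 2003, 2003 and 1999; not 20031";
my @numbers4 = $text =~ /\b\d{4}\b/g;   # (2003, 2003, 1999)
print link_known_units($text, @numbers4), "\n";
```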