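Unfortunately, that doesn't seem to me to eliminate the problem entirely: with only \b boundaries, the 2202 at the end of the phone number and the 2202 in the date are still turned into links.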
Win8 Strawberry 5.8.9.5 (32) Tue 09/13/2022 15:47:56
C:\@Work\Perl\monks
>perl
use strict;
use warnings;
# use Data::Dump qw(dd); # for debug
my $unitslist = join "|", qw(
1001 1002 1003 1004 1101 1102 1103 1104 1201 1202 1203 1204 1301 1302
1303 1304 1401 1402 1403 1404 1501 1502 1503 1504 1601 1602 1603 1604
1701 1702 1703 1704 1801 1802 1803 1804 1901 1902 1903 1904 2001 2002
2003 2004 2101 2102 2103 2104 2201 2202 2203 2204 2301 2302 2303 2304
2401 2402 2403 2501 2502 2503 2504 2505
);
my $text =
"Visit units 1101 or 2202, call us at 555-555-2202 and
call before 13 Aug, 2202\n";
$text =~ s{ \b($unitslist)\b }
{\n<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>\n}xg;
print $text;
^Z
Visit units
<a href="apartments.pl?do_what=view&unit=1101"><b>1101</b></a>
or
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
, call us at 555-555-
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
and
call before 13 Aug,
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
Indeed, it doesn't seem that the problem can be eliminated entirely unless the input text is constrained to a much more specialized format, e.g., by uniquely delimiting every unit number substring: %1234% or {{1234}}. This would also allow easy support for unit numbers like 123A or 12-B.
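A minimal sketch of the delimiter idea, assuming the hypothetical {{...}} markup and a made-up set of valid units (neither is from the original script; the helper names unit_link and link_units are mine as well). Only text explicitly marked as a unit is considered, so phone numbers and dates are left alone no matter what digits they contain.

```perl
use strict;
use warnings;

# Assumption: valid units are known up front; this short list is illustrative.
my %is_unit = map { $_ => 1 } qw(1101 2202 123A 12-B);

# Build the link for a known unit; an unknown marked value is returned as plain text.
sub unit_link {
    my ($unit) = @_;
    return $is_unit{$unit}
        ? qq{<a href="apartments.pl?do_what=view&unit=$unit"><b>$unit</b></a>}
        : $unit;
}

# Replace every {{...}} marker; undelimited numbers are never touched.
sub link_units {
    my ($text) = @_;
    $text =~ s/ \{\{ ([^{}]+) \}\} / unit_link($1) /xge;
    return $text;
}

print link_units(
    "Visit units {{1101}} or {{12-B}}, call us at 555-555-2202 before 13 Aug, 2202\n"
);
```

The cost is that whoever writes the text has to mark the units, but the match then no longer depends on guessing context from the surrounding characters.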
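It's possible to mitigate somewhat the problems associated with completely free-form text by adding more boundary conditions. Below, negative look-behind and look-ahead assertions reject a unit number that is immediately preceded or followed by a hyphen, period or colon, so the phone number is no longer linked (although the 2202 after "13 Aug," still is).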
Win8 Strawberry 5.8.9.5 (32) Wed 09/14/2022 0:21:27
C:\@Work\Perl\monks
>perl
use strict;
use warnings;
# use Data::Dump qw(dd); # for debug
my ($rx_all_units) =
map qr{ (?<! [-.:]) \b (?: $_) \b (?! [-.:]) }xms,
join '|',
reverse sort
qw(
1001 1002 1003 1004 1101 1102 1103 1104 1201 1202 1203 1204 1301 1302
1303 1304 1401 1402 1403 1404 1501 1502 1503 1504 1601 1602 1603 1604
1701 1702 1703 1704 1801 1802 1803 1804 1901 1902 1903 1904 2001 2002
2003 2004 2101 2102 2103 2104 2201 2202 2203 2204 2301 2302 2303 2304
2401 2402 2403 2501 2502 2503 2504 2505
);
my $text =
"Visit units 1101 or 2202, call us at 555-555-2202 and
call before 13 Aug, 2202\n";
$text =~ s{ ($rx_all_units) }
{\n<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>\n}xg;
print $text;
^Z
Visit units
<a href="apartments.pl?do_what=view&unit=1101"><b>1101</b></a>
or
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
, call us at 555-555-2202 and
call before 13 Aug,
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>