in reply to Regex: remove non-adjacent duplicate hashtags

$tags=~/(#\S+).+$1/;

Why is it a bad idea to compile a capture variable, e.g., $1, into a regex as haukex and LanX have noted?

Capture variables have the values they were assigned by the most recent successful m// or s/// match, or undef if no successful match has ever been done.
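The word "successful" matters here: a failed match leaves the capture variables untouched. A minimal sketch of that behavior follows; the sample strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

# A successful match sets $1.
'first' =~ /(fir)/ or die 'expected a match';
my $after_success = $1;

# A failed match does not reset $1; it keeps the value
# assigned by the last successful match.
my $matched = 'second' =~ /(xyz)/;
my $after_failure = $1;

print "after success: $after_success\n";
print "after failure: $after_failure\n";
```

Both lines print the same captured text, because the second match failed and so never assigned to $1.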

Win8 Strawberry 5.8.9.5 (32) Sat 07/23/2022 16:42:27
C:\@Work\Perl\monks
>perl
use strict;
use warnings;

use Data::Dump qw(dd);

dd 'A: $1', $1;

my $rx = qr/(#\S+).+$1/;
print "B: $rx \n";

'OOPS' =~ /(OOPS)/;

$rx = qr/(#\S+).+$1/;
print "C: $rx \n";
^Z
("A: \$1", undef)
Use of uninitialized value in concatenation (.) or string at - line 8.
B: (?-xism:(#\S+).+)
C: (?-xism:(#\S+).+OOPS)

In this example, $1 is undefined at A because no match that assigned a defined value to it has ever been done.

At B, a regex is compiled using the undefined $1. Compilation produces an "uninitialized value" warning, and the compiled regex has nothing where $1 was. This is because an undefined value is stringified as the empty string. This is a good example of why warnings (and strict!) should always be enabled. You make no mention of any warning message, so I assume you did not do this. I turn the arched eyebrow of scorn upon you.

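At C, the same regex is compiled again after $1 has been given a defined value by a match. The result in this case is a potentially very problematic bug: the compiled regex now requires the literal text OOPS after the hashtag instead of a repeat of the captured tag. Note that no warning message is generated! Good luck with this one.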
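To make the effect of case C concrete, here is a small sketch that compares the regex built with an interpolated $1 against one that uses the backreference \1, which refers to whatever group 1 captured during the match in progress. The variable names and sample strings are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Give $1 a stale value, as in case C above.
'OOPS' =~ /(OOPS)/ or die 'expected a match';

# $1 is interpolated when qr// is evaluated, so this is /(#\S+).+OOPS/.
my $stale = qr/(#\S+).+$1/;

# \1 is a backreference to group 1 of the match in progress.
my $backref = qr/(#\S+).+\1/;

my $dup    = '#perl #regex #perl';    # contains a repeated hashtag
my $no_dup = '#perl #regex OOPS';     # no repeated hashtag

for my $str ($dup, $no_dup) {
    printf "%-20s stale: %-3s  backref: %s\n",
        $str,
        ($str =~ $stale   ? 'yes' : 'no'),
        ($str =~ $backref ? 'yes' : 'no');
}
```

The stale regex misses the real duplicate and instead matches the string that happens to end in OOPS; the \1 version reports a match only for the string with the repeated tag.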

The take-away: always use warnings and strict.


Give a man a fish:  <%-{-{-{-<