...
s/$1/$first/g;
...
s/$2/$second/g;
...
s/$3/$third/g;
...
s/$4/$fourth/g;
...
The critical thing to realize about this code is that the capture variables $2, $3 and $4 have never been set to any meaningful value. That is, they hold the undefined value undef. When the undefined value is interpolated into a string it interpolates as '' (the empty string), and when it is interpolated into a regex it produces // (the empty regex).
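A minimal sketch of that point, assuming a made-up input string (the original string isn't shown in this post, so 'Three, Four, One, Two' is only a stand-in): with a single capture group, only $1 is set by a successful match, and $2 stays undef.

```perl
use strict;
use warnings;

my $text = 'Three, Four, One, Two';   # hypothetical input, not from the original post

# The pattern has exactly one capture group, so a successful
# match sets $1 and leaves $2, $3, ... undefined.
$text =~ /([A-Z][a-z]+)/ or die "no match\n";

my $one_defined = defined $1;
my $two_defined = defined $2;

# Interpolating undef gives the empty string. With warnings on it also
# emits "Use of uninitialized value", silenced here for the demo only.
my $interpolated = do { no warnings 'uninitialized'; "[$2]" };

print "\$1 defined? ", ($one_defined ? 'yes' : 'no'), "\n";   # yes
print "\$2 defined? ", ($two_defined ? 'yes' : 'no'), "\n";   # no
print "interpolated \$2: '$interpolated'\n";                  # '[]'
```

The warning silenced above is the one to pay attention to in real code: it is Perl telling you that the capture variable was never set.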
...
/([A-Z][a-z]+)/g;
s/$2/$second/g;
...
This pair of statements, and each corresponding pair that follows it, is very interesting. I strongly recommend you insert the statement
print qq{=== '$_' \n}; # FOR DEBUG
or its equivalent after each of the s/// substitution statements to monitor what's going on with the progressive 'correction' of the initial string.
Here's a narrative. As you can see from the newly-added debug print statement, the first
/([A-Z][a-z]+)/g;
s/$1/$first/g;
statement pair actually does something expected and useful: it replaces the first number with 'One'. The output from the debug print statement is
=== 'One, Four, One, Two'
The second
/([A-Z][a-z]+)/g;
s/$2/$second/g;
statement pair replaces all numbers with 'Two'! The output from the debug print statement is
=== 'Two, Two, Two, Two'
The reason for this odd behavior is that when $2 with an undefined value interpolates into s/$2/$second/g; it produces the // empty regex match pattern. This pattern is special: it uses the last successful regex match pattern for matching. The last successful match pattern was in the /([A-Z][a-z]+)/g; statement immediately before the s/// substitution statement. Therefore,
s/$2/$second/g;
interpolates (ignoring, as you do, the warning message) as if it were
s//$second/g;
which matches as if it were
s/([A-Z][a-z]+)/$second/g;
which replaces each and every match (because of the /g modifier) against the ([A-Z][a-z]+) pattern (i.e., something that looks like a number) with, in this case, 'Two'. Whew!
And similarly for each subsequent //; s///; statement pair.
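Here's a self-contained sketch of the second statement pair that you can run on its own. It starts from the string shown in the first debug output above; the variable names $str and $second are just for illustration, and the match is done against a named variable rather than $_.

```perl
use strict;
use warnings;
no warnings 'uninitialized';   # demo only: hides the warning that $2 is undef

my $str    = 'One, Four, One, Two';   # state after the first statement pair
my $second = 'Two';

# This match succeeds, so ([A-Z][a-z]+) becomes the
# "last successful pattern" in this scope.
$str =~ /([A-Z][a-z]+)/g;

# $2 is undef, so the pattern interpolates to the empty regex, and
# Perl reuses the last successful pattern in its place.
$str =~ s/$2/$second/g;

print qq{=== '$str' \n};   # === 'Two, Two, Two, Two'
```

Remove the `no warnings` line and you get the "uninitialized value" warning back, which is the early clue that something is wrong.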
That ought to give you something to think about while you're reviewing the regex documentation.
(BTW: The /g modifier in the /([A-Z][a-z]+)/g; statement is at best useless and at worst confusing and corrupting. You cannot use the /g modifier in this way to "keep track" of match positions in successive matches. (The /c modifier in conjunction with the /g modifier does something like this in certain cases, but I don't really see how it could be adapted to serve here.) You will have to think of some other way to query the user about successive numbers in the original string so that they may be 'corrected' one by one.)
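One possible direction, offered as a rough sketch rather than the way to do it: let a single s///ge walk the string and compute each replacement as it goes. Everything here is an assumption for illustration: the input string, the @corrections list (standing in for answers you would get from the user), and the variable names.

```perl
use strict;
use warnings;

my $text        = 'Three, Four, One, Two';   # hypothetical input
my @corrections = qw(One Two Three Four);    # stand-in for the user's answers

my $i = 0;
(my $fixed = $text) =~ s{([A-Z][a-z]+)}
                        { $i < @corrections ? $corrections[$i++] : $1 }ge;

print "$fixed\n";   # One, Two, Three, Four
```

The /e modifier makes the replacement part an expression that is evaluated once per match, left to right, so each number gets its own correction. In an interactive version that expression could call a sub that shows $1 and reads the answer from STDIN.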
In reply to Re^5: RegExp substitution
by AnomalousMonk
in thread RegExp substitution
by Keystone