Inserting numbers between parenthese matches in regexp

Melroch has asked for the wisdom of the Perl Monks concerning the following question:

Re: Inserting numbers between parenthese matches in regexp by Happy-the-monk (Canon) on Jun 27, 2004 at 17:01 UTC
Try curly braces around `$1`, making it `${1}`, so that Perl no longer reads it as `$15`: `s/([a-m])([n-z])/${1}5$2/g;` Cheers, Sören
Re: Inserting numbers between parenthese matches in regexp by dws (Chancellor) on Jun 27, 2004 at 17:17 UTC
An alternative to the approach mentioned above is to exploit the power of the `/e` modifier, which makes Perl evaluate the right-hand side of the substitution as an expression. Then you could do something like: `s/([a-m])([n-z])/$1 . myfunction($1, $2) . $2/eg;` with `sub myfunction { my ($pre, $post) = @_; return 5; }`. Here `myfunction` could do something smart with the pre- and post- strings, but instead it just returns a 5.
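As a rough sketch of what "something smart" might look like, the helper below computes the inserted number from the two captured letters. The name `gap_between`, the distance calculation, and the sample input are all assumptions made for illustration; none of them come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: derives the number to insert from the two captured
# letters. It returns their distance in the alphabet, purely as an example.
sub gap_between {
    my ($pre, $post) = @_;
    return ord($post) - ord($pre);
}

my $text = 'an bz';   # sample input, not from the original question

# /e evaluates the replacement as Perl code for each match;
# /g repeats the substitution for every match in the string.
(my $result = $text) =~ s/([a-m])([n-z])/$1 . gap_between($1, $2) . $2/eg;

print "$result\n";   # prints "a13n b24z"
```

Because the replacement is ordinary Perl code under `/e`, the `$1 . 5` ambiguity never arises: the variable and the number are separate terms joined by the concatenation operator.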
Re: Inserting numbers between parenthese matches in regexp by Joost (Canon) on Jun 27, 2004 at 17:02 UTC
The replacement part of a substitution acts like a double-quoted string as far as variable interpolation is concerned, so this should work (untested): `s/([a-m])([n-z])/${1}5$2/g;` (note the braces in `${1}`). "What should it profit a man, if he should win a flame war, yet lose his cool?"
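A minimal sketch of the ambiguity, assuming a sample input of my own (the original question did not include one). Without the braces, `$15` is read as capture group 15, which is never set by this pattern:

```perl
use strict;
use warnings;

my $text = 'an bz';   # hypothetical sample input

# Ambiguous form: "$15" means capture group 15, which is undefined here
# and therefore interpolates as an empty string.
my $broken = $text;
{
    no warnings 'uninitialized';   # silence the undef warning for this demo
    $broken =~ s/([a-m])([n-z])/$15$2/g;
}

# Braced form: "${1}" ends the variable name before the literal 5.
(my $fixed = $text) =~ s/([a-m])([n-z])/${1}5$2/g;

print "broken: $broken\n";   # prints "broken: n z"
print "fixed:  $fixed\n";    # prints "fixed:  a5n b5z"
```

The same rule applies in any double-quoted string: `"${name}5"` and `"$name5"` refer to different variables.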
Re: Inserting numbers between parenthese matches in regexp by BrowserUk (Patriarch) on Jun 27, 2004 at 19:25 UTC
If you do it this way, you avoid both the problem and the need for capturing parens, so it's more efficient to boot. `s[(?<=[a-m])(?=[n-z])][5]g;` Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
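To see that the lookaround form gives the same result as the capturing form, here is a small side-by-side sketch. The sample input is an assumption, and the sketch checks only the output, not the efficiency claim, which would need a benchmark on real data.

```perl
use strict;
use warnings;

my $text = 'an bz mm';   # hypothetical sample input

# Capturing version: consumes both letters and writes them back.
(my $with_captures = $text) =~ s/([a-m])([n-z])/${1}5$2/g;

# Lookaround version: matches the empty position between the two letters,
# so only the 5 is inserted and nothing has to be copied back.
(my $with_lookaround = $text) =~ s[(?<=[a-m])(?=[n-z])][5]g;

print "$with_captures\n";     # prints "a5n b5z mm"
print "$with_lookaround\n";   # prints "a5n b5z mm"
```

The pair `mm` is left alone in both versions because the second letter falls outside `[n-z]`.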
Re:x2 Inserting numbers between parenthese matches in regexp by grinder (Bishop) on Jun 27, 2004 at 21:26 UTC
I'm curious. A question of style: why the square brackets as delimiters? There are no slashes in what is matched, so there's no problem regarding leaning toothpicks. I had to think for an extra second or two to figure out what you were doing, which in fact is quite clever. Bear with me, I want to see what this looks like: `s/(?<=[a-m])(?=[n-z])/5/g;` I dunno, somehow that seems clearer to me. I think it is because otherwise I find my eyes skipping back and forth too much between the character classes and the s-op delimiters. (But ++ all the same, I'll keep this technique in mind). - another intruder with the mooring of the heat of the Perl
Re^2: x2 Inserting numbers between parenthese matches in regexp by BrowserUk (Patriarch) on Jun 27, 2004 at 22:03 UTC
Basically, I got fed up with writing some regexes and substitutions with `/` delimiters, then having to choose a different delimiter to avoid leaning toothpicks, then a different delimiter somewhere else to avoid conflicts with the first choice, and so on. Then I discovered that using balanced delimiters meant that I rarely had to switch delimiters.

Of the choices, `()` are just too common in regex, and `{}` look like code blocks and are also used in regex. I tried `<>` for a while, and they are a pretty good choice, but of the four, despite the fact that they are themselves fairly common in regex, I found that I preferred `[]`. So I now use them for all my regexes. I think I only encountered one time when I had problems with using them, and that was in a monster regex attempting to parse XML. Historically, I am a strong believer in consistency, and being able to use the same delimiter for all my regexes (and other quote-like constructs) just makes my code more self-consistent.

Personally, I can't wait for characters to become consistently 32 bits. Not only would that do away with the variable byte-length encoding problem that comes from UTF and the performance hit that entails, but it would give more than enough space in the character set to introduce a dozen or so more sets of balanced pairs that would alleviate much of this type of problem completely. I noticed that Bob Bemer, the father of ASCII, died recently. Maybe it's time to let ASCII go too and invent a completely new set of symbols for computing :)
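A small sketch of the leaning-toothpicks problem that balanced delimiters sidestep. The path and the replacement are made-up values, chosen only because they contain slashes.

```perl
use strict;
use warnings;

my $path = '/usr/local/bin/perl';   # hypothetical input

# With "/" as the delimiter, every slash in the pattern and the
# replacement has to be escaped.
(my $toothpicks = $path) =~ s/\/usr\/local\//\/opt\//;

# With balanced square brackets, the slashes need no escaping.
(my $balanced = $path) =~ s[/usr/local/][/opt/];

print "$toothpicks\n";   # prints "/opt/bin/perl"
print "$balanced\n";     # prints "/opt/bin/perl"
```

Square brackets that appear inside the pattern, such as character classes, are fine as long as they are balanced; an unpaired bracket would need a backslash.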