Melroch has asked for the wisdom of the Perl Monks concerning the following question:

This is a bit of a contrived example, because what I really want to do is a bit complicated, but it illustrates the problem I'm having. Also, neither English nor Perl is my first language ;) so be forgiving!

Say I want to insert a numeral "5" between certain characters, so I write:

s/([a-m])([n-z])/$15$2/g;

Now this doesn't work: Perl thinks I'm talking about a variable $15 here and I get an error message. To work around this, I define a variable $five = 5; and rewrite the s/// as

s/([a-m])([n-z])/$1$five$2/g;

Now it works but is there a way to get it to work without that $five variable?

TIA,

/Melroch

Replies are listed 'Best First'.
Re: Inserting numbers between parenthese matches in regexp
by Happy-the-monk (Canon) on Jun 27, 2004 at 17:01 UTC

    Try curly braces around the 1 in $1, making it ${1}. Perl can then tell it apart from $15:

    s/([a-m])([n-z])/${1}5$2/g;
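
    A minimal sketch of the difference, using a made-up sample word that is not from the original post. As far as I know, the ambiguous form shows up as an "uninitialized value" warning under use warnings rather than a fatal error, so the sketch silences that warning to show what the substitution actually produces:

    ```perl
    use strict;
    use warnings;

    # Sample input is an assumption for illustration only.
    my $word = 'monks';

    # Ambiguous form: Perl reads "$15" as capture group 15, which never
    # matched, so it interpolates as empty and the first letter of each
    # matched pair is lost.
    my $broken = $word;
    {
        no warnings 'uninitialized';    # $15 is undef here
        $broken =~ s/([a-m])([n-z])/$15$2/g;
    }

    # Braced form: ${1} ends the variable name before the literal 5.
    my $fixed = $word;
    $fixed =~ s/([a-m])([n-z])/${1}5$2/g;

    print "$broken\n";    # ons
    print "$fixed\n";     # m5onk5s
    ```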

    Cheers, Sören

Re: Inserting numbers between parenthese matches in regexp
by dws (Chancellor) on Jun 27, 2004 at 17:17 UTC

    A second approach, besides the one mentioned above, is to exploit the power of the /e modifier, which makes Perl evaluate the right-hand side of the substitution as an expression. Then you could do something like:

    s/([a-m])([n-z])/$1 . myfunction($1, $2) . $2/eg;

    sub myfunction {
        my ($pre, $post) = @_;
        # we could do something smart with the pre- and post-
        # strings, but instead we'll just return a 5.
        return 5;
    }
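
    To give a feel for "something smart", here is a rough sketch with a hypothetical helper, letter_gap, which is my own invention and not part of the reply. It inserts the alphabetical distance between the two captured letters instead of a fixed 5; the sample word is likewise an assumption:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not from the post): returns how many letters
    # apart the two captured characters are in the alphabet.
    sub letter_gap {
        my ($pre, $post) = @_;
        return ord($post) - ord($pre);
    }

    my $word = 'monks';    # sample input, assumed for illustration

    # With /e the replacement is Perl code, so $1 sits next to the "."
    # operator and the "$15" ambiguity never arises.
    (my $result = $word) =~ s/([a-m])([n-z])/$1 . letter_gap($1, $2) . $2/eg;

    print "$result\n";    # m2onk8s
    ```
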
Re: Inserting numbers between parenthese matches in regexp
by Joost (Canon) on Jun 27, 2004 at 17:02 UTC
Re: Inserting numbers between parenthese matches in regexp
by BrowserUk (Patriarch) on Jun 27, 2004 at 19:25 UTC

    If you do it this way, you both avoid the problem and the need for capturing parens, so it's more efficient to boot.

    s[(?<=[a-m])(?=[n-z])][5]g;
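
    A small sketch comparing the two forms side by side, on sample words chosen only for illustration. The lookbehind and lookahead match the empty position between the two characters, so the replacement is just the 5:

    ```perl
    use strict;
    use warnings;

    # Sample words are assumptions for illustration.
    my @words = qw(hello perl monks);

    my (%captured, %lookaround);
    for my $word (@words) {
        # Capturing version: matches two characters and puts them back.
        ($captured{$word} = $word) =~ s/([a-m])([n-z])/${1}5$2/g;

        # Lookaround version: matches only the gap between the two
        # characters, so nothing is captured or re-inserted.
        ($lookaround{$word} = $word) =~ s[(?<=[a-m])(?=[n-z])][5]g;

        print "$word => $captured{$word} / $lookaround{$word}\n";
    }
    # hello => hell5o / hell5o
    # perl => pe5rl / pe5rl
    # monks => m5onk5s / m5onk5s
    ```

    The two agree here because [a-m] and [n-z] don't overlap. With overlapping classes they can differ: the capturing version consumes the second character, so it cannot also serve as the first character of the next match.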

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

      I'm curious. A question of style: why the square brackets as delimiters? There are no slashes in the pattern, so there's no leaning-toothpick problem to avoid.

      I had to think for an extra second or two to figure out what you were doing, which in fact is quite clever. Bear with me, I want to see what this looks like:

      s/(?<=[a-m])(?=[n-z])/5/g;

      I dunno, somehow that seems clearer to me. I think it is because otherwise I find my eyes skipping back and forth too much between the character classes and the s-op delimiters. (But ++ all the same, I'll keep this technique in mind.)

      - another intruder with the mooring of the heat of the Perl

        Basically, I got fed up with writing some regexes and substitutions with / as the delimiter, then having to choose a different delimiter to avoid leaning toothpicks, then yet another delimiter somewhere else to avoid conflicts with the first choice, and so on.

        Then I discovered that using balanced delimiters meant that I rarely had to switch delimiters. Of the choices, () are just too common in regex, {} look like code blocks and are also used in regex.

        I tried <> for a while, and they are a pretty good choice, but of the four, even though they are themselves fairly common in regexes, I found that I preferred []. So I now use them for all my regexes. I think I have only once had problems with them, and that was in a monster regex attempting to parse XML.
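
        A rough sketch of the point about balanced delimiters, using a hypothetical path picked only because it is full of slashes. With [] as the delimiter the slashes need no escaping, and a character class inside the pattern still works because its brackets are balanced:

        ```perl
        use strict;
        use warnings;

        # Hypothetical path, assumed for illustration.
        my $path = '/usr/local/lib/perl5';

        # Slash delimiters: every slash in the pattern and replacement
        # has to be escaped (the "leaning toothpicks").
        (my $with_slashes = $path) =~ s/\/usr\/local/\/opt/;

        # Bracket delimiters: slashes are ordinary characters, and the
        # nested [a-z] is fine because its brackets pair up.
        (my $with_brackets = $path) =~ s[/usr/local(?=/[a-z])][/opt];

        print "$with_slashes\n";     # /opt/lib/perl5
        print "$with_brackets\n";    # /opt/lib/perl5
        ```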

        Historically, I am a strong believer in consistency, and being able to use the same delimiter for all my regexes (and other quote-like constructs) just makes my code more self-consistent.

        Personally, I can't wait for characters to become consistently 32 bits wide. Not only would that do away with the variable byte-length encoding problem that comes from UTF, and the performance hit that entails, but it would also give more than enough space in the character set to introduce a dozen or so more sets of balanced pairs, which would alleviate much of this type of problem completely.

        I noticed that Bob Bemer, the father of ASCII, died recently. Maybe it's time to let ASCII go too and invent a completely new set of symbols for computing :)

