Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to search a string for all instances that match

size="10">##iliX

where X is a 1 or 2 digit number, and replace them with

size="10" text-anchor="end">##iliX

I've tried $text =~ s/size="10">##ili(\d{1,2})/size="10" text-anchor="end">##ili$1/g; but that doesn't seem to work.

What's the correct regex?

Replies are listed 'Best First'.
Re: Simple regex help
by Marshall (Canon) on Jan 13, 2011 at 19:18 UTC
    You are pretty close. Here is what I came up with. Capture the variable part (the numbers) so that you can use the same value in the replacement.
    #!/usr/bin/perl -w
    use strict;

    my $dat = 'size="10">20iliX 234bvc 234yyyyy size="10">35iliX';
    $dat =~ s/size="10">(\d{1,2})iliX/size="10" text-anchor="end">$1iliX/g;
    print $dat;
    #size="10" text-anchor="end">20iliX 234bvc 234yyyyy size="10" text-anchor="end">35iliX
    Ooops! Now I see that the variable part is the X and not the ##. That makes things slightly different: capture the X digits instead of the "##".
    my $dat = 'size="10">##ili22 234bvc 234yyyyy size="10">##ili33';
    $dat =~ s/size="10"\>##ili(\d{1,2})/size="10" text-anchor="end">##ili$1/g;
    print $dat;
    #size="10" text-anchor="end">##ili22 234bvc 234yyyyy size="10" text-anchor="end">##ili33
    Insert: Another oops, I just saw the post from ww. Can you show data from the real thing? My simple example works, but obviously something more complex is going on.
Re: Simple regex help
by AnomalousMonk (Archbishop) on Jan 13, 2011 at 20:55 UTC

    [Anonymonk]: In common with previous respondents, I can't understand your problem (just what does "doesn't seem to work" mean?) with the code given in the OP: it does just what I understand you want done. Please see example below. (I use 'H' in place of '#' because my little command line editor thinks all #s are comments. Also, I give an alternate approach that I think might be more flexible. It could even be done without captures, but let's not get ahead of ourselves.)

    >perl -wMstrict -le
    "my $text = 'foo size=\"10\">HHili1 bar size=\"10\">HHili22 baz';
     print qq{'$text'};
     ;;
     $text =~ s/size=\"10\">HHili(\d{1,2})/size=\"10\" text-anchor=\"end\">HHili$1/g;
     print qq{'$text'};
     ;;
     my $toxt = 'fee size=\"10\">HHili1 fie size=\"10\">HHili22 foe';
     print qq{'$toxt'};
     ;;
     my $pre    = qr{ size=\"10\" }xms;
     my $post   = qr{ >HHili\d{1,2} }xms;
     my $insert = ' text-anchor=\"end\"';
     $toxt =~ s{ ($pre) ($post) }{$1$insert$2}xmsg;
     print qq{'$toxt'};
    "
    'foo size="10">HHili1 bar size="10">HHili22 baz'
    'foo size="10" text-anchor="end">HHili1 bar size="10" text-anchor="end">HHili22 baz'
    'fee size="10">HHili1 fie size="10">HHili22 foe'
    'fee size="10" text-anchor="end">HHili1 fie size="10" text-anchor="end">HHili22 foe'
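
For the "without captures" variant mentioned above, one possible sketch uses \K (available from Perl 5.10) together with a lookahead. This is only an illustration, not necessarily what the poster had in mind; the sub name add_anchor is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: insert the attribute between size="10" and the
# following '>' without using any capture groups.
#   \K      keeps everything matched so far out of the replaced text
#   (?=...) checks what follows without consuming it
sub add_anchor {
    my ($text) = @_;
    $text =~ s/size="10"\K(?=>##ili\d{1,2})/ text-anchor="end"/g;
    return $text;
}

print add_anchor('foo size="10">##ili1 bar size="10">##ili22 baz'), "\n";
# prints: foo size="10" text-anchor="end">##ili1 bar size="10" text-anchor="end">##ili22 baz
```

Since nothing is captured, the replacement is just the inserted attribute; the text on either side is left untouched.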
Re: Simple regex help
by ww (Archbishop) on Jan 13, 2011 at 19:24 UTC
    "...doesn't seem to work?" How? Error message. Output not as desired? What?

    Insert: Blargh. misread OP. In any case, one possibility is:

    $string =~ s/size="10">##ili(\d{1,2})/size="10" text-anchor="end">##ili$1/;

    Update: Is your problem with using /g (i.e., trying to make multiple replacements) in a substitution? You'll find numerous nodes on this "gotcha" with Super Search.
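
To make the /g question concrete, here is a small sketch comparing the substitution with and without the flag. The sample string and the variable names are invented for the illustration.

```perl
use strict;
use warnings;

my $input = 'size="10">##ili1 size="10">##ili22';

# Without /g only the first match in the string is replaced.
(my $once = $input) =~ s/size="10">##ili(\d{1,2})/size="10" text-anchor="end">##ili$1/;

# With /g every match is replaced; s/// returns the number of substitutions.
my $all   = $input;
my $count = ($all =~ s/size="10">##ili(\d{1,2})/size="10" text-anchor="end">##ili$1/g);

print "$once\n";   # size="10" text-anchor="end">##ili1 size="10">##ili22
print "$all\n";    # size="10" text-anchor="end">##ili1 size="10" text-anchor="end">##ili22
print "$count\n";  # 2
```

Checking the return value of s///g is a cheap way to see how many replacements were actually made on the real data.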

Re: Simple regex help
by elef (Friar) on Jan 14, 2011 at 17:15 UTC
    Me too... I thought you had some escaping problem here, but no. The code in the OP works as intended.

    while (<DATA>) {
        s/size="10">##ili(\d{1,2})/size="10" text-anchor="end">##ili$1/g;
        print;
    }
    __DATA__
    size="10">##ili7
    size="10">##ili56
    size="10">##iliad


    As expected, this prints out
    size="10" text-anchor="end">##ili7
    size="10" text-anchor="end">##ili56
    size="10">##iliad

    I.e. it does the replacements on the first two lines and skips the third, as it has a letter instead of a number after the ##ili. I'm not sure what your problem is. Perhaps your data isn't as uniform as you thought it was: the source file may have variants of the string with extra whitespace etc., and some of these are not matched by the regex.
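
If stray whitespace does turn out to be the cause, a more tolerant pattern might look like the sketch below. The variants it handles (spaces around "=" and before ">") are only guesses at what the real file could contain, and the sub name add_anchor_tolerant is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: like the OP's substitution, but allows optional
# whitespace around '=' and around '>'. The two captures keep the original
# text on either side of the inserted attribute; whitespace between
# size="10" and '>' is dropped.
sub add_anchor_tolerant {
    my ($line) = @_;
    $line =~ s/(size\s*=\s*"10")\s*(>\s*##ili\d{1,2})/$1 text-anchor="end"$2/g;
    return $line;
}

for my $sample ('size="10">##ili7', 'size = "10" >##ili56', 'size="10">##iliad') {
    print add_anchor_tolerant($sample), "\n";
}
# prints:
# size="10" text-anchor="end">##ili7
# size = "10" text-anchor="end">##ili56
# size="10">##iliad
```

Whether this is appropriate depends on what the real data looks like, so it is worth checking a few of the lines that fail to match before loosening the pattern.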