rsriram has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a tag that needs to be sequentially numbered. The text goes like this...

<exm>This is example 1 <exm>This is example 2 <exm>This is example 3 <exm>This is example 4

I need to number the <exm> elements sequentially for as long as they follow one another:

<exm num="1">This is example 1 <exm num="2">This is example 2 <exm num="3">This is example 3 <exm num="4">This is example 4

If the same set of elements appear in a different part of the text, the number needs to restart.

The code I wrote to achieve this is:

$_ =~ s/<exm>/'<exm num="'.$exmno++.'">'/egs; if ($_ != /<exm>/g) {$exmno=0}
I have the content stored in $_. I don't know why, but I am not able to get the numbers to reset when a line does not contain an <exm> tag. Can anyone help me out?

Re: Restarting counters in text
by davorg (Chancellor) on Aug 10, 2006 at 11:16 UTC
    if ($_ != /<exm>/g)

    You need to go back and have another read of perlop.

    You're saying "if $_ isn't numerically equal to the value returned when you try to match /<exm>/ against $_". What you're trying to say is "if $_ doesn't match /<exm>/". That's written as:

    if ($_ !~ /<exm>/)

    or (perhaps easier to understand)

    if (!/<exm>/)

    When all else fails, it's always worth reading the documentation for the syntax you're trying to use :-)
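To make the difference concrete, here is a minimal sketch comparing the two tests on a line with a tag and a line without one. The helper names `numeric_test` and `no_match_test` are made up for illustration, and the `/g` flag from the original post is left out to keep the comparison simple.

```perl
use strict;
use warnings;

# What the original test computes: a numeric comparison between the line and
# the match result (1 on success, empty string on failure).
sub numeric_test {
    my ($line) = @_;
    no warnings 'numeric';    # silence "isn't numeric" for the text line
    return ( $line != ( $line =~ /<exm>/ ) ) ? 1 : 0;
}

# What was intended: true when the line does not match the pattern.
sub no_match_test {
    my ($line) = @_;
    return ( $line !~ /<exm>/ ) ? 1 : 0;
}

for my $line ( '<exm>This is example 1', 'Some other text' ) {
    printf "%-24s  != gives %d, !~ gives %d\n",
        $line, numeric_test($line), no_match_test($line);
}
```

For these two inputs the numeric test comes out as the opposite of the intended one, which would explain a counter that never resets on lines without the tag.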

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: Restarting counters in text
by rodion (Chaplain) on Aug 10, 2006 at 12:28 UTC
    And a one-liner version is
    $exmno=0 unless s/<exm>/'<exm num="'.++$exmno.'">'/egs;
    Also note that, unlike in the OP, the "++" here goes before the "$exmno". With the counter starting at 0, a post-increment gives you
    <exm num="0">This is example 1
    instead of
    <exm num="1">This is example 1

    I assume you want the numbers to match.
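A small sketch of the pre- versus post-increment difference, assuming the counter starts at 0. The helper `number_tags` and its `$pre` flag are hypothetical, added only to run both variants side by side.

```perl
use strict;
use warnings;

# Hypothetical helper: number every <exm> tag in $text, using either
# pre-increment ($pre true) or post-increment ($pre false).
sub number_tags {
    my ( $text, $pre ) = @_;
    my $n = 0;
    if ($pre) {
        $text =~ s/<exm>/'<exm num="' . ++$n . '">'/eg;
    }
    else {
        $text =~ s/<exm>/'<exm num="' . $n++ . '">'/eg;
    }
    return $text;
}

print number_tags( '<exm>one <exm>two', 0 ), "\n";  # <exm num="0">one <exm num="1">two
print number_tags( '<exm>one <exm>two', 1 ), "\n";  # <exm num="1">one <exm num="2">two
```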

Re: Restarting counters in text
by Jasper (Chaplain) on Aug 10, 2006 at 11:27 UTC
    further to the other comment (correcting != to !~, I guess), if you have the whole content stored and sub out all instances of <exm>, then the second resetting line will never happen.

    You need to do it line by line, and s///s does the whole thing at once.

    Edit: This is utter nonsense, of course. What I meant to say was:

    then the second resetting line will ALWAYS happen.

    I don't know what way these lines are coming in - whether he's getting blocks of code with exm's, and never gets a second line containing exms that he wants to be contiguous. I've no idea, but I had sort of confused myself a bit.

    Sorry for the late edit
      Hi. Further to the above comments, here is code that might work for you.
      $_ =~ s/<exm>/'<exm num="'.$exmno++.'">'/egs; if ($_ !~ /^<exm.*?$/) { $exmno=0; }
      Refer to the syntax rules before anything fails :-) -Dinakar

      Edited - Added code tags (davorg)

        Please use <code> tags to make your code easier to read.

        if ($_ !~ /^<exm.*?$/)

        The ".*?$" on the end of that regex is pointless.


Re: Restarting counters in text
by Moron (Curate) on Aug 10, 2006 at 11:37 UTC
    (updated) In your "if" construct, the "!=" operator coerces both sides to numeric. The effect on the if statement is that it is equivalent to /^(\d+)/ and ( $1 ) and $exmno=0; which is probably not what you wanted. If the line were stored in a declared variable, what you would want instead would be
    if ( $variable !~ /^<exm>/ ) { $exmno = 0 };
    but because the variable to match is $_, this is the default variable for the match operator and it is therefore sufficient to write:
    if ( !/^<exm>/ ) { $exmno = 0 };
    or even shorter is
    /^<exm>/ or $exmno = 0;
    Two further points are that 1) this test is anyway redundant because s// returns false for non-match and can be used directly to drive the reset to zero and 2) if the counter is zero until it encounters an exm line, and if the first exm line in a group is number 1, then the counter needs to be pre- rather than post-incremented, e.g.:
    my $exmno = 0;
    while (<>) {
        # /e evaluates the replacement as code; s/// is false when nothing matched
        s/^<exm>/'<exm num="'.++$exmno.'">'/e or $exmno = 0;
        print $_;
    }
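As a rough illustration of the reset, here is the same idea applied to an in-memory list instead of <>. It assumes each example sits on its own line and starts with the tag; the sub name `renumber_lines` and the sample lines are invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: number leading <exm> tags line by line, restarting
# the count whenever a line without a leading tag interrupts the run.
sub renumber_lines {
    my @lines = @_;    # work on a copy of the caller's lines
    my $exmno = 0;
    for my $line (@lines) {
        # s/// returns false when nothing was replaced, which triggers the reset
        $line =~ s/^<exm>/'<exm num="' . ++$exmno . '">'/e or $exmno = 0;
    }
    return @lines;
}

my @out = renumber_lines(
    '<exm>This is example 1',
    '<exm>This is example 2',
    'Some text in between',
    '<exm>This is example 1',
);
print "$_\n" for @out;
```

The last line comes out with num="1" again because the untagged line in between set the counter back to zero.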

    -M

    Free your mind

Re: Restarting counters in text
by sh1tn (Priest) on Aug 10, 2006 at 13:16 UTC
    use strict;
    use warnings;

    $_ = <<SQ;
    <exm>This is example 1 <exm>This is example 2 <exm>This is example 3 <exm>This is example 4
    SQ

    my $n = 0;    # separate counter, so $_ is not modified while s///eg runs
    s/<exm>/'<exm num="' . ++$n . '">'/eg;
    print;
    or just take the number from the text itself:
    s/<exm>(?=.+?example\s+(\d+))/<exm num="$1">/g;
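A quick sketch of the lookahead variant in use. It only works under the assumption that every tag is followed, on the same line, by the word "example" and its number; the wrapper name `number_from_text` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical wrapper: copy the number found in the following text into
# the tag. The lookahead captures the nearest "example N" after each tag.
sub number_from_text {
    my ($text) = @_;
    $text =~ s/<exm>(?=.+?example\s+(\d+))/<exm num="$1">/g;
    return $text;
}

print number_from_text('<exm>This is example 1 <exm>This is example 2'), "\n";
# <exm num="1">This is example 1 <exm num="2">This is example 2
```

Since no counter is involved, nothing needs resetting, but a tag whose own text lacks the number would pick up the number of the next example on that line, or stay unchanged if there is none.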