Ananda has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks

I am trying to use a regex to extract a specific piece of a string, using the code below.

my $string = '<School type="content" name="Schoolpage"><section name="Graduation"><grade name="Iyear"><!-- Set description for BBCA --><description>';
if ($string =~ m/(<section (.*)">?)/) {
    print "Here :: $1 \n";
} else {
    print "regex failed";
}

The output of the above is:

Here :: <section name="Graduation"><grade name="Iyear">

What I want is '<section name="Graduation">'.

Please advise.

Thanks in advance

Ananda

Replies are listed 'Best First'.
Re: Regex help reqd
by prasadbabu (Prior) on Jun 27, 2005 at 06:38 UTC

    .* is greedy: it matches as much as it can. Use .*? to make it non-greedy, so that it matches as little as possible.

    There are several ways to do this. As gopal suggested, you can use [^>]+, or you can use .*?

    See perlre for the details.

    if ($string =~ m/<section [^>]+>/) {
        print "Here :: $& \n";
    } else {
        print "regex failed";
    }

    or

    if ($string =~ m/<section .*?>/) {
        print "Here :: $& \n";
    } else {
        print "regex failed";
    }

    But the first option is the safer solution, because [^>]+ can never match past the closing > of the tag.
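
    To illustrate the difference between the greedy, non-greedy, and negated-class forms, here is a minimal sketch run against the string from the question. The helper first_match and the two "Iyear" patterns are assumptions added for illustration; they are not part of the original posts.

    ```perl
    use strict;
    use warnings;

    # The string from the question (line-wrap markers removed).
    my $string = '<School type="content" name="Schoolpage"><section name="Graduation"><grade name="Iyear"><!-- Set description for BBCA --><description>';

    # Hypothetical helper: returns the text matched by $re, or undef on failure.
    sub first_match {
        my ($text, $re) = @_;
        my ($matched) = $text =~ /($re)/;
        return $matched;
    }

    # Greedy: .* runs to the last ">" in the string.
    my $greedy  = first_match($string, qr/<section .*>/);
    # Non-greedy: .*? stops at the first ">" that lets the match succeed.
    my $lazy    = first_match($string, qr/<section .*?>/);
    # Negated class: [^>]+ cannot consume a ">" at all.
    my $negated = first_match($string, qr/<section [^>]+>/);

    # With more pattern after the quantifier, .*? may still cross a tag
    # boundary to find a match, while [^>]* refuses to.
    my $lazy_cross  = first_match($string, qr/<section .*?name="Iyear">/);
    my $class_cross = first_match($string, qr/<section [^>]*name="Iyear">/);

    for my $result (
        [ 'greedy',      $greedy ],
        [ 'lazy',        $lazy ],
        [ 'negated',     $negated ],
        [ 'lazy_cross',  $lazy_cross ],
        [ 'class_cross', $class_cross ],
    ) {
        my ($label, $value) = @$result;
        print "$label: ", (defined $value ? $value : '(no match)'), "\n";
    }
    ```

    Here lazy and negated both give <section name="Graduation">, but lazy_cross runs on into the <grade> tag while class_cross simply fails to match.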

    Prasad

      Thanks Gopal and Prasad. That was of great help.

      Ananda
      As an aside, how would "</section>" be matched?
        This will do the trick:

        if ($string =~ m/<\/section>/) {
            print "Here :: $& \n";
        } else {
            print "regex failed";
        }
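
        A variation on the same test, as a rough sketch: with m{...} as the delimiters the slash needs no backslash, and a capture group can stand in for $&. The sample string below is an assumption made for illustration, since the string in the original question contains no </section> at all.

        ```perl
        use strict;
        use warnings;

        # Hypothetical sample with a closing tag; not the string from the question.
        my $string = '<section name="Graduation"><grade name="Iyear"></grade></section>';

        # Braces as delimiters, so "/" does not have to be escaped.
        my ($closing) = $string =~ m{(</section>)};
        if (defined $closing) {
            print "Here :: $closing\n";
        } else {
            print "regex failed\n";
        }

        # Opening and closing tag together: capture the name and the content.
        # The /s flag lets "." match newlines in case the content spans lines.
        my ($name, $body) = $string =~ m{<section name="([^"]*)">(.*?)</section>}s;
        print "name=$name body=$body\n" if defined $name;
        ```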
Re: Regex help reqd
by gopalr (Priest) on Jun 27, 2005 at 06:25 UTC
    my $string = '<School type="content" name="Schoolpage"><section name="Graduation"><grade name="Iyear"><description>';
    if ($string =~ m/<section [^>]+>/) {
        print "Here :: $& \n";
    } else {
        print "regex failed";
    }

    Output:

    Here :: <section name="Graduation">