gnieddu has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I'm practicing regular expressions. I tried using the /g switch, as in the following piece of code:

$string = "<name=\"foo\"><anystring 1 /></name><name=\"bar\"><anystring 2 /></name>";
while ($string =~ m/<name=\"(.*)\">(.*)<\/name>/g) {
    print "$1, $2\n";
}

The idea was to get two lines like:

foo, <anystring 1 />

bar, <anystring 2 />

Instead I get a single line like:

foo"><anystring 1 /></name><name="bar, <anystring 2 />

Can anybody explain why, and how I can modify the code to get what I want?

Thanks

Replies are listed 'Best First'.
Re: Can't understand why /g does not work as I expect
by toolic (Bishop) on Mar 19, 2010 at 12:56 UTC
    The /g modifier is working just fine, but your regex is too greedy (see perlreref): each .* matches as much as it can, so the first one runs all the way to the last "> in the string. Change .* to .*?
    $string = "<name=\"foo\"><anystring 1 /></name><name=\"bar\"><anystring 2 /></name>";
    while ($string =~ m/<name=\"(.*?)\">(.*?)<\/name>/g) {
        print "$1, $2\n";
    }
    __END__
    foo, <anystring 1 />
    bar, <anystring 2 />
    If this is XML, I recommend using a parser instead of regular expressions. It will save you a lot of trouble.

    As an aside, you could avoid excessive back-whacking in your string by using a different quote character. For example, switch to single quotes. In my opinion, this makes the code easier to understand and maintain.

    $string = '<name="foo"><anystring 1 /></name><name="bar"><anystring 2 /></name>';

    Similarly, in your regex, there is no need to escape ", nor is there a need to escape / if you switch to a different delimiter, such as {}

    while ($string =~ m{<name="(.*?)">(.*?)</name>}g)
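Putting the two suggestions above together (single-quoted string, m{} delimiters, non-greedy quantifiers), a complete script might look like the sketch below. The helper name name_pairs is made up for illustration and is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns one "name, body"
# string for each <name="...">...</name> pair found in the input.
sub name_pairs {
    my ($string) = @_;
    my @pairs;
    # Non-greedy .*? stops at the first "> and at the first </name>.
    while ($string =~ m{<name="(.*?)">(.*?)</name>}g) {
        push @pairs, "$1, $2";
    }
    return @pairs;
}

my $string = '<name="foo"><anystring 1 /></name><name="bar"><anystring 2 /></name>';
print "$_\n" for name_pairs($string);
```

With single quotes and {} delimiters, neither the string nor the pattern needs a single backslash.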

      Hi folks,

      thanks to all of you for your really useful help!

Re: Can't understand why /g does not work as I expect
by ww (Archbishop) on Mar 19, 2010 at 13:18 UTC

    The issue is not so much any obvious misunderstanding of m//g as a failure to consider greediness. Read about that, using perldoc perlretut or your own choice of reference, and note the limit on greed (in the form of a "?" after each "*") in the code below.

    #!/usr/bin/perl
    use strict;
    use warnings;
    #829607

    =head

    The idea was to get two lines like:
    foo, <anystring 1 />
    bar, <anystring 2 />

    =cut

    my $string = "<name=\"foo\"><anystring 1 /></name><name=\"bar\"><anystring 2 /></name>";
    while ($string =~ m/<name=\"(.*?)\">(.*?)<\/name>/g) {
        print "$1, $2\n";
    }

    Output:

    ww@GIG:~/pl_test$ perl 829607.pl
    foo, <anystring 1 />
    bar, <anystring 2 />
    ww@GIG:~/pl_test$
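To make the effect of greediness concrete, here is a small sketch that runs the greedy and the non-greedy pattern over the same string and counts the matches. The helper all_captures and the variable names are illustrative only; the patterns are the ones discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: applies a compiled pattern with /g in list context,
# which returns every capture group from every match as one flat list.
sub all_captures {
    my ($re, $string) = @_;
    my @caps = $string =~ /$re/g;
    return @caps;
}

my $string = '<name="foo"><anystring 1 /></name><name="bar"><anystring 2 /></name>';

my $greedy = qr{<name="(.*)">(.*)</name>};     # .* takes as much as possible
my $lazy   = qr{<name="(.*?)">(.*?)</name>};   # .*? takes as little as possible

for my $case ([greedy => $greedy], [lazy => $lazy]) {
    my ($label, $re) = @$case;
    my @caps = all_captures($re, $string);
    # Each match contributes two captures, so halve the count.
    printf "%s: %d match(es)\n", $label, scalar(@caps) / 2;
    print "  [$_]\n" for @caps;
}
```

The greedy pattern finds a single match whose first capture swallows everything up to the last "> in the string, while the non-greedy one finds two matches, which is what /g then iterates over.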
Re: Can't understand why /g does not work as I expect
by Anonymous Monk on Mar 19, 2010 at 13:09 UTC
    To see what is going on, use very short sample input and
    use re 'debug';
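A minimal sketch of that approach, assuming a shortened stand-in for the original input (the tag name n and the helper greedy_capture are made up for this example). The trace goes to STDERR, and the pragma is lexically scoped, so it only affects the regex inside the sub.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what the greedy group captures, while
# "use re 'debug'" prints the compile and match trace to STDERR.
sub greedy_capture {
    my ($string) = @_;
    use re 'debug';    # applies only to regexes compiled in this block
    return $string =~ m{<n="(.*)">} ? $1 : undef;
}

# Deliberately tiny input so the trace stays readable.
my $short = '<n="a">x</n><n="b">y</n>';
print greedy_capture($short), "\n";
```

In the trace you should be able to watch .* run to the end of the string and then back off until "> can match.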