AF_One has asked for the wisdom of the Perl Monks concerning the following question:

Wise Monks,

I'm having some trouble with a RegEx that matches within a <coordinates> (KML) tag...

$kmlfile = "<coordinates>1.0001,2.0002,0 3.0003,4.0004,0 ...</coordinates>";
if ($kmlfile =~ m{<coordinates>(.*)</coordinates>}g) {
    $coordinates = $1;
    while ($coordinates =~ m/\G(\d*\.\d*,\d*\.\d*),0*/s*/gs) {
        push @coords, $1;
    }
}

The only problem with this is that @coords only gets the first pair (1.0001,2.0002).

How do I populate each element with pairs as shown below? I thought that the \G operator would hold the previous search's last position and start from there on the next iteration to fill the array.

$coords[0] = 1.0001,2.0002;
$coords[1] = 3.0003,4.0004;
...

Your insight is most appreciated.

Replies are listed 'Best First'.
Re: \G inline RegEx operator
by almut (Canon) on Jun 03, 2008 at 22:42 UTC
    m/\G(\d*\.\d*,\d*\.\d*),0*s*/gs
                              ^

    I think you just have a typo in your regex: s should be \s

      Thanks almut, I lost the "\" in translation, but its presence still doesn't solve my problem...

      Highest Regards

        Hm, works fine for me... i.e. when I add

        ...
        use Data::Dumper;
        print Dumper \@coords;

        to your code as is (but with the \s fixed), I do get

        $VAR1 = [
                  '1.0001,2.0002',
                  '3.0003,4.0004'
                ];
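
        For reference, a minimal self-contained sketch of the loop with the \s fix applied. It assumes the sample string from the question, cut down to two triples. With a bare s* the first match ends right before the space, so the next \G attempt starts on whitespace, fails, and the loop stops after one pair.

```perl
use strict;
use warnings;

# Sample data from the question, reduced to two triples (an assumption).
my $kmlfile = "<coordinates>1.0001,2.0002,0 3.0003,4.0004,0</coordinates>";

my @coords;
if ($kmlfile =~ m{<coordinates>(.*?)</coordinates>}s) {
    my $coordinates = $1;
    # In scalar context each /g match resumes at pos($coordinates);
    # \G anchors the next attempt to exactly that position.
    # The trailing \s* consumes the separator so \G lands on the next pair.
    while ($coordinates =~ m/\G(\d*\.\d*,\d*\.\d*),0*\s*/g) {
        push @coords, $1;
    }
}

print "$_\n" for @coords;
```

        The loop ends when \G sits at the end of the string and no further pair can be matched there.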
Re: \G inline RegEx operator
by johngg (Canon) on Jun 03, 2008 at 23:17 UTC
    I don't think that the \G is really necessary.

    use strict;
    use warnings;

    use Data::Dumper;

    my $kmlfile = <<'EOD';
    <coordinates>1.0001,2.0002,0 3.0003,4.0004,0 5.0005,6.0006,0 7.0007,8.0008,0 9.0009,10.0010,0 </coordinates>
    EOD

    my $rxCoord = qr
        {(?x)
            (?<=\s|>)
            (\d*\.\d*,\d*\.\d*)
            (?=,0)
        };

    my @coords = $kmlfile =~ m{$rxCoord}g;

    print Data::Dumper->Dumpxs( [ \ @coords ], [ q{*coords} ] );

    The output.

    @coords = (
                '1.0001,2.0002',
                '3.0003,4.0004',
                '5.0005,6.0006',
                '7.0007,8.0008',
                '9.0009,10.0010'
              );

    I hope this is of use.

    Cheers,

    JohnGG

Re: \G inline RegEx operator
by pc88mxer (Vicar) on Jun 03, 2008 at 23:23 UTC
    Your method is fine, but here's how I would first approach this problem. It's based on some assumptions about how the coordinate data looks, so it may not be usable for you. However, I think the process of developing it is simpler, and perhaps it's even a little more robust.

        if ($kmlfile =~ m{<coordinates>(.*?)</coordinates>}) {
            my @triples = split(' ', $1);
            for (@triples) {
                my ($lat, $long) = split(',', $_);
                push(@coords, [$lat, $long]); # or whatever
            }
        }

    It's a little more robust in that if there is non-conforming data it will be obvious (like one of your coordinates will be undef or contain junk.) Regex approaches have a tendency to just stop giving you results without any hint that there's a problem. That is, when they stop matching you're not sure if it's because there's nothing more to match or if there's a problem with your regex.
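
    If you do stay with a regex, one way to get that hint back is to compare where matching stopped against the length of the string. This is only a sketch: parse_pairs is a hypothetical helper name, and the stricter \d+ pattern is an assumption about the data. The /c modifier keeps pos() at the end of the last successful match when a later match fails, so a stop short of the end means unparsed input is left over.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the matched pairs, the offset where
# matching stopped, and the total length of the input.
sub parse_pairs {
    my ($text) = @_;
    my @pairs;
    # /g with \G walks the string one triple at a time;
    # /c preserves pos() when a match attempt finally fails.
    while ($text =~ m/\G\s*(\d+\.\d+,\d+\.\d+),0\s*/gc) {
        push @pairs, $1;
    }
    my $stopped_at = pos($text) // 0;   # undef if nothing ever matched
    return (\@pairs, $stopped_at, length $text);
}

my ($pairs, $stopped_at, $len) =
    parse_pairs("1.0001,2.0002,0 oops 3.0003,4.0004,0");
warn "unparsed data starting at offset $stopped_at\n" if $stopped_at < $len;
```

    With clean input the stop offset equals the string length, so the warning stays silent.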
Re: \G inline RegEx operator
by AF_One (Novice) on Jun 03, 2008 at 23:34 UTC

    Thanks so much guys, it's great to have this kind of support as a beginner...