EchoAngel has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, this is my code:
my @DATA = (
    ' date : "April 27, 2004";',
    ' comment : "Copyright (c) 2002 FoodNation Technolo, Inc. ";',
    ' power_watts : "1pC";',
    ' fruits_vegs_food (1.0, pound);'
);
my $inline;
foreach $inline (@DATA) {
    if ($inline =~ m/\s*(\w+)\s*:*\"*\(*\s*([\S\s]+)\"*\)*;*/) {
        print "FIRST TERM - $1\n";
        print "SECON TERM - $2\n";
    }
}
This isn't the output I expected. This is the output:
FIRST TERM - date
SECON TERM - "April 27, 2004";
FIRST TERM - comment
SECON TERM - "Copyright (c) 2002 FoodNation Technolo, Inc. ";
FIRST TERM - power_watts
SECON TERM - "1pC";
FIRST TERM - fruits_vegs_food
SECON TERM - 1.0, pound);
What I really wanted was:
FIRST TERM - date
SECON TERM - April 27, 2004
FIRST TERM - comment
SECON TERM - Copyright (c) 2002 FoodNation Technolo, Inc.
FIRST TERM - power_watts
SECON TERM - 1pC
FIRST TERM - fruits_vegs_food
SECON TERM - 1.0, pound
Do you know what's wrong with my matching expression?

Replies are listed 'Best First'.
Re: Perl Problems with Matching Expression Patterns
by Eimi Metamorphoumai (Deacon) on Jan 12, 2005 at 21:11 UTC
    At the beginning of each second item, you're looking for an optional quote, an optional (, and then optional whitespace. But in your data the space comes first, so \"* and \(* match nothing, \s* takes the space, and the quote goes into your $2. Secondly, [\S\s] matches every character, so it behaves like . (with /s), and because + is greedy it consumes everything to the end of the line, leaving nothing for your trailing optional parts. If I had to write it, here's how I would do it.
    use strict;
    use warnings;

    while (<DATA>) {
        if (/^\s*(\w+)[\s:"(]+(.+?)[\s");]*$/) {
            print "FIRST TERM - $1\n";
            print "SECON TERM - $2\n";
        }
    }
    __DATA__
     date : "April 27, 2004";
     comment : "Copyright (c) 2002 FoodNation Technolo, Inc. ";
     power_watts : "1pC";
     fruits_vegs_food (1.0, pound);
    that is, you start with optional whitespace, then your first term, then some delimiter, then the rest.
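
To see the greedy versus non-greedy difference in isolation, here is a minimal sketch on a single line of the data. The two patterns are cut-down variants written for illustration only (they are not either poster's full regex), and the variable names are made up.

```perl
use strict;
use warnings;

my $line = ' date : "April 27, 2004";';

# Greedy: [\S\s]+ runs to the end of the string, so the optional
# trailing "* and ;* have nothing left to match.
my ($greedy) = $line =~ /:\s*"*([\S\s]+)"*;*/;

# Non-greedy and anchored: .+? takes as little as possible, and the $
# forces the trailing character class to absorb the quote and semicolon.
my ($lazy) = $line =~ /:\s*"*(.+?)[\s");]*$/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The first capture still carries the closing quote and semicolon; the second stops right after the year.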
      wow, you're right, \S is eating everything, thanks
Re: Perl Problems with Matching Expression Patterns
by Zaxo (Archbishop) on Jan 12, 2005 at 21:15 UTC

    You have records with two fields separated by a colon. You can just split on that to get the arrangement you want. Putting that in a hash is attractive, but doesn't preserve order, so let's make it an array of arrays instead. You probably want to trim leading and trailing whitespace, too.

    for (@DATA) { # this transforms @DATA, destructive
        $_ = [map {/^\s*(.*?)\s*$/} split ':', $_, 2];
    }

    The split on colon does the field separation, then the mapped match trims leading and trailing space. Square brackets make the result an array reference.

    To print the new @DATA in the form you want,

    for (@DATA) {
        printf "FIRST TERM - %s\nSECOND TERM - %s\n", @$_;
    }

    Update: holli++, I did miss that. We do see what we expect, don't we? Second try,

    for (@DATA) {
        $_ = join ' ', split; # normalize whitespace
        $_ = [split /[:\s]/, $_, 2];
    }

    That's simpler to read, too.

    After Compline,
    Zaxo

      you missed the line '  fruits_vegs_food (1.0, pound);' which has no colon.
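
The point about the colon-less line can be checked with a small sketch. It applies the same split-and-trim expression from the first attempt to copies of two of the sample lines and counts the fields that come back; @lines and @counts are names invented for this illustration.

```perl
use strict;
use warnings;

my @lines = (
    ' power_watts : "1pC";',
    ' fruits_vegs_food (1.0, pound);',
);

my @counts;
for my $line (@lines) {
    # Split on the first colon only (limit 2), then trim each field.
    my @fields = map { /^\s*(.*?)\s*$/ } split ':', $line, 2;
    push @counts, scalar @fields;
    printf "%d field(s): %s\n", scalar @fields, join(' | ', @fields);
}
```

The line without a colon comes back as a single field, so there is no second term to print for it.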
Re: Perl Problems with Matching Expression Patterns
by holli (Abbot) on Jan 12, 2005 at 21:33 UTC
    my @DATA = (
        ' date : "April 27, 2004";',
        ' comment : "Copyright (c) 2002 FoodNation Technolo, Inc. ";',
        ' power_watts : "1pC";',
        ' fruits_vegs_food (1.0, pound);'
    );
    my $inline;
    foreach $inline (@DATA) {
        my ($f, $s);
        if ( $inline =~ m/\s+(\w+)\s+:\s+"?([^"]+)"?/ ) {
            print "FIRST TERM - $1\n";
            print "SECON TERM - $2\n";
        }
        elsif ( $inline =~ m/\s+(\w+)\s+\(([^\)]+)\)/ ) {
            print "FIRST TERM - $1\n";
            print "SECON TERM - $2\n";
        }
    }
    hint: never, never use ".*" when you don't need to, because ".*" also matches "" (the empty string)
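
A quick way to convince yourself of the hint, as a minimal sketch with made-up variable names: match an empty string against .* and against .+ and compare.

```perl
use strict;
use warnings;

my $empty = '';

# .* accepts zero characters, so it succeeds on the empty string.
my $star = ($empty =~ /^(.*)$/) ? 1 : 0;

# .+ needs at least one character, so it fails here.
my $plus = ($empty =~ /^(.+)$/) ? 1 : 0;

print "'.*' matched the empty string: $star\n";
print "'.+' matched the empty string: $plus\n";
```

If a field must not be empty, .+ or a negated character class such as [^"]+ states that requirement directly.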