in reply to Pattern match for split() - need match but not match syntax

As long as you have to think about regexes anyway, you can use a regex that parses your text at the same time that it's splitting it.

    my $LINE_PATTERN = qr{
        ([^:]+)     # capture everything before ...
        :\s*        # the colon and any newline or other whitespace,
        ([^\n]+)    # then capture everything before
        \n          # the next newline
    }msx;

    my $app_text = "one:partridge\ntwo:\nturtle doves\nthree:french hens\n";

    while ($app_text =~ /$LINE_PATTERN/g) {
        print "$1: $2\n";
    }

If you were planning to put the fields in a hash, you can do it all at once:

    my %value_of = $app_text =~ /$LINE_PATTERN/g;

    while (my ($field, $value) = each %value_of) {
        print "$field: $value\n";
    }
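
The hash assignment works because a /g match in list context returns every capture from every match as one flat list, so the two capture groups alternate as field, value. Here is a minimal sketch of just that behavior; the sample text and the simplified pattern are made up for illustration and are not the $LINE_PATTERN above:

```perl
use strict;
use warnings;

# Hypothetical sample text, one "field:value" pair per line.
my $text = "one:partridge\ntwo:turtle doves\n";

# In list context, a /g match returns all captures from all matches
# as a single flat list: (field, value, field, value, ...).
my @pairs = $text =~ /([^:\n]+):([^\n]+)\n/g;

print join('|', @pairs), "\n";

# Assigning that flat list to a hash pairs the items up as key => value.
my %value_of = @pairs;

# Sort the keys for a stable order; hash order is not predictable.
print "$_ => $value_of{$_}\n" for sort keys %value_of;
```

Two things to keep in mind: each (like keys) returns entries in no particular order, and if a field name appears twice in the text, the later value silently replaces the earlier one.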

Re^2: Pattern match for split() - need match but not match syntax
by johngg (Canon) on May 06, 2008 at 18:15 UTC
    While there's nothing wrong with your $LINE_PATTERN regex, I think it would be simpler to keep the record processing and the field/value processing separate. To my eye it looks tidier and easier to maintain, but others may disagree.

        use strict;
        use warnings;

        use Data::Dumper;

        my $app_text = qq{one:partridge\ntwo:\nturtle doves\nthree:french hens\n};

        my %fvPairs =
            map { split m{:\n?} }
            map { split m{(?<!:)\n} }
            $app_text;

        print Data::Dumper->Dumpxs( [ \ %fvPairs], [ q{*fvPairs} ] );

    produces ...

        %fvPairs = (
                     'three' => 'french hens',
                     'one' => 'partridge',
                     'two' => 'turtle doves'
                   );
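
    The chained map calls read right to left: the text is first split into records, then each record into a field and a value. Unrolled into two explicit steps, the same splits might look roughly like the sketch below. The intermediate array @records is an illustrative name, not part of the code above:

    ```perl
    use strict;
    use warnings;

    my $app_text = qq{one:partridge\ntwo:\nturtle doves\nthree:french hens\n};

    # Step 1: break the text into records at each newline that does NOT
    # directly follow a colon, so "two:\nturtle doves" stays in one piece.
    # (@records is a hypothetical name introduced for this sketch.)
    my @records = split m{(?<!:)\n}, $app_text;

    # Step 2: break each record into field and value at the colon,
    # swallowing the newline that may follow it.
    my %fvPairs = map { split m{:\n?}, $_ } @records;

    print scalar(@records), " records\n";
    print "$_ => $fvPairs{$_}\n" for sort keys %fvPairs;
    ```

    Keeping @records around also makes it easy to inspect the intermediate result if a record turns out to be malformed.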

    Cheers,

    JohnGG