throop has asked for the wisdom of the Perl Monks concerning the following question:

Brethren, given a string with:
* an optional dot.separated number at front
* a repeated set of n As and m Bs (n > 0, m >= 0), comma separated
I want to pull off, in a single list, the number, then each of the A and B fields. I.e.,
    1.2.3AAABB,AAB,AA,AAABBB should give qw(1.2.3 AAA BB AA B AA AAA BBB)
    AABB,AAAB,ABBB should give qw(AA BB AAA B A BBB)
I've been doing it with a three-step process: pull off the number, pull off the As and Bs, then clean up the nulls.
    use strict;

    sub chip {
        local($_) = @_;
        if (/^(\d+ (?: \.\d+)* )? (.+) /x) {
            my($front) = $1;
            if (my(@rest) = $2 =~ /(A+) (B*) ,? /gx) {
                unshift(@rest, $front);
                grep {$_} @rest
            }
        }
    }
(I've omitted the extra code that does the error-checking etc.) I want to do a regular match on the numbers, then a greedy match on the rest. I don't think what I did is particularly clear or clean. Is there a way to do this in a single, comprehensible regex, without using something like grep to clean up the null strings?

throop

Replies are listed 'Best First'.
Re: g matching after a single regular match
by GrandFather (Saint) on Feb 01, 2007 at 21:25 UTC
    use strict;
    use warnings;

    my @strs = ('1.2.3AAABB,AAB,AA,AAABBB', 'AABB,AAAB,ABBB');

    for (@strs) {
        my @parts = /((?:\d+\.?)+ | A+ | B+)/gx;
        print "qw(", join (' ', @parts), ")\n";
    }

    Prints:

    qw(1.2.3 AAA BB AA B AA AAA BBB)
    qw(AA BB AAA B A BBB)

    Update: removed "while" silliness


    DWIM is Perl's answer to Gödel
Re: g matching after a single regular match
by ikegami (Patriarch) on Feb 01, 2007 at 21:26 UTC

    If you want data extraction (as opposed to data validation)

    @list = /(\d+(?:\.\d+)*|A*|B*)/g;

    Update: The above doesn't work. Fix:

    @list = /(\d+(?:\.\d+)*|A+|B+)/g;
      You could also do the slightly simpler (and slightly more permissive):
      /([\d.]+|A+|B+)/g;
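A minimal sketch of how the three patterns in this post differ. The sub names and the malformed sample `1..2AA` are made up for illustration and are not part of the original reply. It shows that the starred pattern lets empty captures into the list (`A*` and `B*` can match nothing), and that the permissive variant accepts a doubled dot that the stricter pattern breaks apart.

```perl
use strict;
use warnings;

# Hypothetical helpers, one per pattern from the post above.
sub extract_star  { return $_[0] =~ /(\d+(?:\.\d+)*|A*|B*)/g }  # first attempt
sub extract_plus  { return $_[0] =~ /(\d+(?:\.\d+)*|A+|B+)/g }  # the fix
sub extract_loose { return $_[0] =~ /([\d.]+|A+|B+)/g }         # permissive variant

# A* and B* can match the empty string, so empty captures creep into the list.
my $empties = grep { $_ eq '' } extract_star('AAB,AA');
print "empty captures from the starred pattern: $empties\n";

print join(' ', extract_plus('1.2.3AAABB,AAB')), "\n";   # 1.2.3 AAA BB AA B

# On a malformed number the two working patterns disagree.
print join(' ', extract_plus('1..2AA')), "\n";    # 1 2 AA
print join(' ', extract_loose('1..2AA')), "\n";   # 1..2 AA
```

If the input has already been validated, the difference does not matter; if not, the stricter pattern at least keeps each captured number well formed.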

      Caution: Contents may have been coded under pressure.
Re: g matching after a single regular match
by AltBlue (Chaplain) on Feb 01, 2007 at 21:26 UTC
    sub chip {
        local $_ = shift;
        s/(?<=\d)(?=A)|,|(?<=A)(?=B)/ /g;
        return split / /;
    }
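The same boundaries can probably be handed to `split` directly, skipping the intermediate substitution. This is only a sketch: `chip_split` is a hypothetical name, and it assumes the input is already well formed, as the sub above does.

```perl
use strict;
use warnings;

# Hypothetical variant of chip(): split directly on the same boundaries
# (digit/A, comma, A/B) instead of first rewriting them to spaces.
sub chip_split {
    my ($str) = @_;
    return split /(?<=\d)(?=A)|,|(?<=A)(?=B)/, $str;
}

for my $str ('1.2.3AAABB,AAB,AA,AAABBB', 'AABB,AAAB,ABBB') {
    print join(' ', chip_split($str)), "\n";
}
```

Working on a lexical copy also sidesteps the `local $_` question entirely.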
Re: g matching after a single regular match
by johngg (Canon) on Feb 01, 2007 at 21:28 UTC
    You should be able to do it with a global match with an alternation in the pattern.

    use strict;
    use warnings;

    my @strings = (
        q{1.2.3AAABB,AAB,AA,AAABBB},
        q{AABB,AAAB,ABBB});

    foreach my $string ( @strings )
    {
        print qq{$string\n};
        my @elems = $string =~ m{ ( \d(?:\.\d)* | A+ | B+ ) }gx;
        print qq{ qw(@elems)\n};
    }

    The output is

    1.2.3AAABB,AAB,AA,AAABBB
     qw(1.2.3 AAA BB AA B AA AAA BBB)
    AABB,AAAB,ABBB
     qw(AA BB AAA B A BBB)

    Cheers,

    JohnGG
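One thing to watch with the pattern in the post above: `\d(?:\.\d)*` matches a single digit per component, which is fine for `1.2.3` but splits a number such as `10.2`. A small sketch of the difference follows; the sub names and the sample `10.2AAB` are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers comparing one-digit and multi-digit components.
sub single_digit { return $_[0] =~ m{ ( \d(?:\.\d)*   | A+ | B+ ) }gx }
sub multi_digit  { return $_[0] =~ m{ ( \d+(?:\.\d+)* | A+ | B+ ) }gx }

print join(' ', single_digit('10.2AAB')), "\n";   # 1 0.2 AA B
print join(' ', multi_digit('10.2AAB')),  "\n";   # 10.2 AA B
```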

Re: g matching after a single regular match
by ikegami (Patriarch) on Feb 01, 2007 at 21:43 UTC

    By the way, local $_; is dangerous (i.e., buggy) in some circumstances: when the caller uses /\G.../ or pos($_), and when $_ is aliased to a tied variable. Use local *_ = ...; or for (...) instead.

    sub func {
        # Any changes to $_ will affect the caller's var.
        local *_ = \$_[0];
        ...
    }

    sub func {
        my ($s) = $_[0];
        # Safe.
        local *_ = \$s;
        ...
    }

    sub func {
        # Any changes to $_ will affect the caller's var.
        for ($_[0]) {
            ...
        }
    }

    sub func {
        for (my $s = $_[0]) {
            # Safe.
            ...
        }
    }
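To make the difference between the aliasing and the copying `for` forms concrete, here is a small sketch. The subs `shout_aliased` and `shout_copy` are hypothetical examples, not code from the thread; both upper-case their argument through `$_`, but only one of them touches the caller's variable.

```perl
use strict;
use warnings;

# Hypothetical sub: $_ is aliased to the caller's variable via $_[0].
sub shout_aliased {
    for ($_[0]) {
        $_ = uc;          # modifies the caller's variable
        return $_;
    }
}

# Hypothetical sub: $_ is aliased to the private copy $s instead.
sub shout_copy {
    for (my $s = $_[0]) {
        $_ = uc;          # modifies only $s
        return $_;
    }
}

my $name = 'abc';
print shout_copy($name), " $name\n";      # ABC abc
print shout_aliased($name), " $name\n";   # ABC ABC
```

Note that the aliased form must be called with a modifiable variable; passing a string literal would fail when the sub assigns to `$_`.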