rojam74 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks-

I'm wondering if there is a way to match an arbitrary number of subpatterns with a regex and have them available in $1, $2, and so on.

For example, take the string "aaa:bbb:ccc:ddd:eee:"
This string can be just 'aaa:', or it might be many repetitions longer.

How can I get 'aaa:' into $1, 'bbb:' into $2, 'ccc:' into $3, etc?

I know I can capture the whole pattern and split on the ':', but am curious if this is possible using ()'s in a regex. Thanks!

Replies are listed 'Best First'.
Re: Matching and making accesible an arbitrary number of subpatterns w/a regex?
by kwaping (Priest) on Aug 03, 2006 at 19:12 UTC
    This code is way overkill compared to doing a simple split, but anyway, I just wanted to see if I could do it.
    my $string = 'aaa:bbb:ccc:ddd:eee:';
    my $chunk  = '([^:]+:)';
    my $colons =()= $string =~ /:/g;
    my $regex  = $chunk x $colons;
    $string =~ /$regex/;
    print "$1,$2,$3,$4,$5";

    ---
    It's all fine and dandy until someone has to look at the code.
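
    A minimal sketch of the same idea that does not hard-code $1 through $5: it assumes every field is non-empty and colon-terminated, anchors the generated pattern, and reads the captures from the list a match returns in list context. The names $count and @fields are illustrative and not from the post above.

```perl
use strict;
use warnings;

my $string = 'aaa:bbb:ccc:ddd:eee:';

# Each colon terminates one field, so count the colons.
my $count = () = $string =~ /:/g;

# One capturing group per field, anchored at both ends.
my $regex = '^' . ('([^:]+:)' x $count) . '$';

# In list context a successful match returns ($1, $2, ..., $N).
my @fields = $string =~ /$regex/;

print join(',', @fields), "\n";   # aaa:,bbb:,ccc:,ddd:,eee:
```

    Here $fields[0] stands in for $1, $fields[1] for $2, and so on, however many groups the pattern ends up with.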
Re: Matching and making accesible an arbitrary number of subpatterns w/a regex?
by Velaki (Chaplain) on Aug 03, 2006 at 17:31 UTC

    Splitting on the colon might be your best bet if the nature of your data is field-delimited, but you should also be able to achieve the same with:

    my $src_string = 'aaa:bbb:ccc:ddd:eee:';
    my @results = $src_string =~ /([^:]*):/g;

    Hope this helps,
    -v.

    "Perl. There is no substitute."

      That does not work when the string doesn't contain any colon:

      my $src_string = 'aaa';
      my @results = $src_string =~ /([^:]*):/g;
      use Data::Dumper;
      print Dumper \@results;
      __END__
      $VAR1 = [];

      --
      David Serrano

        It seems to me from the OP's description that the string will always end in a colon. If not, add one:

        my $src_string = 'aaa:bbb:ccc:ddd:eee';
        my @results = "$src_string:" =~ /([^:]*):/g;
      Splitting on the colon, you say?
      my $src_string = 'aaa:bbb:ccc:ddd:eee:';
      my @results = split /:/, $src_string;
      pop @results;

      Update: Oops, I missed the last line of the OP. Well, neither solution populates $1, $2, $3, so I guess neither answers the OP.
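
      To illustrate the point in that update, here is a small sketch (the variable $last is only for illustration): after a list-context /g match, the array holds every capture, while $1 is left holding only the capture from the last successful match.

```perl
use strict;
use warnings;

my $src_string = 'aaa:bbb:ccc:';

# List context with /g: every capture from every match lands in @results.
my @results = $src_string =~ /([^:]*):/g;

# $1 reflects only the final successful match, not the first field.
my $last = $1;

print scalar(@results), "\n";   # number of fields found
print "$results[0]\n";          # the value one might have expected in $1
print "$last\n";                # what $1 actually holds
```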

Re: Matching and making accesible an arbitrary number of subpatterns w/a regex?
by Hue-Bond (Priest) on Aug 03, 2006 at 17:33 UTC

    If you can cope with an array, something as simple as /[^:]+/g should do the trick.

    my $c='aaa:bbb';
    my @arr = $c =~ /[^:]+/g;
    use Data::Dumper;
    print Dumper \@arr;
    __END__
    $VAR1 = [ 'aaa', 'bbb' ];

    ---

    my $c='aaa:bbb:foo:bar';
    my @arr = $c =~ /[^:]+/g;
    use Data::Dumper;
    print Dumper \@arr;
    __END__
    $VAR1 = [ 'aaa', 'bbb', 'foo', 'bar' ];

    Update: Simplified the regex, as per Sidhekin's suggestion.

    --
    David Serrano

      When given 'aaa::bbb', your regexp only returns two items.
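
      A short sketch of that difference, with illustrative variable names: matching runs of non-colon characters skips the empty field, whereas split keeps it.

```perl
use strict;
use warnings;

my $c = 'aaa::bbb';

# [^:]+ needs at least one character, so the empty middle field is skipped.
my @by_match = $c =~ /[^:]+/g;

# split keeps interior empty fields (it only drops trailing empty ones).
my @by_split = split /:/, $c;

print scalar(@by_match), "\n";   # 2
print scalar(@by_split), "\n";   # 3
```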
Re: Matching and making accesible an arbitrary number of subpatterns w/a regex?
by Skeeve (Parson) on Aug 03, 2006 at 22:45 UTC

    So far only one reply (kwaping's) has correctly answered the OP's question.

    Here is my solution to it:

    my $string = 'aaa:bbb:ccc:ddd:eee:';
    my $regex= $string;
    $regex=~ s/[^:]*:/([^:]*:)/g;
    $string =~ /$regex/;
    print "$1,$2,$3,$4,$5";


    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
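
    Since the number of groups is not known in advance, one way to walk $1 through $N without naming each variable is to use the built-in offset arrays @- and @+. This is only a sketch that assumes a colon-terminated string; @groups is an illustrative name.

```perl
use strict;
use warnings;

my $string = 'aaa:bbb:ccc:';

# Turn every field of the string into one capturing group.
(my $regex = $string) =~ s/[^:]*:/([^:]*:)/g;

my @groups;
if ($string =~ /\A$regex\z/) {
    # $#+ is the number of capture groups in the last successful match;
    # $-[$n] and $+[$n] are the start and end offsets of group $n.
    for my $n (1 .. $#+) {
        push @groups, substr($string, $-[$n], $+[$n] - $-[$n]);
    }
}

print "$_\n" for @groups;   # one line each: aaa: bbb: ccc:
```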
Re: Matching and making accesible an arbitrary number of subpatterns w/a regex?
by sh1tn (Priest) on Aug 03, 2006 at 21:08 UTC
    non-greedy:
    $_ = "aaa:bbb:ccc:ddd:eee:";
    @_ = $_ =~ /(.+?:)/g;
    print "@_"
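
    One caveat, sketched below with illustrative names: because .+? must consume at least one character, an empty field gets folded into the following chunk, whereas [^:]*: keeps it as a chunk of its own.

```perl
use strict;
use warnings;

my $s = 'aaa::bbb:';

# .+? needs at least one character, so the lone ':' is absorbed into the next chunk.
my @lazy = $s =~ /(.+?:)/g;      # ('aaa:', ':bbb:')

# [^:]* may match nothing, so the empty field survives as ':'.
my @negated = $s =~ /([^:]*:)/g; # ('aaa:', ':', 'bbb:')

print join('|', @lazy), "\n";
print join('|', @negated), "\n";
```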