Thanks, bart, for your reply. Your code is fine, but my actual pattern-matching problem involves 31 (or fewer) pairs instead of the 4 (or fewer) I mentioned in my post; I made that change so I could give an example easily. If I follow your code's pattern, it might turn out to be a bit too long. Is there a shorter version? Thanks again for your help so far. vineet

Re^4: pattern matching
by bart (Canon) on Dec 26, 2006 at 22:47 UTC
    Hmm, I see what you mean: extending the code with more matches soon gets really unwieldy. You need some kind of looping construct.

    Now I wish I could say you could handle this easily with a single pattern, but unfortunately, a repetition modifier around captures doesn't produce the desired results:

    $_ = 'de ad be ef #junk'; /^(\w\w)(?: (\w\w))*/;
    will only retain two captures: in the end, $1 will be 'de', the first capture, and $2 will be 'ef', the last one; the rest will simply have been forgotten.
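
    To see that behaviour in isolation, here is a minimal sketch. The variable names $str, $first and $last are just illustrative choices, not anything from the original code:

```perl
use strict;
use warnings;

my $str = 'de ad be ef #junk';

# A capture group under a quantifier is overwritten on every repetition,
# so only the value from the final repetition survives in $2.
my ($first, $last) = $str =~ /^(\w\w)(?: (\w\w))*/;

print "first: $first, last: $last\n";   # prints "first: de, last: ef"
```

    The pairs 'ad' and 'be' were matched along the way, but nothing holds on to them once the group repeats.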

    There's no way around it: this requires a two-step approach. Step 1: extract the whole run of pairs as a single capture. Step 2: split it into parts.

    1. The first approach is to use split for step 2:
      $_ = 'de ad be ef #junk'; /^(\w\w(?: \w\w)*)/; my @capture = split ' ', $1;
    2. Use //g, either in a loop, or in list context.
      1. //g in list context:
        $_ = 'de ad be ef #junk'; my @capture = /\G(?:^|\ )(\w\w)/g;
      2. A loop with //g in scalar context:
        $_ = 'de ad be ef #junk';
        my @capture;
        while (/\G(?:^|\ )(\w\w)/g) {
            push @capture, $1;
        }

    Be extremely careful with the latter that you don't accidentally cause an endless loop. I did, with

    /(?:^|\G\ )(\w\w)/g
    I'm still not sure why.
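
    For the 31-or-fewer case from the question, the split approach might be capped with a bounded quantifier. This is only a sketch: I'm assuming the input has the same shape as the sample string, and $max_pairs and $extra are names I made up for the illustration.

```perl
use strict;
use warnings;

my $max_pairs = 31;                  # assumed upper limit, taken from the question
my $extra     = $max_pairs - 1;      # repetitions allowed after the first pair
my $str       = 'de ad be ef #junk';

my @capture;
# Step 1: grab the whole run of pairs as one capture, at most $max_pairs of them.
if ($str =~ /^(\w\w(?: \w\w){0,$extra})/) {
    # Step 2: split that run into the individual pairs.
    @capture = split ' ', $1;
}

print scalar(@capture), " pairs: @capture\n";   # prints "4 pairs: de ad be ef"
```

    Note that with this pattern a string holding more than 31 pairs still matches; the extra pairs are simply left out of @capture, so add a check of your own if that should count as an error.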