iradik has asked for the wisdom of the Perl Monks concerning the following question:

i am reposting my question because i asked it very poorly. let me try again
imagine that i have a repeating string:
foo(stuff)tarbarfoo(stuff)barfoo(stuff)tarbar
where (stuff) is just random crap which changes on each repeat and where the alternation of tarbar vs bar is random.
suppose i want to extract each (stuff) from the string where stuff is only in the form foo(stuff)tarbar.
the example code, which is wrong, is the best i can do:
my @array = ($string =~ /foo(.*?)tarbar/gso);
however, this code goes wrong when a foo(stuff)bar comes before a foo(stuff)tarbar: the match starts at the first foo and runs on to the next tarbar, so it treats the text as "foo(stuffbarfoostuff)tarbar". in other words, it loads (stuff)barfoo(stuff) into the list as one item, but i only want the stuffs from foo(stuff)tarbar.
?help?
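
A minimal sketch that reproduces the problem. The sample string is made up for illustration, with AAA, BBB and CCC standing in for the (stuff)s:

```perl
use strict;
use warnings;

# Hypothetical sample data: AAA and CCC are followed by tarbar, BBB only by bar.
my $string = 'fooAAAtarbarfooBBBbarfooCCCtarbar';

# The non-greedy .*? starts right after a foo and stops at the first tarbar it
# can reach, so it runs straight across the "bar" and the next "foo".
my @array = ($string =~ /foo(.*?)tarbar/gs);

print join(', ', @array), "\n";   # AAA, BBBbarfooCCC
```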

Re: repeating patterns 2
by Dominus (Parson) on Mar 30, 2001 at 20:31 UTC
    I wonder if it wouldn't be best to do something like this:
    @foostuffs = split /(?:tar)?bar/, $string;
    This should discard all the bar and tarbar. It will look for tarbar first and discard that if it finds it, and look for bar otherwise.

    Then you just need to remove the foo from the items in @foostuffs:

    for (@foostuffs) { s/^foo// }
      As is being mentioned in the chatterbox, foo may not be present at the start of every item, so
      for (@foostuffs) { s/^foo// }
      should then be
      for (@foostuffs) {
          next unless s/^foo//;   # skip items that did not start with foo
          # process data here
      }
      though I don't think the original question mentioned that case.
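
The split-then-strip idea can be sketched as follows. The sample string and the variable names other than @foostuffs are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data; "junk" is a leading chunk with no foo in front of it.
my $string = 'junkbarfooAAAtarbarfooBBBbarfooCCCtarbar';

# Split on either separator; (?:tar)? lets tarbar be consumed whole when present.
my @foostuffs = split /(?:tar)?bar/, $string;

my @stuffs;
for my $item (@foostuffs) {
    # s/// returns true only if it replaced something, i.e. the item began with foo.
    next unless $item =~ s/^foo//;
    push @stuffs, $item;
}

print "@stuffs\n";   # AAA BBB CCC
```

With this sample the result still contains BBB, which was followed by plain bar, so this sketch on its own does not tell the two separators apart.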
(tye)Re: repeating patterns 2
by tye (Sage) on Mar 30, 2001 at 20:33 UTC
    while( /foo(.*?)(tar)?bar/g ) { next unless $2; ... }

    There might be a way to do something very tricky with zero-width assertions, but I find going the simple route is often much easier to maintain.
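
A minimal sketch of the loop above, wrapped in a hypothetical helper (stuffs_before_tarbar is not a name from the thread) so it can be tried on sample strings:

```perl
use strict;
use warnings;

# Hypothetical helper: collects $1 only when the optional (tar) group matched.
sub stuffs_before_tarbar {
    my ($string) = @_;
    my @found;
    while ( $string =~ /foo(.*?)(tar)?bar/g ) {
        next unless $2;        # $2 is undef when the match ended in plain bar
        push @found, $1;
    }
    return @found;
}

print join( ', ', stuffs_before_tarbar('fooAAAtarbarfooBBBbarfooCCCtarbar') ), "\n";   # AAA, CCC
print join( ', ', stuffs_before_tarbar('fooXfooYtarbar') ), "\n";                      # XfooY
```

The second call shows that the capture can still contain an inner foo, because nothing in the pattern forbids it.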

    Updated the above to be simpler and added the following. The above returns "XfooY" for "fooXfooYtarbar" when just "Y" was wanted. So a simple state machine:

    my( $state, $prev, @match )= ( "no foo" );
    foreach my $tok ( split /(foo|tarbar|bar)/, $string ) {
        if( "foo" eq $tok ) {
            $state= "foo";
            next;
        } elsif( "tarbar" eq $tok ) {
            if( "other" eq $state ) {
                push @match, $prev;
                # and/or do other stuff here
            }
        } elsif( "bar" ne $tok && "foo" eq $state ) {
            $prev= $tok;
            $state= "other";
            next;
        }
        $state= "no foo";
    }
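
As a usage sketch, the same state machine can be wrapped in a sub and run on the two strings discussed in this reply. The name match_foo_tarbar and the sample strings are assumptions for illustration:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the token-by-token state machine.
sub match_foo_tarbar {
    my ($string) = @_;
    my ( $state, $prev, @match ) = ("no foo");
    # The capturing group makes split return the separators as tokens too.
    foreach my $tok ( split /(foo|tarbar|bar)/, $string ) {
        if ( $tok eq "foo" ) {
            $state = "foo";
            next;
        }
        elsif ( $tok eq "tarbar" ) {
            push @match, $prev if $state eq "other";
        }
        elsif ( $tok ne "bar" && $state eq "foo" ) {
            $prev  = $tok;      # text directly after a foo
            $state = "other";
            next;
        }
        $state = "no foo";      # anything else resets the machine
    }
    return @match;
}

print join( ', ', match_foo_tarbar('fooXfooYtarbar') ), "\n";                      # Y
print join( ', ', match_foo_tarbar('fooAAAtarbarfooBBBbarfooCCCtarbar') ), "\n";   # AAA, CCC
```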

            - tye (but my friends call me "Tye")
Re: repeating patterns 2
by arturo (Vicar) on Mar 30, 2001 at 20:43 UTC

    Just to immortalize my chatterbox fumblings, let me suggest this:

    my @matches; foreach ( split "foo", $string) { push @matches, $1 if /(.*)tarbar/; }

    This grabs all the (stuff)s that are followed by "tarbar", each of which is guaranteed not to contain "foo". If you don't want "tarbar"s embedded in the (stuff)s you get, make that a non-greedy match.
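
To illustrate the greedy versus non-greedy difference, here is a small sketch. The sample string is made up so that the last chunk contains two tarbars:

```perl
use strict;
use warnings;

# Hypothetical sample data; the last chunk contains two tarbars.
my $string = 'fooAAAtarbarfooBBBbarfooCtarbarDtarbar';

my ( @greedy, @lazy );
foreach my $chunk ( split /foo/, $string ) {
    push @greedy, $1 if $chunk =~ /(.*)tarbar/;     # runs up to the last tarbar
    push @lazy,   $1 if $chunk =~ /(.*?)tarbar/;    # stops at the first tarbar
}

print "@greedy\n";   # AAA CtarbarD
print "@lazy\n";     # AAA C
```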

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor

Re: repeating patterns 2
by danger (Priest) on Mar 30, 2001 at 20:52 UTC

    If I understand your question correctly, then you'd only want 'stuff1' and 'stuff3' as in the following:

    $_ = 'foofoostuff1tarbarfoostuff2barfoostuff3tarbar';
    my @a = /foo((?:(?!bar|foo).)*?)tarbar/gs;
    print "@a\n";

    But realize that this rules out any 'stuff' that contains 'bar' or 'foo'. The string 'foostuffbartarbar' would *not* give you a match, nor would 'foofoobartarbar', even though in each case there is "stuff" between a 'foo' and a 'tarbar'.
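
A short sketch that checks those cases. The helper name tarbar_stuffs is an assumption for illustration, and the regex is the one from the snippet above:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the lookahead regex.
sub tarbar_stuffs {
    my ($string) = @_;
    # (?!bar|foo) is checked before every captured character, so the captured
    # text can never contain "bar" or "foo".
    my @found = $string =~ /foo((?:(?!bar|foo).)*?)tarbar/gs;
    return @found;
}

my @wanted = tarbar_stuffs('foofoostuff1tarbarfoostuff2barfoostuff3tarbar');
my @none1  = tarbar_stuffs('foostuffbartarbar');
my @none2  = tarbar_stuffs('foofoobartarbar');

print "@wanted\n";                               # stuff1 stuff3
print scalar(@none1), " ", scalar(@none2), "\n"; # 0 0
```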

Re: repeating patterns 2
by mbond (Beadle) on Mar 30, 2001 at 22:15 UTC
    couldn't you just pattern match it using

    something =~ /\bwholeword\bstuff/foobar/

    I'm not on a machine I can play with right now, so it's just a guess ...

    mbond