rsiedl has asked for the wisdom of the Perl Monks concerning the following question:

hi monks,

Can anyone tell me how I could get the separators that a split matched, in the order in which they were matched?
Example:
#!/usr/bin/perl
use strict;
use warnings;
my $blah = "reagen OR siedl OR smith AND jack";
my (@names) = split(/\sAND\s|\sOR\s/, $blah);
print $_, "\n" foreach (@names);
exit;
I would like to have the @names array as well as another array like:
my @split_values = ('OR', 'OR', 'AND');
cheers,
reagen

Update: "If the PATTERN contains parentheses, additional array elements are created from each matching substring in the delimiter." Should have read the man page first, doh!

Re: using split
by brian_d_foy (Abbot) on Nov 17, 2006 at 08:09 UTC

    You can use the separator retention mode of split by having memory parens in the split pattern:

    my @names = split /\s+(AND|OR)\s+/, $blah;

    In @names, you'll get the separator as every other element, and it's up to you to figure out how to deal with that according to your task.

    Note that I adjusted your regex a bit. Since the whitespace is common to both alternatives, I put it outside of the alternation (those parens came in handy for grouping!), and I added the + one-or-more quantifier in case there are extra spaces next to each other.
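
    One way to deal with the interleaved result, as a minimal sketch: the names sit at the even indexes and the separators at the odd indexes, so two array slices pull them apart. The variable @parts is introduced here only for illustration, and the sketch assumes the string does not begin with a separator.

    ```perl
    use strict;
    use warnings;

    my $blah = "reagen OR siedl OR smith AND jack";

    # The capturing group makes split return each separator between the fields.
    my @parts = split /\s+(AND|OR)\s+/, $blah;

    # Even indexes hold the names, odd indexes hold the separators.
    my @names        = @parts[ grep { $_ % 2 == 0 } 0 .. $#parts ];
    my @split_values = @parts[ grep { $_ % 2 == 1 } 0 .. $#parts ];

    print "names:      @names\n";
    print "separators: @split_values\n";
    ```

    For the sample string this prints "reagen siedl smith jack" for the names and "OR OR AND" for the separators.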

    Good luck :)

    --
    brian d foy <brian@stonehenge.com>
    Subscribe to The Perl Review
Re: using split
by rinceWind (Monsignor) on Nov 17, 2006 at 11:24 UTC

    I would seriously recommend abandoning split for what you are doing here. Although it's possible to use split to achieve your immediate needs, it will become hard to enhance the syntax later on.

    What you are doing here is designing a grammar for boolean expressions. Have you thought about operator precedence? Which is higher, AND or OR? Can you parenthesise the expression if you want to override the precedence?

    Parse::RecDescent is your friend here. Grammars might appear frightening to the uninitiated at first, but they are not that difficult, and are very powerful and flexible.

    You could use regexp captures to pick up your tokens from the string, but before long the regexp will become unwieldy, almost as much as split. And you will have to do a lot of processing afterwards to decode the results of the pattern match.
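
    To make the precedence question concrete, here is a rough hand-written recursive-descent sketch in plain Perl. It is not Parse::RecDescent, only an illustration of the same idea. It assumes that AND binds tighter than OR and that parentheses override that, neither of which the original question specifies. All sub names (parse_expression, parse_or, parse_and, parse_primary, to_string) are made up for this example.

    ```perl
    use strict;
    use warnings;

    # Entry point: tokenize, parse, and reject trailing garbage.
    sub parse_expression {
        my ($text) = @_;
        # Tokens are single parentheses or runs of non-space, non-paren characters.
        my @tokens = $text =~ /[()]|[^\s()]+/g;
        my $tree   = parse_or( \@tokens );
        die "Unexpected token '$tokens[0]'\n" if @tokens;
        return $tree;
    }

    # or_expr := and_expr ( 'OR' and_expr )*
    sub parse_or {
        my ($tokens) = @_;
        my $left = parse_and($tokens);
        while ( @$tokens && $tokens->[0] eq 'OR' ) {
            shift @$tokens;
            $left = [ 'OR', $left, parse_and($tokens) ];
        }
        return $left;
    }

    # and_expr := primary ( 'AND' primary )*
    # Assumption: AND has higher precedence than OR.
    sub parse_and {
        my ($tokens) = @_;
        my $left = parse_primary($tokens);
        while ( @$tokens && $tokens->[0] eq 'AND' ) {
            shift @$tokens;
            $left = [ 'AND', $left, parse_primary($tokens) ];
        }
        return $left;
    }

    # primary := NAME | '(' or_expr ')'
    sub parse_primary {
        my ($tokens) = @_;
        my $token = shift @$tokens;
        die "Unexpected end of input\n" unless defined $token;
        if ( $token eq '(' ) {
            my $inner = parse_or($tokens);
            my $close = shift @$tokens;
            die "Missing ')'\n" unless defined $close && $close eq ')';
            return $inner;
        }
        die "Unexpected token '$token'\n" if $token =~ /\A(?:AND|OR|\))\z/;
        return $token;
    }

    # Render the tree with explicit parentheses so the grouping is visible.
    sub to_string {
        my ($node) = @_;
        return $node unless ref $node;
        my ( $op, $left, $right ) = @$node;
        return '(' . to_string($left) . " $op " . to_string($right) . ')';
    }

    print to_string( parse_expression("reagen OR siedl OR smith AND jack") ), "\n";
    ```

    Under those assumptions the sample string groups as ((reagen OR siedl) OR (smith AND jack)), which a flat split cannot express.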

    My $0.02, for what it's worth.

    --

    Oh Lord, won’t you burn me a Knoppix CD ?
    My friends all rate Windows, I must disagree.
    Your powers of persuasion will set them all free,
    So oh Lord, won’t you burn me a Knoppix CD ?
    (Misquoting Janis Joplin)

Re: using split
by jwkrahn (Abbot) on Nov 17, 2006 at 10:40 UTC
    You can get what you want like this:
    my $blah = 'reagen OR siedl OR smith AND jack';
    my ( @names, @split_values );
    # Split on whitespace, then route each word to the matching array
    push @{ /\A(?:OR|AND)\z/ ? \@split_values : \@names }, $_
        for split ' ', $blah;
Re: using split
by smokemachine (Hermit) on Nov 17, 2006 at 11:42 UTC
    Don't use split for the separators, use this:
    perl -e 'my $blah = "reagen OR siedl OR smith AND jack";
        @names = split /\s+(?:AND|OR)\s+/, $blah;
        @split_values = $blah =~ /\s+(AND|OR)\s+/g'