in reply to Regexing my Search Terms

Here's a cleaner, less-buggy version of your code to get you going. You'll want to do the usual parameter sanitization, depending on your specific needs.
    use strict;
    use warnings;

    my (@Keys, @KeysNeed, @KeysAvoid);
    my $keywords = qq(word +"my phrase" -"my phrase's mom" +other keywords +);

    # Consume one term per pass. The loop ends as soon as nothing matches,
    # so leftover whitespace (or a lone "0") can't stall or cut short the parse.
    while ($keywords =~ s/^\s*([+-]*)(((["'])(.*?)\4)|\S+)\s*//) {
        my $cat     = $1;
        my $keyword = $2;

        # Strip special chars from the ends
        $keyword =~ s/^[^\w\s]+//;
        $keyword =~ s/[^\w\s]+$//;
        next unless length $keyword;

        if    ($cat =~ /\+/) { push @KeysNeed,  $keyword; }
        elsif ($cat =~ /-/)  { push @KeysAvoid, $keyword; }
        else                 { push @Keys,      $keyword; }
    }

    print "KEYS NEEDED: @KeysNeed\n";
    print "KEYS TO AVOID: @KeysAvoid\n";
    print "KEYS: @Keys\n";

    # prints:
    # KEYS NEEDED: my phrase other
    # KEYS TO AVOID: my phrase's mom
    # KEYS: word keywords
Here's a quick explanation of my changes to your regexp:
s/^\s*([+-]*)(((["'])(.*?)\4)|\S+)\s*//
• Pull keywords directly from the start of the var.
• Ignore leading whitespace.
• Pull any number of + and - signs. (Note: this is a bit odd. If someone types +-+-+ before a keyword, the whole run gets pulled. The code above tests for + first, so + takes precedence over -.)
• Next is the fun one: if the term starts with a quote char, pull everything up to the next *matching* quote char (that's what the \4 backreference is for). In your code, you look for either char at both the beginning and the end, so "my phrase's mom" would get cut off at "my phrase', which is not what you want.
• If there is no quote char (or no matching closing quote), pull all non-whitespace chars until you hit whitespace or the end of the string.
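
If you want to see the quote matching and the sign handling in isolation, here's a small standalone sketch. The variable names ($term, $loose, $strict, $signs, $kind) and the "loose" pattern are mine, made up for illustration. The loose pattern is only my guess at the kind of thing that goes wrong, not a copy of your original regexp.

```perl
use strict;
use warnings;

my $term = q("my phrase's mom" rest);

# Loose: either quote char may close the phrase, so the apostrophe ends it early.
my ($loose) = $term =~ /^(["'].*?["'])/;

# Strict: \2 must repeat whichever quote char opened the phrase.
my ($strict) = $term =~ /^((["']).*?\2)/;

print "loose:  $loose\n";    # loose:  "my phrase'
print "strict: $strict\n";   # strict: "my phrase's mom"

# A run of mixed signs is accepted whole; testing for "+" first lets it win.
my ($signs) = '+-+word' =~ /^([+-]*)/;
my $kind = $signs =~ /\+/ ? 'need'
         : $signs =~ /-/  ? 'avoid'
         :                  'plain';

print "signs:  $signs -> $kind\n";   # signs:  +-+ -> need
```

If you'd rather reject mixed runs like +-+, one option is to change [+-]* to [+-]? so at most one sign is taken, though any extra signs would then end up in the keyword and get stripped as special chars.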