Hi all, I found this bit of code and got all excited:
sub bool_to_regexp {
    local($query) = @_;

    # Sanity check: query must not contain unescaped "/"!
    $query =~ s|^/|\\/|;
    $query =~ s|([^\\])/|$1\\/|g;    # keep the character before the "/"

    # boolean expression (or single word)
    local($not, $join);
    #local($qrycmd) = "next unless (";
    local($qrycmd) = "";
    # $query =~ tr/A-Z/a-z/;
    $query =~ s/\(/ ( /g;    # make sure brackets are separate "words"
    $query =~ s/\)/ ) /g;
    $query =~ s/["'`]//g;    # quotes don't work (yet?)
    # for (split(/[+ \t]+/, $query)) {
    # Splitting on "+" is bad for queries like "C++"!
    $query =~ s/\+/\\+/g;
    for (split(/[ \t]+/, $query)) {
        # for each "word", do ...
        next if /^$/;
        if ($_ eq ")")   { $qrycmd .= $_; next; }
        if ($_ eq "(")   { $qrycmd .= "$join$_"; $join = ""; next; }
        if ($_ eq "NOT") { $not = "!"; next; }
        if ($_ eq "OR")  { $join = " || "; next; }
        if ($_ eq "AND") { $join = " && "; next; }
        if (/\*/) { s/\*/\\w*/g; }
        # match word boundaries only if fully alphanumeric
        # (for queries like "C++")
        elsif (/^\w+$/) { s/(.*)/\\b$1\\b/; }
        $qrycmd .= "$join $not/$_/$caseSensitivity";
        $join = " && ";    # default to AND joins
        $not = "";
    }
    $qrycmd .= "";
}
What I'd like to do is use this function in a command that scans through all the lines in a text file and returns the lines that match the boolean condition I've entered as an argument. So, using the function above, I have:
my $req = bool_to_regexp($ARGV[0]);
print "using $req ...\n";
open (FILE, "stuff") || die "$!";
while(<FILE>) {
    if ($req) { print $_; }
}
close FILE;
but the if ($req) test doesn't work (it just prints every line), so I assumed that, since $req contains the regexp, I'd need to eval it, like so:
my $req = bool_to_regexp($ARGV[0]);
print "using $req ...\n";
open (FILE, "stuff") || die "$!";
while(<FILE>) {
    eval if ($req) { print $_; }
}
close FILE;
but I get a syntax error. How do I make it so that doing:

myprog.pl "fish AND (chips OR beans)"

will return the lines in "stuff" with the words "fish" and either "chips" or "beans"?

Many thanks

P.S. I know the sub works just fine; it's the eval which I think is the problem.
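
For reference, here is a minimal sketch of one way the eval step could be made to work. It assumes the sub returns Perl expression text such as ` /\bfish\b/ && ( /\bchips\b/ || /\bbeans\b/)`, which is what the code above appears to build for the example query. That text is hard-coded below so the sketch runs on its own. The names $matcher and matching_lines are made up for illustration and are not part of the original code. The idea is that a non-empty string is always true, so if ($req) never runs the regexes. The string has to be compiled into code first, ideally once, outside the loop.

```perl
use strict;
use warnings;

# Expression text of the kind bool_to_regexp() is assumed to return for
# "fish AND (chips OR beans)"; hard-coded so this sketch is self-contained.
my $req = ' /\bfish\b/ && ( /\bchips\b/ || /\bbeans\b/)';

# Compile the text once into an anonymous sub. The regexes inside it
# match against $_, just as they would if written directly in the loop.
my $matcher = eval "sub { $req }";
die "could not compile '$req': $@" unless defined $matcher;

# Hypothetical helper: return the lines for which the compiled test is true.
# grep aliases $_ to each line, which is what the compiled regexes test.
sub matching_lines {
    my ($test, @lines) = @_;
    return grep { $test->() } @lines;
}

my @sample = (
    "fish and chips\n",
    "fish fingers\n",
    "beans on toast\n",
    "fish with beans\n",
);
print matching_lines($matcher, @sample);
```

In the file loop from the post, the same compiled sub could be used as print if $matcher->(); inside while (<FILE>) { ... }. Note that a string eval runs whatever Perl the query turns into, so this approach is probably only reasonable when the query comes from a trusted user.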


In reply to Boolean search by Anonymous Monk
