in reply to Implementing a text filter on some dataset
Cos the problem intrigued me :)
#! perl -slw
use strict;

## Convert query syntax to evalable Perl code
sub buildQuery {
    local $_ = uc shift;
    while( m[ \( (?: AND | OR ) | != | = ]x ) {
        ## FIELD=value / FIELD!=value  ->  [$field eq 'VALUE'] / [$field ne 'VALUE']
        s<
            ( [^!=()\s]+ ) ( != | = ) ( [^()\s]+ )
        >{
            my $op = ( $2 eq '!=' ? 'ne' : 'eq' );
            "[\$\L$1\E $op '$3']";
        }xge;

        ## (AND [a] [b])  ->  [[a] && [b]]
        s<
            \( \s* ( AND ) \s+
            ( \[ [^()]+ \] ) \s+
            ( \[ [^()]+ \] ) \s*
            \)
        > {
            "[$2 && $3]"
        }xge;

        ## (OR [a] [b])  ->  [[a] || [b]]
        s<
            \( \s* ( OR ) \s+
            ( \[ [^()]+ \] ) \s+
            ( \[ [^()]+ \] ) \s*
            \)
        > {
            "[$2 || $3]"
        }xge;
    }
    ## Square brackets stood in for parens during the rewrite; swap them back
    tr/[]/()/;
    return $_;
}

## Read data and UPPER case
my $data = do{ local $/; uc <DATA> };

## Some variables
my( $author, $profit, $publisher, $book );

## And a regex to populate them from each record
my $re = qr[
    PAGE \s+ \d+ \s+
    AUTHOR:    \s+ ( [^\n]+ ) (?{ $author    = $^N }) \s+
    PROFIT:    \s+ ( [^\n]+ ) (?{ $profit    = $^N }) \s+
    PUBLISHER: \s+ ( [^\n]+ ) (?{ $publisher = $^N }) \s+
    BOOK:      \s+ ( [^\n]+ ) (?{ $book      = $^N }) \s+
]x;

## Convert the query. NOTE: (AND ) syntax is required
## where example used implicit AND
my $query = buildQuery( <<EOQ );
(AND
    (OR
        (AND AUTHOR=John PROFIT=90% )
        (AND AUTHOR=Matt PROFIT=80% )
    )
    PUBLISHER=OReilly
)
EOQ

print "\nQuery: $query\n";

## Test the condition and print the record if it matches
## For each record
eval "$query" and print $1 while $data =~ m[ ( $re ) ]xg;

## Same again for another query
## Note != also accepted.
my $query2 = buildQuery( <<EOQ );
(OR
    (AND AUTHOR=John PROFIT!=90% )
    (AND AUTHOR=Matt PUBLISHER!=OReilly )
)
EOQ

print "\nQuery: $query2\n";

eval "$query2" and print $1 while $data =~ m[ ( $re ) ]xg;

__DATA__
Page 1
AUTHOR: John
PROFIT: 20%
PUBLISHER: TMH
BOOK: OPERATING SYSTEMS

Page 2
AUTHOR: John
PROFIT: 90%
PUBLISHER: OREILLY
BOOK: ALGORITHMS

Page 3
AUTHOR: Matt
PROFIT: 80%
PUBLISHER: TMH
BOOK: COMPUTER NETWORKS

Page 4
AUTHOR: Matt
PROFIT: 80%
PUBLISHER: OREILLY
BOOK: COMMUNICATION SYSTEMS
Outputs:
[ 9:25:03.05]C:\test>670477

Query: (((($author eq 'JOHN') && ($profit eq '90%')) || (($author eq 'MATT') && ($profit eq '80%'))) && ($publisher eq 'OREILLY'))

PAGE 2
AUTHOR: JOHN
PROFIT: 90%
PUBLISHER: OREILLY
BOOK: ALGORITHMS

PAGE 4
AUTHOR: MATT
PROFIT: 80%
PUBLISHER: OREILLY
BOOK: COMMUNICATION SYSTEMS

Query: ((($author eq 'JOHN') && ($profit ne '90%')) || (($author eq 'MATT') && ($publisher ne 'OREILLY')))

PAGE 1
AUTHOR: JOHN
PROFIT: 20%
PUBLISHER: TMH
BOOK: OPERATING SYSTEMS

PAGE 3
AUTHOR: MATT
PROFIT: 80%
PUBLISHER: TMH
BOOK: COMPUTER NETWORKS
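Since the approach above evals a string built from the query text, a value containing a quote character would break (or subvert) the generated code. If that matters, here is a rough eval-free sketch of the same idea. It is not the code from this post: tokenize and compileQuery are hypothetical names, and it assumes each record has already been parsed into a hash ref with upper-cased keys and values. It also happens to accept more than two operands per AND/OR.

```perl
#!/usr/bin/perl
use strict;
use warnings;

## Split a prefix query into tokens: '(' , ')' , or a run of other non-space chars
sub tokenize {
    my( $query ) = @_;
    my @tokens = uc( $query ) =~ m{ \( | \) | [^()\s]+ }xg;
    return @tokens;
}

## Recursively turn a token list (array ref, consumed in place) into a
## closure that takes one record (hash ref) and returns true or false
sub compileQuery {
    my( $tokens ) = @_;
    my $tok = shift @$tokens;
    die "Unexpected end of query\n" unless defined $tok;

    if( $tok eq '(' ) {
        my $op = shift @$tokens;
        die "Expected AND or OR\n"
            unless defined $op and ( $op eq 'AND' or $op eq 'OR' );

        my @terms;
        push @terms, compileQuery( $tokens )
            while @$tokens and $tokens->[0] ne ')';
        die "Missing ')'\n" unless @$tokens;
        shift @$tokens;    ## discard the ')'

        if( $op eq 'AND' ) {
            return sub {
                my $rec = shift;
                for my $t ( @terms ) { return 0 unless $t->( $rec ) }
                return 1;
            };
        }
        return sub {
            my $rec = shift;
            for my $t ( @terms ) { return 1 if $t->( $rec ) }
            return 0;
        };
    }

    ## Otherwise the token must be a FIELD=VALUE or FIELD!=VALUE term
    my( $field, $cmp, $value ) = $tok =~ m{ \A ([^!=]+) (!=|=) (.*) \z }x
        or die "Bad term '$tok'\n";

    return $cmp eq '='
        ? sub { ( $_[0]{ $field } // '' ) eq $value }
        : sub { ( $_[0]{ $field } // '' ) ne $value };
}

## Two of the records from the __DATA__ section above, already upper-cased
my @records = (
    { AUTHOR => 'JOHN', PROFIT => '90%', PUBLISHER => 'OREILLY', BOOK => 'ALGORITHMS' },
    { AUTHOR => 'MATT', PROFIT => '80%', PUBLISHER => 'TMH',     BOOK => 'COMPUTER NETWORKS' },
);

my @tokens = tokenize( '(AND (OR AUTHOR=John AUTHOR=Matt ) PUBLISHER=OReilly )' );
my $match  = compileQuery( \@tokens );

## Prints ALGORITHMS: only the first record has PUBLISHER eq 'OREILLY'
print "$_->{BOOK}\n" for grep { $match->( $_ ) } @records;
```

The trade-off is a few more lines in exchange for never handing query text to eval; the query syntax itself is unchanged.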
Re^2: Implementing a text filter on some dataset
by grizzley (Chaplain) on Feb 27, 2008 at 13:06 UTC