in reply to Re^2: Sort of basic search engine/pattern matching problems
in thread Sort of basic search engine/pattern matching problems
Unfortunately, due to the large amount of data I have, I cannot understand if it's working...
How many lines in "query04.txt"? How many lines in "multisearch_final_sorted_3.txt"? If you create a test version of each file, containing just a few lines that should produce some output, does the script work correctly on those test files? (Hint: Allowing file names to be provided as command line args can help with testing.)
One way to try speeding things up is to create a single regex from your query file, by joining the lines with "|":
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $PATHDATA = ".";  # (you didn't say how this was being set)

    # Take both file names from the command line, or fall back to the defaults.
    my ( $query_list_file, $file_to_search ) = ( @ARGV == 2 ) ? @ARGV :
        ( "$PATHDATA/query04.txt", "$PATHDATA/multisearch_final_sorted_3.txt" );

    # Read the query list and join it into a single alternation.
    open( HH, "<", $query_list_file ) or die "$query_list_file: $!\n";
    chomp( my @query_arr = <HH> );
    close HH;
    my $query_regex = join( '|', @query_arr );

    open( XX, "<", $file_to_search ) or die "$file_to_search: $!\n";
    open( DD, ">", "$PATHDATA/query_results.txt" ) or die "$PATHDATA/query_results.txt: $!\n";

    # A line matches when it starts with one of the queries followed by "]".
    my $count = 0;
    while ( <XX> ) {
        if ( /^($query_regex)\]/ ) {
            print DD "############\nQUERY: $1\nMATCH: $_\n";
            $count++;
        }
    }
    print DD "Entrate trovate: $count\n";
    close DD;

(In addition to allowing for other input files and using a single regex to check all matches, I also left out the "debug" stuff, rearranged the output format a little, and changed the "open" statements to use the 3-arg style.) UPDATED to add "or die ..." on each of the "open" statements -- that should be a habit.
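One caveat about a pattern built by joining raw lines: any query containing regex metacharacters (".", "+", "(", "[" and so on) is treated as a pattern rather than as literal text, and an empty line in the query file adds an empty alternative. If your queries are meant to be literal strings (an assumption on my part, since you didn't say), a minimal sketch of a safer way to build the regex might look like this. The sample queries and data lines are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, standing in for the real query and search files.
my @query_arr = ( "a.c", "x+y", "", "foo" );
my @lines     = ( "a.c] first", "abc] second", "x+y] third", "] fourth", "foo] fifth" );

# Drop empty queries, escape metacharacters, and compile the alternation once.
my $alternation = join( '|', map { quotemeta($_) } grep { length($_) } @query_arr );
my $query_regex = qr/^($alternation)\]/;

# Collect the query that matched at the start of each line, if any.
my @matched;
for my $line (@lines) {
    push @matched, $1 if $line =~ $query_regex;
}
print "QUERY: $_\n" for @matched;
```

With this sample data, "abc] second" is not reported, because the dot in "a.c" is escaped and no longer matches any character, and "] fourth" is not reported because the empty query was dropped.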
If you still have a problem when using some small test files, post a complete and runnable script (like the one shown here) with the test data.