in reply to Re: Search string giving incorrect results
in thread Search string giving incorrect results

OK, here's (I hope) a little clarification. Here's a big snippet of code:
my @dat_files = <$board_dir/*.dat>;
$q = 0;
foreach (@dat_files) {
    $number;
    $number = $_;
    $number =~ s/\/var\/www\/cgi-bin\/2930forum\/data\///g;
    $number =~ s/\.dat//g;
    open THREAD, "$_" or die "Can't open .dat file: $!";
    $x = 0;
    while (<THREAD>) {
        $thread_data[$x] = $_;
        $x++;
    }
    close THREAD;
    foreach (@thread_data) {
        @details = split /\|/, $_;
        if ($details[4] =~ m/\Q$in{for}\E/i) {
            $found[$q] = $number;
            $q++;
        }
    }
}
It's a little sloppy right now, but I'm just trying to get valid results; I'll clean it up later.

$in{for}: is defined by form input (I've been using simple searches like "cheese").
$thread_data[4]: is the messages posted in every thread. It could potentially contain just about anything, but it is never empty.
$thread_data[0]: I just realized this isn't used. Instead it's $number, which is just a numeric string denoting the thread number.

I do the search and get results. Some of the threads contain the search word $in{for} and others do not. For example, I put in $in{for} = cheese and get around 30 results containing messages like:
Hard work pays off after time, but lazyness always pays off now.

One thing I just noticed is that many of the results are in numerical order. For example, I get results like 121, 122, 124, 125, 126, 127, 128, 129, 21, 282, 321, 343, 344, 345, etc.

Re: Re: Re: Search string giving incorrect results
by dws (Chancellor) on Apr 17, 2002 at 21:00 UTC
    This code has a couple of problems. One is that the handling of filenames, particularly the method of extracting a number from them, is highly suspect. Once you think you've extracted $number, try printing both the full filename and $number.
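One way to make that check less fragile is to let File::Basename strip the directory and the suffix, instead of hard-coding the data path in a substitution. This is only a sketch: the helper name thread_number and the sample paths are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: derive the thread number from a .dat path.
# basename() drops the directory part and the given suffix.
sub thread_number {
    my ($path) = @_;
    return basename($path, '.dat');
}

# Illustrative paths only; print the full filename next to the number
# so a bad extraction is easy to spot.
for my $file ('/var/www/cgi-bin/2930forum/data/121.dat', 'data/21.dat') {
    printf "%s -> %s\n", $file, thread_number($file);
}
```

Because the directory is never spelled out, this keeps working if $board_dir changes.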

    It looks like you're trying to accumulate a list of thread numbers that contain matches. Since you say that $thread_data[0] is the same as $number, this might be easier like so:

    my @dat_files = <$board_dir/*.dat>;
    my %found = ();
    foreach my $file ( @dat_files ) {
        open(DAT, $file) or die "$file: $!";
        while ( <DAT> ) {
            # Split on a literal pipe; an unescaped "|" is an empty
            # alternation and would split the line into characters.
            my @thread_data = split /\|/;
            if ( $thread_data[4] =~ m/\Q$in{for}\E/i ) {
                $found{$thread_data[0]}++;
            }
        }
        close(DAT);
    }
    The keys of %found are now the thread numbers that contain a match, and the corresponding values are the number of matches.
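Hash keys come back in no particular order, and a plain sort compares them as strings, which would give sequences like 121, 129, 21, 282. A minimal sketch of listing the matches numerically follows. The sample contents of %found and the helper name sorted_threads are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: thread number => number of matching lines.
my %found = (121 => 2, 21 => 1, 343 => 4, 282 => 1);

# Return the thread numbers sorted numerically rather than as strings.
sub sorted_threads {
    my ($found) = @_;
    my @threads = sort { $a <=> $b } keys %$found;
    return @threads;
}

for my $thread (sorted_threads(\%found)) {
    print "thread $thread: $found{$thread} match(es)\n";
}
```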