in reply to Search string giving incorrect results

So my question is, why am I getting these erroneous results?

That's hard to say without seeing some representative data. Show us a value for $in{for}, plus a @thread[0,4] pair that you expect to match (but doesn't) and a pair that does match but that you expect should not.

Re: Re: Search string giving incorrect results
by Bishma (Beadle) on Apr 17, 2002 at 19:06 UTC
    OK, here's (I hope) a little clarification. Here's a big snippet of code:

    my @dat_files = <$board_dir/*.dat>;
    $q = 0;
    foreach (@dat_files) {
        $number;
        $number = $_;
        $number =~ s/\/var\/www\/cgi-bin\/2930forum\/data\///g;
        $number =~ s/\.dat//g;
        open THREAD, "$_" or die "Can't open .dat file: $!";
        $x = 0;
        while (<THREAD>) {
            $thread_data[$x] = $_;
            $x++;
        }
        close THREAD;
        foreach (@thread_data) {
            @details = split /\|/, $_;
            if ($details[4] =~ m/\Q$in{for}\E/i) {
                $found[$q] = $number;
                $q++;
            }
        }
    }

    It's a little sloppy at this point, but I'm just trying to get valid results for now; I'll clean it up later.

    $in{for}: is defined by form input (I've been using simple searches like "cheese").
    $thread_data[4]: holds the messages posted in each thread. It could contain just about anything, but is never empty.
    $thread_data[0]: I just realized isn't used. Instead it's $number, which is just a numeric string denoting the thread number.

    I do the search and get results. Some of the threads contain the search word $in{for} and others do not. For example, I put in $in{for} = cheese and get around 30 results, including messages like:
    Hard work pays off after time, but lazyness always pays off now.

    One thing I just noticed is that many of the results are in numerical order. For example, I get results like 121, 122, 124, 125, 126, 127, 128, 129, 21, 282, 321, 343, 344, 345, etc.
      This code has a couple of problems. One is that the handling of filenames, particularly the method of extracting a number from them, is highly suspect. Once you think you've extracted $number, try printing both the full filename and $number.
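      As one way to run that check, here is a minimal sketch that pulls the number out of each path with the core File::Basename module instead of a hard-coded substitution, and prints both values side by side. The two paths and the %number_for hash are hypothetical stand-ins for illustration; in the real script the list would come from the glob.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical paths standing in for the real glob results.
my @dat_files = (
    '/var/www/cgi-bin/2930forum/data/121.dat',
    '/var/www/cgi-bin/2930forum/data/21.dat',
);

# Map each full path to the thread number taken from its file name.
my %number_for;
foreach my $file (@dat_files) {
    # fileparse returns (name, directory, suffix); the pattern strips ".dat".
    my ($number) = fileparse($file, qr/\.dat/);
    $number_for{$file} = $number;
    print "$file => $number\n";
}
```

      If the printed number is ever anything other than the bare file name without its extension, the extraction step is where to look first.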

      It looks like you're trying to accumulate a list of thread numbers that contain matches. Since you say that $thread_data[0] is the same as $number, this might be easier like so:

      my @dat_files = <$board_dir/*.dat>;
      my %found = ();
      foreach my $file ( @dat_files ) {
          open(DAT, $file) or die "$file: $!";
          while ( <DAT> ) {
              # Escape the pipe: a bare "|" is regex alternation, not a literal.
              my @thread_data = split /\|/;
              if ( $thread_data[4] =~ m/\Q$in{for}\E/i ) {
                  $found{$thread_data[0]}++;
              }
          }
          close(DAT);
      }

      The keys of %found are now the thread numbers that contain a match, and the corresponding values are the number of matches.
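      To report the results, something like the following sketch could work. The contents of %found here are made-up sample values, and @threads is a name introduced only for this example. Hash keys come back in no particular order, and a plain sort compares them as strings (which is how 21 ends up after 129), so the sketch sorts numerically.

```perl
use strict;
use warnings;

# Hypothetical contents of %found: thread number => number of matching lines.
my %found = (121 => 2, 21 => 1, 343 => 4);

# Sort the keys numerically so that 21 comes before 121.
my @threads = sort { $a <=> $b } keys %found;

foreach my $thread (@threads) {
    print "thread $thread: $found{$thread} match(es)\n";
}
```

      Note also that declaring @thread_data with my inside the loop gives each line a fresh array, so nothing can carry over from a previously read, longer file, which a global array reset only by index would allow.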