Re: Searching module

With all due deference to the learned ones, and keeping in mind that you don't want a database: use a "database."

Actually, look into tie()ing hashes to several Berkeley DB (or similar) files, e.g.:

 use DB_File;

 tie %artist, 'DB_File', "$vardir/artist.db";
 tie %album, 'DB_File', "$vardir/album.db";
 tie %track, 'DB_File', "$vardir/track.db";

This means your indexer can do something like:

 # tie to new files to keep from accidentally
 # re-using old values and to not update the db
 # while it may be being read by the search client

 use DB_File;

 my $vardir = "/var/music"; # whatever

 tie %artist, 'DB_File', "$vardir/.#artist.db#";
 tie %album, 'DB_File', "$vardir/.#album.db#";
 tie %track, 'DB_File', "$vardir/.#track.db#";
 tie %by_id, 'DB_File', "$vardir/.#by_id.db#";
 tie %keyword, 'DB_File', "$vardir/.#keyword.db#";

 my $id = 0;

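 open INDEX, '<', "$vardir/my_ascii_index.csv" or
   die "can't index if I can't read the index: $!";

 # read one line at a time; chomp so the last track
 # name doesn't carry a trailing newline into the keys
 while (my $line = <INDEX>)
 {
   chomp $line;
   my ($this_artist, $this_album, @album_tracks) = split /,/, $line;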
   $artist{$this_artist} .= $id . ',';
   $album{$this_album} .= $id . ',';
   for my $this_track (@album_tracks)
   {
     $track{$this_track} .= $id . ',';
   }
   $by_id{$id} = join "\x00", $this_artist, $this_album,
     @album_tracks;
   for my $word (split ' ', join (" ", $this_artist,
     $this_album, @album_tracks) )
   {
     $keyword{$word} .= $id . ',';
   }
   $id++;
 }

 close INDEX;
 untie %artist;
 untie %album;
 untie %track;
 untie %by_id;
 untie %keyword;

#[the data file assumed above would read like:
# Pearl Jam,ten,Jeremy,Black,... 
# and could be created in Gnumeric or Excel as a CSV file]

That's really nasty, not to mention probably very inefficient, but could be easy to adapt to your particular inputs...
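
The indexer above writes to the .#name.db# scratch files but never moves them into place. One way to finish the job, sketched here on the assumption that the scratch files and the real files live in the same directory (so that rename() swaps each file in one step on a POSIX filesystem), is to rename each file once everything is untied. The publish_index name and the empty stand-in files in the demo are mine, not part of the original:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: move each ".#name.db#" scratch file over the
# real "name.db" file. A rename within one directory replaces the
# target in a single step, so a reader opens either the old index
# or the new one, never a half-written file.
sub publish_index {
    my ($vardir, @names) = @_;
    for my $name (@names) {
        my $scratch = "$vardir/.#$name.db#";
        my $live    = "$vardir/$name.db";
        rename $scratch, $live
            or die "can't rename $scratch to $live: $!";
    }
    return scalar @names;
}

# Demo with empty stand-in files in a throwaway directory.
my $vardir = tempdir(CLEANUP => 1);
my @names  = qw(artist album track by_id keyword);
for my $name (@names) {
    open my $fh, '>', "$vardir/.#$name.db#" or die "can't create: $!";
    close $fh;
}
my $published = publish_index($vardir, @names);
print "published $published index files\n";
```

In the real indexer you would call this right after the last untie. A search client that already has the old files open keeps reading the old data until it re-ties.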

Then, to do a search query, do something like:


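# use CGI and get your query words in whatever form
# load them into e.g. $artist_query, $album_query, &c.
# (tie the hashes to the finished .db files first, and wrap
# all this in a sub, since it bails out with return)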

my @result_ids = ();

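# each index value is a string of ids like "0,3,", so split it
if ($artist{$artist_query}) {
 push @result_ids, split /,/, $artist{$artist_query};
}
if ($track{$track_query}) {
 push @result_ids, split /,/, $track{$track_query};
}
if ($album{$album_query}) {
 push @result_ids, split /,/, $album{$album_query};
}
for my $word (split ' ', $keyword_query)
{
 if ($keyword{$word}) {
  push @result_ids, split /,/, $keyword{$word};
 }
}
# drop ids that matched more than one way
my %seen;
@result_ids = grep { !$seen{$_}++ } @result_ids;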
unless (@result_ids) { 
 print "<h1> No results </h1>"; return;
}
print "<h1> Found " . (scalar @result_ids) . ": </h1>
<ol type=1>
";
for my $id (@result_ids)
{
 my ($artist, $album, @tracks) = split /\x00/, $by_id{$id};
 print "<li> <big>
 <a 
href=\"http://somewhere/interesting/lookup_id.pl?$id\">$album</a>
 </big> by $artist <br>
 <small> <ol type=1>
";
 for my $track (@tracks)
 {
   print " <li> $track </li>\n";
 }
 print "</ol> </small> </li>\n\n";
}
print "\n</ol>\n";
return;

Again, really nasty, but quick and simple. It does not allow any kind of search except by exact-match artist, track, or album, or by a keyword (which must match a whole word exactly, though that word can appear anywhere in any field). All of these matches are case-sensitive as written.
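
To make that limitation concrete, here is a small sketch with plain in-memory hashes standing in for the tied ones, filled the same way the indexer fills them. The hits helper and the single sample row are just for illustration:

```perl
use strict;
use warnings;

# In-memory stand-ins for the tied %artist and %keyword hashes.
my (%artist, %keyword);
my @rows = (
    [ 'Pearl Jam', 'ten', 'Jeremy', 'Black' ],
);
my $id = 0;
for my $row (@rows) {
    # the first field is the artist; every word of every field
    # becomes a keyword
    $artist{ $row->[0] } .= "$id,";
    $keyword{$_} .= "$id," for split ' ', join ' ', @$row;
    $id++;
}

# A lookup is a plain hash fetch, so only an identical key hits.
sub hits {
    my ($index, $query) = @_;
    return exists $index->{$query} ? 1 : 0;
}

printf "%-20s %d\n", "artist 'Pearl Jam'", hits(\%artist,  'Pearl Jam'); # 1
printf "%-20s %d\n", "artist 'pearl jam'", hits(\%artist,  'pearl jam'); # 0
printf "%-20s %d\n", "artist 'Pearl'",     hits(\%artist,  'Pearl');     # 0
printf "%-20s %d\n", "keyword 'Pearl'",    hits(\%keyword, 'Pearl');     # 1
printf "%-20s %d\n", "keyword 'Jer'",      hits(\%keyword, 'Jer');       # 0
```

So 'Pearl' finds the album as a keyword but not as an artist, and a partial word like 'Jer' finds nothing at all.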

As eduardo pointed out, for anything more complex, go ahead and use a 'real' search system. The only advantage to this structure is that it allows for an 'advanced search' form or similar:

Enter keyword(s): ________

<menu> Advanced search:

Artist (exact name): ______
Album Title (exact name): ______
Track Title (exact name): ______

</menu> Submit
