in reply to Searching module
All due deference to the learned ones, and keeping in mind that you don't want a database: use a "database."
Actually, look into using tie()'d hashes, each backed by its own (Berkeley DB or similar) file, e.g.:
    use DB_File;
    tie %artist, 'DB_File', "$vardir/artist.db";
    tie %album,  'DB_File', "$vardir/album.db";
    tie %track,  'DB_File', "$vardir/track.db";
This means your indexer can do something like:
    use strict;
    use warnings;
    use DB_File;

    # tie to new files to keep from accidentally re-using old values,
    # and to avoid updating the db while the search client may be reading it
    my $vardir = "/var/music";   # whatever
    my (%artist, %album, %track, %by_id, %keyword);
    tie %artist,  'DB_File', "$vardir/.#artist.db#"  or die "can't tie artist db: $!";
    tie %album,   'DB_File', "$vardir/.#album.db#"   or die "can't tie album db: $!";
    tie %track,   'DB_File', "$vardir/.#track.db#"   or die "can't tie track db: $!";
    tie %by_id,   'DB_File', "$vardir/.#by_id.db#"   or die "can't tie by_id db: $!";
    tie %keyword, 'DB_File', "$vardir/.#keyword.db#" or die "can't tie keyword db: $!";

    my $id = 0;
    open my $index, '<', "$vardir/my_ascii_index.csv"
        or die "can't index if I can't read the index: $!";
    while (my $line = <$index>) {
        chomp $line;
        # one album per line: artist, album title, then the tracks
        my ($this_artist, $this_album, @album_tracks) = split /,/, $line;
        # each value is a comma-separated list of the ids that match the key
        $artist{$this_artist} .= $id . ',';
        $album{$this_album}   .= $id . ',';
        for my $this_track (@album_tracks) {
            $track{$this_track} .= $id . ',';
        }
        $by_id{$id} = join "\x00", $this_artist, $this_album, @album_tracks;
        for my $word (split /\s+/, join(" ", $this_artist, $this_album, @album_tracks)) {
            $keyword{$word} .= $id . ',';
        }
        $id++;
    }
    close $index;
    untie %artist; untie %album; untie %track; untie %by_id; untie %keyword;
    # once everything is written, rename each .#NAME.db# over the live NAME.db

    # [the data file assumed above would read like:
    #    Pearl Jam,ten,Jeremy,Black,...
    #  and could be created in Gnumeric or Excel as a CSV file]
That's really nasty, not to mention probably very inefficient, but could be easy to adapt to your particular inputs...
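The point of writing to the `.#NAME.db#` names is that the finished files can be swapped in afterwards. A minimal sketch of that last step, assuming the temp files and the live files sit in the same directory (so rename() stays on one filesystem); the temp directory and the empty files here are just stand-ins so the snippet runs anywhere:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $vardir = tempdir(CLEANUP => 1);   # stand-in for /var/music

for my $name (qw(artist album track by_id keyword)) {
    my $new  = "$vardir/.#$name.db#";
    my $live = "$vardir/$name.db";

    # pretend the indexer has just finished writing $new
    open my $fh, '>', $new or die "can't create $new: $!";
    close $fh;

    # swap the fresh file in over the live one
    rename $new, $live or die "can't rename $new to $live: $!";
}
```

A search client that already has the old file open keeps reading the old data until it re-opens, so the five files can briefly be out of step with each other during the swap.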
Then, to do a search query, do something like:
    # use CGI and get your query words in whatever form;
    # load them into e.g. $artist_query, $album_query, &c.
    # (the hashes are tie()'d to the .db files as above, and this
    #  is assumed to run inside a sub, hence the returns)
    my @result_ids = ();
    # each stored value is a comma-separated list of ids, so split it
    if ($artist{$artist_query}) { push @result_ids, split /,/, $artist{$artist_query}; }
    if ($track{$track_query})   { push @result_ids, split /,/, $track{$track_query}; }
    if ($album{$album_query})   { push @result_ids, split /,/, $album{$album_query}; }
    for my $word (split /\s+/, $keyword_query) {
        if ($keyword{$word}) { push @result_ids, split /,/, $keyword{$word}; }
    }
    # the same id can turn up from several fields; list it once
    my %seen;
    @result_ids = grep { !$seen{$_}++ } @result_ids;

    unless (@result_ids) {
        print "<h1> No results </h1>";
        return;
    }
    print "<h1> Found " . (scalar @result_ids) . ": </h1> <ol type=1> ";
    for my $id (@result_ids) {
        my ($artist, $album, @tracks) = split /\x00/, $by_id{$id};
        print "<li> <big> <a href=\"http://somewhere/interesting/lookup_id.pl?$id\">$album</a> </big> by $artist <br> <small> <ol type=1> ";
        for my $track (@tracks) {
            print " <li> $track </li>\n";
        }
        print "</ol></li>\n\n";
    }
    print "\n</ol>\n";
    return;
Again, really nasty, but quick and simple. It does not allow any kind of search except by exact-match artist, track, or album, or by a keyword (which must match a whole word exactly, but that word can occur anywhere in any field).
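To make the exact-match limitation concrete, here's a small sketch with plain in-memory hashes standing in for the tie()'d ones and two made-up albums. It's only meant to show which lookups hit and which miss, not how you'd actually wire up the search:

```perl
use strict;
use warnings;

my (%artist, %keyword);   # plain hashes standing in for the tie()'d ones
my @rows = (              # made-up sample data, same CSV layout as above
    "Pearl Jam,Ten,Jeremy,Black",
    "Nirvana,Nevermind,Lithium,Breed",
);
for my $id (0 .. $#rows) {
    my ($this_artist, @rest) = split /,/, $rows[$id];
    $artist{$this_artist} .= "$id,";
    $keyword{$_} .= "$id," for split ' ', join ' ', $this_artist, @rest;
}

# try a full name, one whole word of it, and a fragment of a word
for my $q ("Pearl Jam", "Pearl", "Pear") {
    printf "%-9s  artist: %-3s  keyword: %s\n", $q,
        $artist{$q} // 'no', $keyword{$q} // 'no';
}
```

"Pearl Jam" hits only the artist hash, "Pearl" hits only the keyword hash, and "Pear" hits nothing, because every lookup is a plain hash-key comparison.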
As eduardo pointed out, for anything more complex, go ahead and use a 'real' search system. The only advantage of this structure is that it allows for an 'advanced search' or similar, with separate artist, album, track, and keyword fields.
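One way an 'advanced search' could use the separate hashes is to require a match in every field the user filled in, i.e. intersect the ID lists rather than pool them. A rough sketch under that assumption; `ids_in_all` is a hypothetical helper, and the hash contents are made up:

```perl
use strict;
use warnings;

# Hypothetical helper: return the ids that appear in every one of the
# given comma-separated id strings (the format the indexer stores).
sub ids_in_all {
    my @id_strings = @_;
    my %seen;
    for my $ids (@id_strings) {
        # de-duplicate within one string so an id counts once per field
        my %uniq = map { $_ => 1 } grep { length } split /,/, $ids;
        $seen{$_}++ for keys %uniq;
    }
    return sort { $a <=> $b } grep { $seen{$_} == @id_strings } keys %seen;
}

# Stand-ins for the tie()'d hashes, with made-up contents.
my %artist = ("Pearl Jam" => "0,2,");
my %track  = ("Black"     => "0,5,");

# albums by Pearl Jam that also contain a track called Black
my @hits = ids_in_all($artist{"Pearl Jam"}, $track{"Black"});
print "@hits\n";
```

Fields the user left blank would simply be left out of the argument list, so a one-field query behaves like the simple lookup.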