With all due deference to the learned ones, and keeping in mind that you don't want a database: use a "database."
Actually, look into using hashes tie()'d to several (Berkeley DB or similar) files, e.g.:
    use DB_File;
    tie %artist, 'DB_File', "$vardir/artist.db";
    tie %album,  'DB_File', "$vardir/album.db";
    tie %track,  'DB_File', "$vardir/track.db";
This means your indexer can do something like:
    # tie to new files to keep from accidentally
    # re-using old values and to not update the db
    # while it may be being read by the search client
    use DB_File;
    my $vardir = "/var/music"; # whatever
    my (%artist, %album, %track, %by_id, %keyword);
    tie %artist,  'DB_File', "$vardir/.#artist.db#";
    tie %album,   'DB_File', "$vardir/.#album.db#";
    tie %track,   'DB_File', "$vardir/.#track.db#";
    tie %by_id,   'DB_File', "$vardir/.#by_id.db#";
    tie %keyword, 'DB_File', "$vardir/.#keyword.db#";
    my $id = 0;
    open my $index, '<', "$vardir/my_ascii_index.csv"
        or die "can't index if I can't read the index: $!";
    while (my $line = <$index>)
    {
        chomp $line;
        # the parens matter: without them only $this_artist is declared
        my ($this_artist, $this_album, @album_tracks) = split /,/, $line;
        $artist{$this_artist} .= $id . ',';
        $album{$this_album} .= $id . ',';
        for my $this_track (@album_tracks)
        {
            $track{$this_track} .= $id . ',';
        }
        $by_id{$id} = join "\x00", $this_artist, $this_album,
                                   @album_tracks;
        # split ' ' skips runs of whitespace, so no empty keywords
        for my $word (split ' ', join (" ", $this_artist,
                                       $this_album, @album_tracks) )
        {
            $keyword{$word} .= $id . ',';
        }
        $id++;
    }
    close $index;
    untie %artist;
    untie %album;
    untie %track;
    untie %by_id;
    untie %keyword;
    # [the data file assumed above would read like:
    #   Pearl Jam,ten,Jeremy,Black,...
    # and could be created in Gnumeric or Excel as a CSV file]
That's really nasty, not to mention probably
very inefficient, but could be easy to adapt to your
particular inputs...
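The temp-file names above only help if the new files get moved into place once indexing is done, and the indexer as written leaves that step out. A minimal sketch of it, assuming the live files are called artist.db, album.db and so on, and that everything sits in one directory (swap_in_new_index is a made-up helper name, not part of DB_File):

```perl
use strict;
use warnings;

# Hypothetical helper: move each freshly built ".#name.db#" file over
# the live "name.db". Both paths are in the same directory, so on a
# POSIX filesystem each rename() replaces the old file in one step.
sub swap_in_new_index {
    my ($vardir, @names) = @_;
    for my $name (@names) {
        my $new  = "$vardir/.#$name.db#";
        my $live = "$vardir/$name.db";
        rename $new, $live
            or die "can't rename $new to $live: $!";
    }
    return scalar @names;   # number of files moved into place
}

# e.g. right after the untie calls in the indexer:
# swap_in_new_index("/var/music", qw(artist album track by_id keyword));
```

Note that the five files are swapped one after another, so a search that runs in the middle of the swap could still see a mix of old and new files.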
Then, to do a search query, do something like:
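    # use CGI and get your query words in whatever form
    # load them into e.g. $artist_query, $album_query, &c.
    # (%artist, %album, %track, %keyword and %by_id are tied
    # to the finished index files here)
    my @result_ids = ();
    # each value is a string of ids like "0,3,7,", so split it up
    if ($artist{$artist_query}) {
        push @result_ids, split /,/, $artist{$artist_query};
    }
    if ($track{$track_query}) {
        push @result_ids, split /,/, $track{$track_query};
    }
    if ($album{$album_query}) {
        push @result_ids, split /,/, $album{$album_query};
    }
    for my $word (split ' ', $keyword_query)
    {
        if ($keyword{$word}) {
            push @result_ids, split /,/, $keyword{$word};
        }
    }
    # the same id can match more than once; list each album only once
    my %seen;
    @result_ids = grep { !$seen{$_}++ } @result_ids;
    unless (@result_ids) {
        print "<h1> No results </h1>"; return;
    }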
    print "<h1> Found " . (scalar @result_ids) . ": </h1>
    <ol type=1>
    ";
    for my $id (@result_ids)
    {
        # parens needed here too, or only $artist gets declared
        my ($artist, $album, @tracks) = split /\x00/, $by_id{$id};
        print "<li> <big>
    <a
    href=\"http://somewhere/interesting/lookup_id.pl?$id\">$album</a>
    </big> by $artist <br>
    <small> <ol type=1>
    ";
        for my $track (@tracks)
        {
            print " <li> $track </li>\n";
        }
        print "</ol> </small> </li>\n\n";
    }
    print "\n</ol>\n";
    return;
Again, really nasty, but quick and simple. It does not
allow any kind of search except by exact-match artist,
track, or album, or by keyword (each keyword must match a
whole word exactly, but that word can occur in any field).
As eduardo pointed out, for anything more complex,
go ahead and use a 'real' search system. The only advantage
to this structure is that it allows for an 'advanced
search' or similar:
Enter keyword(s): ________
<menu> Advanced search:
- Artist (exact name): ______
- Album Title (exact name): ______
- Track Title (exact name): ______
</menu>
Submit
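The search code earlier simply pools the ids from every field that matches. If the advanced form should narrow the results instead (say, artist AND keyword), one option is to intersect the id lists. A rough sketch, where ids_in_all is a made-up helper and plain hashes stand in for the tied ones:

```perl
use strict;
use warnings;

# Hypothetical helper: return the ids that appear in every argument.
# Each argument is one comma-separated id string as the indexer
# stores it, e.g. "0,3,7,".
sub ids_in_all {
    my @id_strings = @_;
    my %count;
    for my $ids (@id_strings) {
        my %seen;    # count each id at most once per list
        $count{$_}++ for grep { !$seen{$_}++ } split /,/, $ids;
    }
    # keep the ids that were seen in all of the lists
    return sort { $a <=> $b }
           grep { $count{$_} == @id_strings } keys %count;
}

# Plain hashes standing in for the tied %artist and %keyword.
my %artist  = ('Pearl Jam' => '0,1,');
my %keyword = (Jeremy => '0,', Black => '0,2,');

my @both = ids_in_all($artist{'Pearl Jam'}, $keyword{Black});
print "@both\n";    # prints "0"
```

Only pass in the fields the user actually filled in; an empty field would otherwise contribute an empty list and wipe out every result.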