dimmesdale has asked for the wisdom of the Perl Monks concerning the following question:

Hello. For my website, I have a section where I offer quotes. To make it easier to update and expand, I am changing it from HTML to Perl. I have the quotes in several files, and they all follow the same format. My problem, however, is extracting the information I need and storing it properly. First, here is a brief snippet of the format:
    /-aAesop
    /-sTyranny
    /-qAny excuse will serve a tyrant.
    /-sWise Sayings
    /-qIt is easy to be brave from a safe distance.
    /-sWise Sayings
    /-qIt is not only fine feathers that make fine birds.
    /-sWise Sayings
    /-qWe often despise what is most useful to us.
    /-sWise Sayings
    /-qNever soar aloft on an enemy's pinions.
    /-sWise Sayings
    /-qExample is the best precept.
    /-sWise Sayings
    /-qThinking to get at once all the gold the Goose could give, he killed it and opened it only to find,--nothing.

    /-aDante Alighieri
    /-sLiberty,Freedom
    /-qMankind is at its best when it is most free. This will be clear if we grasp the principle of liberty. We must recall that the basic principle of liberty is freedom of choice, which saying many have on their lips but few in their minds.

/-a represents the author.
/-s represents the subject.
/-q represents the quote.
I have some code that cycles through the files, reads each chunk of the current file, and retrieves the author. My trouble is with getting the subjects and quotes, and then with storing them. I wish to store the data in a way that lets me retrieve the quotes with ease. For example, if someone wishes to look up all the quotes in the category "x", then a look-up such as $subject{x} should contain all quotes on that subject. What I have so far is below; everything beyond it is where I am stuck:

    my @files = qw(quotePAGE1 quotePAGE2);
    for (@files) {
        open IN, $_ or die "failed opening file $_, at";
        $/ = "\n\n";    # input record separator: read blank-line-separated chunks
        my $num = 0;
        my $chunk;
        while ($chunk = <IN>) {
            ++$num;
            $chunk =~ s|^/-a(\S+)[^\n]*\n|| or die "no author? file $_, chunk $num; at";
            while ($chunk =~ s|/-s(\S+)[^\n]*\n/-q
    [end of code]

Replies are listed 'Best First'.
Re: Quotes page help
by chromatic (Archbishop) on Aug 13, 2001 at 22:34 UTC
    If I understand correctly, you have several records. Each record has two searchable criteria, the speaker name and a category. Furthermore, each category can contain multiple topics. You want to store this in a Perl data structure that will allow you to search on multiple criteria?

    You could store these as anonymous hashes in an array, and loop over them all to search. The category itself would have to contain an anonymous hash.
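
    A minimal sketch of that first option, assuming three records taken from the sample in the question (the Dante quote is shortened to its first sentence). The helper names quotes_by_subject and quotes_by_author are made up for illustration; each search is a plain linear scan over the array.

```perl
use strict;
use warnings;

# Each record is an anonymous hash. The subjects are stored in a nested
# anonymous hash so that a membership test is a single lookup.
my @quotes = (
    {
        author   => 'Aesop',
        subjects => { 'Tyranny' => 1 },
        quote    => 'Any excuse will serve a tyrant.',
    },
    {
        author   => 'Aesop',
        subjects => { 'Wise Sayings' => 1 },
        quote    => 'Example is the best precept.',
    },
    {
        author   => 'Dante Alighieri',
        subjects => { 'Liberty' => 1, 'Freedom' => 1 },
        quote    => 'Mankind is at its best when it is most free.',
    },
);

# Hypothetical helper: loop over every record, keep those filed under $subject.
sub quotes_by_subject {
    my ($subject) = @_;
    return map { $_->{quote} } grep { $_->{subjects}{$subject} } @quotes;
}

# Hypothetical helper: the same scan, matching on the author instead.
sub quotes_by_author {
    my ($author) = @_;
    return map { $_->{quote} } grep { $_->{author} eq $author } @quotes;
}

print "$_\n" for quotes_by_subject('Freedom');
```

    Every search touches every record, which is fine for a few hundred quotes but is the cost that the indexed approach avoids.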

    You could store them all in a tied hash, keyed on some sort of unique identifier. You'd have to build up an index for each criterion, associating the unique identifier with all records in whatever type. MLDBM would be very helpful in this instance.
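
    The indexing idea might look roughly like the sketch below. It uses plain in-memory hashes as a stand-in for the tied hash, so the tie to MLDBM is deliberately left out; add_quote and lookup are hypothetical names. One caveat to keep in mind: with a real MLDBM tie, nested structures generally have to be fetched, changed, and stored back as a whole rather than modified in place as the push calls do here.

```perl
use strict;
use warnings;

# Stand-in for the tied hash: unique record id => record.
my %record;

# One index per search criterion: value => list of record ids.
my (%by_author, %by_subject);
my $next_id = 1;

# Hypothetical helper: store one quote and update both indexes.
sub add_quote {
    my ($author, $subjects, $quote) = @_;
    my $id = $next_id++;
    $record{$id} = {
        author   => $author,
        subjects => [@$subjects],
        quote    => $quote,
    };
    push @{ $by_author{$author} }, $id;
    push @{ $by_subject{$_} }, $id for @$subjects;
    return $id;
}

# Hypothetical helper: resolve the ids in an index entry to quote texts.
sub lookup {
    my ($index, $key) = @_;
    return map { $record{$_}{quote} } @{ $index->{$key} || [] };
}

add_quote('Aesop', ['Tyranny'], 'Any excuse will serve a tyrant.');
add_quote('Aesop', ['Wise Sayings'], 'Example is the best precept.');
add_quote('Dante Alighieri', ['Liberty', 'Freedom'],
          'Mankind is at its best when it is most free.');

print "$_\n" for lookup(\%by_subject, 'Liberty');
```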

    In other words, it sounds like you're reinventing a relational database. Hie thee to MySQL or Postgres.

Re: Quotes page help
by Hofmator (Curate) on Aug 13, 2001 at 23:08 UTC

    I'd also go along with chromatic's suggestion of using a proper database.

    Your script could then look something like this:

        my $author;
        my $quote;
        my @subjects;
        while (<>) {
            next if /^\s*$/;    # skip empty lines
            chomp;
            $author   = substr($_,3), next if substr($_,0,3) eq '/-a';
            @subjects = split(/,/, substr($_,3)), next if substr($_,0,3) eq '/-s';
            # insert only when the line really is a quote, so that a stale
            # $quote is never inserted a second time
            if (substr($_,0,3) eq '/-q') {
                $quote = substr($_,3);
                insertDB($author,@subjects,$quote);
            }
        }
    With a proper insertion subroutine insertDB depending on your database design.
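
    As a rough illustration only, here is one way insertDB could be filled in while no database is in place yet. It is an assumption, not part of the script above: plain hashes stand in for the tables, and %subject is chosen so that the $subject{x} look-up from the question works directly.

```perl
use strict;
use warnings;

# Assumption: plain hashes stand in for the database tables.
my (%subject, %author);

# Hypothetical insertDB matching the call insertDB($author, @subjects, $quote).
# The argument list arrives flattened, so the author is the first element,
# the quote is the last, and everything in between is a subject.
sub insertDB {
    my ($who, @rest) = @_;
    my $quote = pop @rest;
    push @{ $author{$who} }, $quote;
    push @{ $subject{$_} }, $quote for @rest;
    return;
}

insertDB('Aesop', 'Tyranny', 'Any excuse will serve a tyrant.');
insertDB('Aesop', 'Wise Sayings', 'Example is the best precept.');
insertDB('Dante Alighieri', 'Liberty', 'Freedom',
         'Mankind is at its best when it is most free.');

# All quotes filed under one subject:
print "$_\n" for @{ $subject{'Wise Sayings'} };
```

    Because the subjects and the quote share one flat argument list, the sub relies on their positions; passing a reference to @subjects instead would make the interface less fragile.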

    -- Hofmator