Re: Searching data file

I have rewritten your code with a simpler algorithm.

use strict;
use Data::Dumper;

my %records; # hash to store book info based on author
my $data;
while (<DATA>)
{
    chomp;
    $data .= $_;
    if ($_ eq '</ref>') {
        # process what's in the buffer when we see the end tag
        my $rec = process_record($data);
        $records{$rec->{author}} = $rec;
        $data = '';
    }
}

print Dumper(\%records);

sub process_record
{
    my $rec = shift;
    my %col;

    ($col{author})  = $rec =~ m/<author>\s*([^<]*)(?=<)/g;
    ($col{year})    = $rec =~ m/<year>\s*([^<]*)(?=<)/g;
    ($col{source})  = $rec =~ m/<source>\s*([^<]*)(?=<)/g;
    ($col{id})      = $rec =~ m/<id>\s*([^<]*)(?=<)/g;
    ($col{title})   = $rec =~ m/<title>\s*([^<]*)(?=<)/g;
    my @keywords    = $rec =~ m/<key>\s*([^<]*)(?=<)/g;
    $col{keywords}  = \@keywords;

    return \%col;
}


__DATA__
<ref>
<provnc>
<aulist>
<author> Bin Laden
</aulist>
<year>1990
<source> Cambridge University Press, Cambridge UK, 1st edition 
<id>1
<keywords>
<key>terrorism
<key>whatever
</keywords>
</provnc>
<title> Terrorism
</ref>

<ref>
<provnc>
<aulist>
<author> Sydney
</aulist>
<year>1990
<source> Cambridge University Press, Cambridge UK, 1st edition 
<id>1
<keywords>
<key>nothing
<key>whatever
</keywords>
</provnc>
<title> Terrorism
</ref>

And the output is as expected:

$VAR1 = {
          'Bin Laden' => {
                           'title' => 'Terrorism',
                           'author' => 'Bin Laden',
                           'keywords' => [
                                           'terrorism',
                                           'whatever'
                                         ],
                           'id' => '1',
                           'year' => '1990',
                           'source' => 'Cambridge University Press, Cambridge UK, 1st edition '
                         },
          'Sydney' => {
                        'title' => 'Terrorism',
                        'author' => 'Sydney',
                        'keywords' => [
                                        'nothing',
                                        'whatever'
                                      ],
                        'id' => '1',
                        'year' => '1990',
                        'source' => 'Cambridge University Press, Cambridge UK, 1st edition '
                      }
        };

Re: Re: Searching data file by graff (Chancellor) on Nov 03, 2003 at 02:24 UTC
Actually, I think there's a slight problem with this design. The markup structure makes it clear that it is meant to handle refs with multiple authors, and for such a ref entry your "process_record" sub will return only the first author. That single author then becomes the basis for testing whether the record matches the given search, so if the name being searched for happens to be the second author in a record, that record won't be returned. You would need the hash element for "author" to be a reference to an array, and then search over the elements of that array. That is a lot more complicated than reading a whole `<ref>...</ref>` element at each iteration (by setting $/ as I suggested above) and looking for $search anywhere within the `<aulist>` element.
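For illustration, the record-at-a-time approach could look roughly like the sketch below. The earlier post is not shown here, so the details are assumptions: the sub name `find_refs` and the sample data are hypothetical, and every record is assumed to end with `</ref>` followed by a newline.

```perl
use strict;
use warnings;

# Hypothetical sample data: two refs, the first with two authors.
my $sample = <<'END';
<ref>
<aulist>
<author> Smith
<author> Jones
</aulist>
<title> First
</ref>

<ref>
<aulist>
<author> Brown
</aulist>
<title> Second
</ref>
END

# Return every <ref>...</ref> record whose <aulist> mentions $search.
sub find_refs {
    my ($text, $search) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "</ref>\n";    # one whole record per read (assumed terminator)
    my @hits;
    while (my $rec = <$fh>) {
        # Skip anything without an author list, then search only inside it.
        my ($aulist) = $rec =~ m{<aulist>(.*?)</aulist>}s or next;
        push @hits, $rec if index($aulist, $search) >= 0;
    }
    return @hits;
}

my @found = find_refs($sample, 'Jones');
print scalar(@found), " record(s) match\n";    # prints: 1 record(s) match
```

Because the search runs over the whole author list, a second or third author is found without any array bookkeeping.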
Re: Re: Re: Searching data file by Roger (Parson) on Nov 03, 2003 at 02:44 UTC
Yes, you are right: my code does not pick out multiple authors. I omitted them out of laziness. Fixing it is simple, though: read multiple authors the same way as multiple keys. `my @author = $rec =~ m/<author>\s*([^<]*)(?=<)/g; $col{author} = \@author;` The way the hash returned by the subroutine is stored also needs to be revised, since %records is keyed on a single author and there can now be several. That should be a simple exercise.
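One way that exercise might go is sketched below: keep all authors in an array ref and file the same record under each author's name. The names `process_record_multi` and `add_record` are hypothetical, and the input is assumed to be a chomped, concatenated buffer like the one built in the main loop.

```perl
use strict;
use warnings;

# Hypothetical variant of process_record that keeps every <author>.
sub process_record_multi {
    my $rec = shift;
    my %col;
    ($col{title}) = $rec =~ m/<title>\s*([^<]*)(?=<)/;
    my @authors   = $rec =~ m/<author>\s*([^<]*)(?=<)/g;
    s/\s+\z// for @authors;    # trim trailing whitespace, just in case
    $col{author}  = \@authors;
    return \%col;
}

# File the record under each of its authors; each hash value is a
# list of records, so one author can have several refs.
sub add_record {
    my ($index, $rec) = @_;
    push @{ $index->{$_} }, $rec for @{ $rec->{author} };
}

my %records;
my $rec = process_record_multi(
    '<ref><aulist><author> Smith<author> Jones</aulist><title> First</ref>'
);
add_record(\%records, $rec);

print $records{Jones}[0]{title}, "\n";    # prints: First
```

Both `$records{Smith}` and `$records{Jones}` point at the same record hash, so a lookup by either name finds it.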
Re: Re: Re: Re: Searching data file by Anonymous Monk on Nov 03, 2003 at 09:32 UTC
If you wanted to use the hash again in another program (i.e. save the data as a hash in a file for future use), how do you get the data 'loaded' in the new program so that you could use the statement below? `$myinfo = $records{'Sydney'}{'title'};` I looked on the web and everyone is using the 'eval' function, but there are no good examples.
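One option that avoids eval altogether is the core Storable module, which can write a nested structure to disk and read it back. The sketch below is only an illustration: the sample hash is a cut-down stand-in for %records, and the temporary file is an assumption (a real pair of programs would agree on a fixed path instead).

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Stand-in for the %records hash built by the parsing code.
my %records = (
    'Sydney' => {
        title    => 'Terrorism',
        year     => '1990',
        keywords => [ 'nothing', 'whatever' ],
    },
);

# Assumption: a temp file is used here only to keep the sketch self-contained.
my (undef, $file) = tempfile(UNLINK => 1);

# First program: save the whole structure.
store(\%records, $file);

# Second program: load it back as a hash reference.
my $loaded = retrieve($file);
my $myinfo = $loaded->{'Sydney'}{'title'};
print "$myinfo\n";    # prints: Terrorism
```

Note that `retrieve` returns a reference, so the lookup uses `$loaded->{...}`; copying it with `my %records = %{ retrieve($file) };` would allow the `$records{'Sydney'}{'title'}` form.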