lisaw has asked for the wisdom of the Perl Monks concerning the following question:

Dear PerlMonks, here's what I'm trying to accomplish: I have a Perl script that reads a flat-file db that simply lists a company name ($company), company URL ($url), and company directory ($directory). I want to open a second flat-file db that contains the company name and a subdirectory link, and list each link below the matching company. Like this:
Company 1
   -Company 1 sub link 1
   -Company 1 sub link 2
Company 2
   -Company 2 sub link 1
Here's my code:
{ &directories; }

sub directories {
    $database = "company_file.dat";
    open(FILE, $database) || die "can't open";
    @list = <FILE>;
    close(FILE);
    $numlist = @list;
    print "Content-type: text/html\n\n";
    print <<"EOF";
Company Listings:
EOF
    for ($a = 0; $a < $numlist; $a++) {
        ($company, $url, $directory) = split(/\|/, $list[$a]);
        print "$company, $url, $directory<BR>\n";
    }
    close(FILE);
}
1;
My subdirectory flat file db contains:
$company, $sub-url, $sub-directory, $sub-maindir
Any suggestions...or links to existing examples would be greatly appreciated... lis

Replies are listed 'Best First'.
Re: Flat File Question
by hmerrill (Friar) on Oct 24, 2003 at 18:58 UTC
    Two options I can think of:
    1. If the two files are relatively small, you can read them both into hashes, where the key is the company name (assuming the company name is what you want to match on) and the value is the whole pipe-delimited record. Loop through the 1st hash, like you did in the "for" above, and for each hash element, check for a matching company name in the other hash. This would be the fastest (performance) approach.
    2. Loop through the records in file 2 and create a hash where the key is the company name and the value is the whole pipe-delimited record. After that loop is done and the hash for file 2 has been created, loop through the file 1 records and, for each record, see if there is a matching company name in the file 2 hash.
    If the files are too big to hold in memory, the last resort might be an outer loop over the file 1 records and, for each of those records, an inner loop over all the records in file 2, looking for a company name match.
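
A minimal sketch of that last-resort nested loop, for illustration only. It assumes both files are pipe-delimited with the company name in the first field; the helper name `nested_loop_listing` and the sample data are made up. File 2 is re-read from the start for every file 1 record, so nothing is held in memory, at the cost of many passes over file 2.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: $main_src and $sub_src are file names, or references
# to in-memory strings (open() accepts both in its three-argument form).
sub nested_loop_listing {
    my ($main_src, $sub_src) = @_;
    my @out;

    open(my $main_fh, '<', $main_src) or die "Can't open main file: $!";
    while (my $main_line = <$main_fh>) {
        chomp $main_line;
        my ($company, $url, $directory) = split /\|/, $main_line;
        push @out, "$company, $url, $directory";

        # Inner loop: scan all of file 2 again for this one company.
        open(my $sub_fh, '<', $sub_src) or die "Can't open sub file: $!";
        while (my $sub_line = <$sub_fh>) {
            chomp $sub_line;
            my ($sub_company, $sub_url) = split /\|/, $sub_line;
            push @out, "   -$sub_url" if $sub_company eq $company;
        }
        close($sub_fh);
    }
    close($main_fh);
    return @out;
}

# Made-up sample data standing in for the two .dat files.
my $main_data = "Company 1|http://one.example|/one\n"
              . "Company 2|http://two.example|/two\n";
my $sub_data  = "Company 1|/one/a|a|/one\n"
              . "Company 1|/one/b|b|/one\n"
              . "Company 2|/two/a|a|/two\n";

print "$_\n" for nested_loop_listing(\$main_data, \$sub_data);
```

With real files, pass the two file names instead of the string references.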

    HTH.
      Hi HMerrill, Is there any way that you could provide an example of the first option? I'm still learning :) Thank you, lis

        I'm still learning too :-)

        Here's some code to

        1. loop through file 2 lines, and build a hash where the key is the company name and the value holds the fields of that record.
        2. loop through file 1 lines - for each file 1 line, check for a match in the file 2 hash.

        #!/usr/bin/perl -w
        use strict;

        ### 1. loop through file 2, building a hash where the
        ###    key   = company name
        ###    value = reference to a hash of that record's fields
        my $file2 = "/path/to/company_file2.dat";
        open(FILE2, "<$file2") || die "Can't open $file2: $!";
        my %file2_hash = ();
        while (<FILE2>) {
            chomp;    ### chomp returns a count, so don't assign its result
            my ($company, $url, $dir) = split(/\|/, $_);
            ### The key in %file2_hash is $company; the value is a
            ### *reference* to an anonymous hash (created by the curly
            ### braces) that has 2 keys - "url" and "dir".
            ### Note: a later line for the same company overwrites the
            ### earlier one, so this keeps one record per company.
            $file2_hash{$company} = { "url" => $url, "dir" => $dir };
        } ### end while (<FILE2>)
        close(FILE2);

        ### 2. loop through file 1, and for each line in file 1,
        ###    see if there's a matching key (company name) in
        ###    the file2 hash
        my $file1 = "/path/to/company_file.dat";
        open(FILE1, "<$file1") || die "Can't open $file1: $!";
        print "Content-type: text/html\n\n";
        print "Company Listings:<BR>";
        while (<FILE1>) {
            chomp;
            my ($file1_company, $file1_url, $file1_dir) = split(/\|/, $_);
            print "$file1_company, $file1_url, $file1_dir<BR>\n";
            if (exists($file2_hash{$file1_company})) {
                ### found a match ###
                print qq!$file1_company, $file2_hash{$file1_company}{"url"}, $file2_hash{$file1_company}{"dir"}<BR>!;
            }
        } ### end while (<FILE1>)
        close(FILE1);
        Careful, as this code is completely untested - but hopefully it would work without major changes.

        HTH.

Re: Flat File Question
by Art_XIV (Hermit) on Oct 24, 2003 at 19:06 UTC

    If you have to use flat files then you might want to load the second file in advance and place the company name and subdir into hashes, i.e.:
    $company{'company_A'} = '/home/company/etc';

    The company name will be the key and the subdirectory will be the value. If more than one subdir will be needed per company name, then load them into an anonymous array:

    $company{'company_A'}[0] = '/home/company/etc';
    $company{'company_A'}[1] = '/dev/null';

    You can then access these values as you traverse the values in your first file.
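
As a rough sketch of how such a hash of arrays might be filled and read back: the record layout ("company|subdirectory"), the sample values, and the helper `subdirs_for` are assumptions made for illustration, not part of the original files.

```perl
use strict;
use warnings;

# Made-up records in the assumed form "company|subdirectory".
my @records = (
    'company_A|/home/company/etc',
    'company_A|/dev/null',
    'company_B|/home/other',
);

# Build the hash of arrays: push autovivifies the anonymous array
# the first time a company name is seen.
my %company;
for my $record (@records) {
    my ($name, $subdir) = split /\|/, $record;
    push @{ $company{$name} }, $subdir;
}

# Hypothetical helper: returns the subdirectories for one company,
# or an empty list when the company has none.
sub subdirs_for {
    my ($name) = @_;
    return @{ $company{$name} || [] };
}

for my $name (sort keys %company) {
    print "$name\n";
    print "   -$_\n" for subdirs_for($name);
}
```

The `|| []` fallback keeps the lookup safe under `use strict` for a company that never appeared in the second file.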

Re: Flat File Question
by hardburn (Abbot) on Oct 24, 2003 at 18:26 UTC

    Is there any chance that you could move the flat-file to, say, DBD::SQLite? This is a problem that could be solved very naturally by a relational database.
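
For illustration, a small sketch of what the relational version could look like. It assumes the DBI and DBD::SQLite modules are installed; the table names, column names, and sample rows are invented for the example, and the database is in-memory rather than loaded from the flat files.

```perl
use strict;
use warnings;
use DBI;

# In-memory database; table and column names are invented for this sketch.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

$dbh->do('CREATE TABLE company  (name TEXT PRIMARY KEY, url TEXT, directory TEXT)');
$dbh->do('CREATE TABLE sub_link (company TEXT, sub_url TEXT)');

my $add_company = $dbh->prepare('INSERT INTO company VALUES (?, ?, ?)');
$add_company->execute(@$_) for (
    ['Company 1', 'http://one.example', '/one'],
    ['Company 2', 'http://two.example', '/two'],
);
my $add_link = $dbh->prepare('INSERT INTO sub_link VALUES (?, ?)');
$add_link->execute(@$_) for (
    ['Company 1', '/one/a'], ['Company 1', '/one/b'], ['Company 2', '/two/a'],
);

# LEFT JOIN keeps companies that have no sub links (sub_url comes back undef).
my $rows = $dbh->selectall_arrayref(q{
    SELECT c.name, s.sub_url
    FROM company c LEFT JOIN sub_link s ON s.company = c.name
    ORDER BY c.name, s.sub_url
});

# Print each company once, with its sub links indented below it.
my $last = '';
for my $row (@$rows) {
    my ($name, $sub_url) = @$row;
    print "$name\n" if $name ne $last;
    print "   -$sub_url\n" if defined $sub_url;
    $last = $name;
}
$dbh->disconnect;
```

The join does the matching that the hash lookups do in the flat-file versions.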

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    :(){ :|:&};:

    Note: All code is untested, unless otherwise stated

      Hi Schemer, unfortunately no
        If a SQL approach would help, you can leave the flatfile as is and handle it with DBD::CSV or DBD::AnyData, without having to change your files at all.
Re: Flat File Question
by graff (Chancellor) on Oct 25, 2003 at 15:32 UTC
    As suggested above, you want to use a hash-of-arrays (HoA) to store the "sub-*" records from the second flat file, keyed by company name. It would also be useful to store the strings from the first flat file in a simple hash (as you read from that file), again keyed by company name. So, something like this to read the two files:
    my %f1hash;
    my %f2hash;
    my $f1 = "company_file.dat";
    my $f2 = "other_file.dat";

    open( F, $f1 ) or die "can't open $f1: $!";
    while (<F>) {
        chomp;
        my ( $co, $url, $dir ) = split /\|/;
        $f1hash{$co} = "$co, $url, $dir<BR>\n";
    }
    close F;

    open( F, $f2 ) or die "can't open $f2: $!";
    while (<F>) {
        chomp;
        my ( $co, $url, $dir, $mdir ) = split /, */;
        push( @{$f2hash{$co}}, " -$co $url $dir<BR>\n" );
        warn "$f2 contains $co, not found in $f1\n" unless exists( $f1hash{$co} );
    }
    close F;
    Then when you're printing stuff out, do something like this:
    for (sort keys %f1hash) {
        print $f1hash{$_};
        print @{ $f2hash{$_} || [] };    # empty list when a company has no sub-links
    }