#!/usr/bin/perl  

use strict;
use warnings;

my %file1_data;
my %file2_data;
 
print "\n\n"; 

# Process data files
ProcessFile( "File1", \%file1_data );
ProcessFile( "File2", \%file2_data );

# Format output
my @keys = sort keys %file1_data;
foreach my $key ( @keys ) {
    
    my $target = exists $file2_data{$key} ?
                 $file2_data{$key}        :
                 '';

    print "$key  $file1_data{$key}  $target\n";

}


# Process a file and write its data into the supplied hash ref.
sub ProcessFile {
    my $filename = shift;
    my $data     = shift;  # Pass data by reference - big hashes used here.

    open( my $fh, '<', $filename )
        or die "Unable to open $filename - $!\n";

    # Store data in hash.
    # Only the last instance of any key is stored.
    while ( my $line = <$fh> ) {
        chomp $line;  # drop the newline so it does not end up in the value
        my ($key, $value) = split /\s+/, $line, 2;  # split into max of 2 fields.
        next unless defined $key && length $key;  # skip blank lines and lines with no key

        $data->{$key} = $value;

    }

    close( $fh )
        or die "Unable to close $filename - $!\n";

}

As others have said, we only open and read each file once. That is a huge savings.

I put the identical file processing code into a subroutine. I passed a hash reference to the subroutine to speed up data exchange--returning a hash would force a copy of each hash to be made as data is passed back from the function. If you prefer to avoid the non-obvious munging of an argument, the sub could look like:

my $file1_data = ProcessFile("File1");
my $file2_data = ProcessFile("File2");

# ...

sub ProcessFile {
    my $filename = shift;

    my %data;

    #...
    
    return \%data;
}

Both are about equally efficient, and which you use is a stylistic issue.
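
For completeness, here is a rough, self-contained sketch of the return-a-reference style with the elided body filled in. The temp file and its contents are made-up sample data so the snippet runs on its own; the lexical filehandle and the blank-line check are my assumptions, not requirements.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data, written to a temp file so the sketch runs standalone.
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
print {$tmp_fh} "apple 1\nbanana 2\napple 3\n";
close $tmp_fh or die "Unable to close $tmp_name - $!\n";

my $data = ProcessFile($tmp_name);
print "$_  $data->{$_}\n" for sort keys %$data;

# Read a file and return a reference to a hash of its key/value pairs.
sub ProcessFile {
    my $filename = shift;

    my %data;

    open( my $fh, '<', $filename )
        or die "Unable to open $filename - $!\n";

    while ( my $line = <$fh> ) {
        chomp $line;
        my ($key, $value) = split /\s+/, $line, 2;
        next unless defined $key && length $key;   # skip blank lines
        $data{$key} = $value;                      # last instance of a key wins
    }

    close($fh)
        or die "Unable to close $filename - $!\n";

    return \%data;   # a reference is returned; the hash itself is not copied
}
```

Because only a reference crosses the function boundary, the cost of the return does not grow with the size of the hash.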

Keep using strict and the 3-argument version of open. Use the warnings pragma rather than the '-w' switch.
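
A small sketch of why these two habits pay off, as I understand them. The hash contents and the odd filename are invented for illustration only.

```perl
use strict;
use warnings;

my %seen = ( apple => undef );   # hypothetical data with a missing value

{
    # 'use warnings' is lexically scoped, so one category can be silenced
    # for just this block. The -w switch applies to the whole program.
    no warnings 'uninitialized';
    print "quiet: apple=$seen{apple}\n";
}

# Three-argument open keeps the mode separate from the filename, so a name
# that begins with '>' is treated as a plain filename, not as a write mode.
my $filename = '>not_a_mode.txt';   # hypothetical awkward filename
if ( open( my $fh, '<', $filename ) ) {
    close $fh;
}
else {
    print "open failed as a read of '$filename': $!\n";
}
```

With the two-argument form, that same string would have been parsed as a request to open not_a_mode.txt for writing.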

Sorts are expensive, so do them as few times as possible.
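
To make the cost visible, here is a minimal sketch that counts comparisons. The key list and the number of passes are arbitrary values I picked for the demonstration.

```perl
use strict;
use warnings;

my @keys = map { "key$_" } reverse 1 .. 1000;   # hypothetical unsorted keys
my $comparisons = 0;

# Sort inside the loop: the full sort is repeated on every pass.
for my $pass ( 1 .. 10 ) {
    my @sorted = sort { $comparisons++; $a cmp $b } @keys;
}
my $in_loop = $comparisons;

# Sort once, outside the loop, and reuse the result.
$comparisons = 0;
my @sorted = sort { $comparisons++; $a cmp $b } @keys;
for my $pass ( 1 .. 10 ) {
    my $first = $sorted[0];   # stand-in for real work on the sorted list
}
my $once = $comparisons;

print "comparisons, sort in loop: $in_loop\n";
print "comparisons, sort once:    $once\n";
```

The exact counts depend on the sort implementation, but the in-loop version does the same work ten times over.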

You may want to look into how to measure the computational complexity of an algorithm and 'big O notation'. In practice, I have found that the need to do this type of analysis is limited, but learning it will give you an intuitive sense of what types of things are costly--which will improve your algorithms.
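
If you want to see the difference rather than reason about it, the core Benchmark module can compare two approaches. This is only a sketch: the data set, the key being searched for, and the choice of hash lookup versus linear scan are my own example, not something taken from your code.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @items  = map { "key$_" } 1 .. 10_000;   # hypothetical data set
my %lookup = map { $_ => 1 } @items;
my $wanted = 'key9999';

# Run each snippet for about one CPU second and compare the rates.
cmpthese( -1, {
    # O(n): every lookup walks the whole list.
    linear_scan => sub { my $found = grep { $_ eq $wanted } @items },
    # O(1) on average: a single hash probe.
    hash_lookup => sub { my $found = exists $lookup{$wanted} },
} );
```

The timings vary from machine to machine, so look at the ratio between the two rows rather than the absolute numbers.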

Keep up the good work!

TGI says moo

In reply to Re: Perl Code Runs Extremely Slow by TGI
in thread Perl Code Runs Extremely Slow by garbage777
