in reply to merging to databases... Should be easy...

This sounds like a good use for a hash of hashes. Note that this assumes every pair of keys is unique. If you can guarantee the keys will always be numbers, try a list of lists instead.

It'll take a long time, but do something like

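    my %hash;

    # Index the first file by its two key columns.
    while (my $line = <FIRST_FILE>) {
        my @data = split ' ', $line;
        $hash{$data[0]}{$data[1]} = $data[2];
    }

    # Print every row of the second file whose key pair was also in the first.
    # exists() is used so that false values such as 0 still count as a match.
    while (my $line = <SECOND_FILE>) {
        my @data = split ' ', $line;
        if (exists $hash{$data[0]} && exists $hash{$data[0]}{$data[1]}) {
            print OUT_FILE "$data[0] $data[1] $hash{$data[0]}{$data[1]} $data[2]\n";
        }
    }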
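If the keys really are guaranteed to be small non-negative integers, the list-of-lists variant mentioned above might look roughly like this. It is only a sketch: index_rows and merge_rows are made-up helper names, and the rows are passed in as array refs instead of being read from the filehandles, just to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: index rows of [key1, key2, value] in an array of arrays.
# Assumes both keys are small non-negative integers.
sub index_rows {
    my @lol;
    for my $row (@_) {
        my ($k1, $k2, $value) = @$row;
        $lol[$k1][$k2] = $value;
    }
    return \@lol;
}

# Hypothetical helper: for each row of the second set whose key pair is in
# the index, return [key1, key2, value from first set, value from second set].
sub merge_rows {
    my ($lol, @rows) = @_;
    my @merged;
    for my $row (@rows) {
        my ($k1, $k2, $value) = @$row;
        # Check each level separately so the lookup does not autovivify.
        next unless defined $lol->[$k1] && defined $lol->[$k1][$k2];
        push @merged, [$k1, $k2, $lol->[$k1][$k2], $value];
    }
    return @merged;
}

my $index  = index_rows([1, 2, 'a'], [3, 4, 'b']);
my @merged = merge_rows($index, [3, 4, 'x'], [5, 6, 'y']);
print "@$_\n" for @merged;    # prints "3 4 b x"
```

Keep in mind that an array indexed by large keys allocates slots for every index below the largest one, so this only pays off when the key range is dense.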

------
We are the carpenters and bricklayers of the Information Age.

Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

Re: Re: merging to databases... Should be easy...
by runrig (Abbot) on Oct 03, 2001 at 20:28 UTC
    With 500MB files, I'd consider MLDBM if I were to go this route. What I'd probably do (if it's an option for the OP) is stick with Unix utilities (I'm assuming the fields are space delimited, and you don't mind sorted results):
    sed 's/ /|/' file1 | sort >tmp1
    sed 's/ /|/' file2 | sort >tmp2
    join tmp1 tmp2 | sed 's/|/ /' >result
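For anyone wondering why the inputs get sorted first: join walks both files in step, which only works on sorted input. A rough Perl sketch of that same merge-join idea is below. merge_join is a made-up name, and it assumes both lists are already sorted in string order by the "key1 key2" pair and that each pair shows up at most once per list.

```perl
use strict;
use warnings;

# Hypothetical helper: merge-join two lists of "key1 key2 value" lines.
# Assumes both lists are sorted in string order by "key1 key2" and that
# each key pair appears at most once per list.
sub merge_join {
    my ($left, $right) = @_;
    my ($i, $j) = (0, 0);
    my @out;
    while ($i < @$left && $j < @$right) {
        my ($l1, $l2, $lv) = split ' ', $left->[$i];
        my ($r1, $r2, $rv) = split ' ', $right->[$j];
        my $cmp = "$l1 $l2" cmp "$r1 $r2";
        if    ($cmp < 0) { $i++ }    # left key is smaller: advance left
        elsif ($cmp > 0) { $j++ }    # right key is smaller: advance right
        else {
            # Same key pair on both sides: emit the combined row.
            push @out, "$l1 $l2 $lv $rv";
            $i++;
            $j++;
        }
    }
    return @out;
}

my @first  = ("1 2 a", "3 4 b", "7 8 c");
my @second = ("3 4 x", "5 6 y", "7 8 z");
print "$_\n" for merge_join(\@first, \@second);
# prints "3 4 b x" and then "7 8 c z"
```

Only one line from each input is looked at per step, which is why the sort-then-join route copes with big files where an in-memory hash might not.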