alexiskb has asked for the wisdom of the Perl Monks concerning the following question:

Sorry for the newbie question...
I have 2 files with a common field that I want to merge.
I thought I could just put all the rows from both tables into hashes and then do a hash lookup...
but I seem to be getting memory problems: the process works fine, then just stalls after 1000 records...
May I ask for pointers on what I should be doing, i.e. should I use arrays? hashes? I don't want to use modules.
Thank you!
Here is what I've got:

TGR|10 GROUP|www.10group.co.uk#http://www.10group.co.uk#|0121 333 5464 +|johnj beck|info@10group.co.uk|
SVG|7 GROUP|www.7group.com|0121 233 1122|tim rice|tim@7.com|
etc... for 4000 records, joined on field[0] to:
TGR|10 GROUP|10 GROUP PLC|GB|54|0.40|0.045|200000|GBX|
SVG|7 GROUP|7 GROUP PLC ORD|GB|63|1.00|0.35|0.550|5000|GBX|

etc... for 4000 records
and here is my poor excuse for baby perl:
### do original data
open(COMPANIES, "+< ./data1") or die "can't open file: $!";
$co = 0;
while (<COMPANIES>) {
    $contents[$co] = $_;
    $co++;
}
foreach $record (@contents) {
    @fields = split(/\|/,$record);
    $tidm=$fields[0];
    etc...
    $lse{$tidm}++;
}
close(COMPANIES);

### do data to be merged
open(INFO, "+< ./info.csv") or die "can't open file: $!";
$co2 = 0;
while (<INFO>) {
    $contents2[$co2] = $_;
    $co2++;
}
foreach $record2 (@contents2) {
    @fields2 = split(/\|/,$record2);
    $tidm_inf=$fields2[0];
    etc...
    $merge{$tidm_inf}="$record2";
}
close(INFO);

### print merged
foreach my $k (sort keys %lse) {
    print "$merge{$k}\n";
}

Replies are listed 'Best First'.
Re: how to merge similar data
by katgirl (Hermit) on Sep 19, 2002 at 10:09 UTC
    Here we go: (you may find a better way later)
    #!/usr/bin/perl

    ### do original data
    open(COMPANIES, "+< ./data1") or die "can't open file: $!";
    @contents = <COMPANIES>;
    close(COMPANIES);
    chomp(@contents);

    open(INFO, "+< ./info.csv") or die "can't open file: $!";
    @contents2 = <INFO>;
    close(INFO);
    chomp(@contents2);

    ### do data to be merged
    $merges = 0;
    foreach $record (@contents) {
        @fields = split(/\|/,$record);
        $tidm=$fields[0];
        foreach $record2 (@contents2) {
            @fields2 = split(/\|/,$record2);
            $tidm_inf=$fields2[0];
            if ($tidm eq $tidm_inf){
                # keep every field of the first record, then fields 2..8 of the second
                $merged[$merges] = join("|",(@fields, @fields2[2..8]));
                $merges++;
            }
        }
    }

    ### print merged
    open(FILE,">outfile.dat") or die "can't open file: $!";
    foreach $merge(@merged){
        # write each merged record to outfile.dat
        print FILE "$merge\n";
    }
    close(FILE);
      Perfect, thanks katgirl, that works perfectly!
      I shall study your code well. Thanks very much indeed.
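The nested loops above compare every record of one file with every record of the other, which is fine for 4000 rows but grows quickly with larger files. A hash lookup, as the original question had in mind, avoids the inner loop. The following is only a sketch under a few assumptions: `merge_on_key` is a hypothetical helper name, the sample rows are copied from the question rather than read from the files, and every field after the second is appended instead of the fixed `2..8` slice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows copied from the question; real code would read them from the files.
my @companies = (
    'TGR|10 GROUP|www.10group.co.uk#http://www.10group.co.uk#|0121 333 5464 +|johnj beck|info@10group.co.uk|',
    'SVG|7 GROUP|www.7group.com|0121 233 1122|tim rice|tim@7.com|',
);
my @info = (
    'TGR|10 GROUP|10 GROUP PLC|GB|54|0.40|0.045|200000|GBX|',
    'SVG|7 GROUP|7 GROUP PLC ORD|GB|63|1.00|0.35|0.550|5000|GBX|',
);

# merge_on_key is a hypothetical name, not something from the thread.
# It takes two array references of pipe-delimited records.
sub merge_on_key {
    my ($left, $right) = @_;

    # Index the second data set by its first field.
    my %by_key;
    foreach my $record (@$right) {
        my @fields = split /\|/, $record;
        next unless @fields;                 # ignore empty lines
        $by_key{ $fields[0] } = \@fields;
    }

    # Walk the first data set once and look each key up in the index.
    my @merged;
    foreach my $record (@$left) {
        my @fields = split /\|/, $record;
        next unless @fields;
        my $match = $by_key{ $fields[0] } or next;   # skip keys with no partner
        # Append everything after the key and name columns of the match.
        push @merged, join '|', @fields, @{$match}[ 2 .. $#{$match} ];
    }
    return @merged;
}

print "$_\n" for merge_on_key(\@companies, \@info);
```

Each file is read once, so the work is roughly proportional to the number of records rather than to the product of the two file sizes. If the same key appears twice in the second data set, this sketch keeps only the last occurrence.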
Re: how to merge similar data
by kabel (Chaplain) on Sep 19, 2002 at 09:30 UTC
    Some suggestions:
    If the two files hold similar data, write a function that takes two arguments: a hash reference in which to store the data, and the name of the file the data comes from.
    Does the second field belong to the company name, or is it a standalone field in both files?
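The function suggested in this reply might look something like the sketch below. The name `load_records` is hypothetical, and it assumes the first field is the key and that storing the split fields (rather than the raw line) is what is wanted. The demo at the bottom only runs if `./data1` and `./info.csv` from the question exist in the current directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# load_records is a hypothetical name. It reads pipe-delimited lines from
# $filename and stores each line's fields in the hash that $store refers to,
# keyed on the first field. It returns the number of keys now in the hash.
sub load_records {
    my ($store, $filename) = @_;
    open my $fh, '<', $filename or die "can't open $filename: $!";
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;            # ignore blank lines
        my @fields = split /\|/, $line;
        $store->{ $fields[0] } = \@fields;
    }
    close $fh;
    return scalar keys %$store;
}

# Demo: load both files with the same function, then join on the shared key.
my (%companies, %info);
if (-e './data1' and -e './info.csv') {
    load_records(\%companies, './data1');
    load_records(\%info,      './info.csv');
    foreach my $key (sort keys %companies) {
        next unless exists $info{$key};
        my @extra = @{ $info{$key} };
        print join('|', @{ $companies{$key} }, @extra[ 2 .. $#extra ]), "\n";
    }
}
```

Passing the hash by reference lets one function fill either hash, so the two near-identical read loops in the original code collapse into two calls.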