in reply to Re^2: file merge problem
in thread file merge problem
You might want to look again at Samy_rio's solution above -- with a little tweaking, you could address the duplicate issue. I don't think that you will have a problem with orphaned records using the proposed file concatenation.
If you open the first file, step through it and create a lookup hash, then close and re-open the first file for appending and the second file for reading, you can simply check each line of the second file to see if it is already in your lookup hash before appending it to the first file.
You don't tell us what impact the additional words have on the logic of your program -- if they are just extra fields that you want to carry forward, then you can adjust your existing regular expression from (\d*) to (\d+.*) so that it captures everything to the end of the line, something like this:
    if (/^(\w*) (\d+.*)$/) {
        # append only if this key/value pair is not already in the first file
        unless (exists($lookup{$1}) && ($lookup{$1} eq $2)) {
            print F1 $_;    # no comma after the filehandle
        }
    } else {
        warn "Invalid input line $.: $_\n";
    }
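For what it's worth, here is a rough sketch of the whole two-pass approach in one place: build the lookup hash from the first file, then re-open it for appending and check each line of the second file against the hash. The sub name merge_files and the lexical filehandles are my own choices, not something from your code, and I am assuming each line looks like "word number [extra fields]" and that the first file ends with a newline. Note that the hash keeps only one value per key, so if the first file repeats a key, the last value wins.

```perl
use strict;
use warnings;

# Hypothetical helper: append to $first every line of $second whose
# key/value pair is not already recorded in the lookup hash.
# Returns the number of lines appended.
sub merge_files {
    my ($first, $second) = @_;
    my %lookup;

    # Pass 1: step through the first file and build the lookup hash.
    open(my $in, '<', $first) or die "Cannot read $first: $!";
    while (my $line = <$in>) {
        chomp $line;
        $lookup{$1} = $2 if $line =~ /^(\w*) (\d+.*)$/;
    }
    close($in);

    # Pass 2: re-open the first file for appending, the second for reading.
    open(my $out, '>>', $first) or die "Cannot append to $first: $!";
    open(my $src, '<', $second) or die "Cannot read $second: $!";
    my $added = 0;
    while (my $line = <$src>) {
        chomp $line;
        if ($line =~ /^(\w*) (\d+.*)$/) {
            unless (exists $lookup{$1} && $lookup{$1} eq $2) {
                print {$out} "$line\n";
                $lookup{$1} = $2;    # also skips repeats within the second file
                $added++;
            }
        } else {
            warn "Invalid input line $.: $line\n";
        }
    }
    close($src);
    close($out) or die "Cannot close $first: $!";
    return $added;
}
```

You would call it as my $count = merge_files('file1.txt', 'file2.txt'); (file names made up). Recording each appended line in the hash is an addition of mine so that duplicates inside the second file are not appended twice; drop that line if you do not want it.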
If the additional words in the line have some impact on your program's logic (e.g., if you need to parse them off and possibly create new records in your resulting file) then you need to tell us.
Hope that helps. :)
Update: added 'exists' logic to check in lookup hash.
Re^4: file merge problem
by Anonymous Monk on Dec 09, 2005 at 14:56 UTC
by ptum (Priest) on Dec 09, 2005 at 17:24 UTC