(Stepping back for a moment: you might check whether the standard *NIX join command would do the job, since this kind of merge is right up its alley. But if you need to do more involved processing later, you probably want to stick with perl . . .)
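Something along these lines might do (file names are hypothetical; note that join requires both inputs sorted on the join field, and -a 1 keeps main-file lines that have no match):

$ sort -k1,1 main.txt   > main.sorted
$ sort -k1,1 second.txt > second.sorted
$ join -t "$(printf '\t')" -a 1 -1 1 -2 1 main.sorted second.sorted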
If your "second" file is small enough you could probably just read over and parse it first and build a hash (%value_for_id) to map from your "id" to the corresponding value (presuming that's a 1-to-1 mapping; more on that later). Then you'd just read through the "main" file and append/insert the value column by retrieving it from that hash. If it's really really big then you could use a database backed hash (DBM_File, GDBM_File) and let it be stored to a file rather than having to keep everything in RAM.
If your mapping isn't 1-to-1 (so for id "AVP78042,AVP78031" there were, say, three values: 0.29731, 0.8675309, 0.2112) you need to figure out what that means. One possible solution would be to output three copies of the line, one with each value. In that case you'd do something similar, but build a Hash Of Arrayrefs (similar to what you're doing in your sample) rather than just a flat hash. Then while processing the main file you'd iterate over the matching values with something like:
## presume @row holds the current (main) row and we're adding value as last column
## replace with (e.g.) splice or whatever as appropriate
for my $value ( @{ $values_for_id{ $id } // [ "-" ] } ) {
    say join( qq{\t}, @row, $value );
}
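Building %values_for_id for that case just means pushing onto an arrayref instead of assigning a scalar (again assuming a tab-separated second file, with the hypothetical name lookup.txt):

my %values_for_id;
open( my $lookup_fh, '<', 'lookup.txt' ) or die "lookup.txt: $!";
while ( my $line = <$lookup_fh> ) {
    chomp $line;
    my ( $id, $value ) = split /\t/, $line;
    # push autovivifies the arrayref the first time an id is seen
    push @{ $values_for_id{$id} }, $value;
}
close $lookup_fh;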
Also, you may want to look at Text::CSV_XS for parsing your files; it handles quoting, embedded separators, and the like for you.
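A quick sketch of what that might look like (tab-separated input again, reusing a %value_for_id hash built as in the first sketch):

use strict;
use warnings;
use Text::CSV_XS;

my %value_for_id;    # populated from the second file as shown earlier

my $csv = Text::CSV_XS->new( { binary => 1, auto_diag => 1, sep_char => "\t" } );

open( my $fh, '<', 'main.txt' ) or die "main.txt: $!";
while ( my $row = $csv->getline($fh) ) {
    my $id = $row->[0];
    # say() writes the row back out with proper quoting/escaping
    $csv->say( \*STDOUT, [ @{$row}, $value_for_id{$id} // q{-} ] );
}
close $fh;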
Edit: Derp, I completely missed where you gave the numbers of lines; I thought you were talking about much bigger files. The read-the-first-file-into-a-hash approach should be more than fine and shouldn't stress the available memory on any modern box.
The cake is a lie.