In reply to "load a file in memory and extract parts"

In general, when you want to use keys from one file to look up values in another, you load the keys from the first file into a hash, then loop through the second file, checking whether each line's key exists in the hash and doing something with the line if it does. Unless there's a reason to do otherwise, it's usually best to load the smaller file (in this case your 5K one) into the hash and loop through the larger one, so the hash stays small. In pseudo-code:

    open 5K file
    foreach line
        get key from line and put it in hash as key=1
    close 5K file

    open 100M file
    foreach line
        get key from line
        if key is in hash from other file
            do stuff with the line
    close 100M file
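
And here is a bare-bones Perl rendering of that pseudo-code. The file names and the assumption that the key is the first tab-separated field on each line are placeholders for illustration; adapt the split to your real data:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Load the keys from the small file into a hash.
    my %keys;
    open my $small, '<', 'small.txt' or die "small.txt: $!";   # hypothetical name
    while (my $line = <$small>) {
        chomp $line;
        my ($key) = split /\t/, $line;   # assumes key is the first tab-separated field
        next unless defined $key;        # skip blank lines
        $keys{$key} = 1;
    }
    close $small;

    # Stream the large file and act on each line whose key we loaded.
    open my $big, '<', 'big.txt' or die "big.txt: $!";         # hypothetical name
    while (my $line = <$big>) {
        my ($key) = split /\t/, $line;
        next unless defined $key;
        if (exists $keys{$key}) {
            print $line;                 # "do stuff with the line": here, just print it
        }
    }
    close $big;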

Once you have some code which attempts to do that, show it to us along with a few lines of sample input and output data, and we can guide you further if you need it.

Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.

Re^2: load a file in memory and extract parts
by afoken (Chancellor) on May 06, 2015 at 06:10 UTC

    Tux (the module's maintainer) seems to be offline, so I'll link to Text::CSV_XS myself:

    • Text::CSV_XS takes care of reading and writing CSV files. Unlike most "five lines of perl" attempts, it handles most, if not all, nasty edge cases. (A short usage sketch follows this list.)
    • DBD::CSV sits on top of Text::CSV_XS and allows SQL access to CSV files. It may be slower than the SQLite approach proposed by sundialsvc4, but it avoids converting the CSV to SQLite first. (Also sketched below.)
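
    Here is a minimal Text::CSV_XS sketch of the key-lookup task from the parent post. The file names and the assumption that the key is the first CSV column are placeholders, not something from the OP:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Text::CSV_XS;

        my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });

        # Load the key column (assumed to be column 0) of the small file.
        my %keys;
        open my $small, '<', 'small.csv' or die "small.csv: $!";
        while (my $row = $csv->getline($small)) {
            $keys{ $row->[0] } = 1;
        }
        close $small;

        # Stream the large file, re-emitting rows whose key was loaded above.
        open my $big, '<', 'big.csv' or die "big.csv: $!";
        while (my $row = $csv->getline($big)) {
            if (exists $keys{ $row->[0] }) {
                $csv->print(*STDOUT, $row);   # write the matching row back out as CSV
                print "\n";
            }
        }
        close $big;

    And a DBD::CSV sketch of the same kind of lookup through SQL. The table name big (i.e. the file ./big.csv) and the column name id are hypothetical; the column names must match the header line of your file:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use DBI;

        # Each *.csv file in f_dir becomes a table named after the file;
        # column names are taken from the file's first line.
        my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
            f_dir      => '.',
            f_ext      => '.csv/r',
            RaiseError => 1,
        });

        # 'big' and 'id' are placeholders for your real table and column.
        my $sth = $dbh->prepare('SELECT * FROM big WHERE id = ?');
        $sth->execute('some_key');            # 'some_key' is a placeholder value
        while (my $row = $sth->fetchrow_arrayref) {
            print "@$row\n";
        }
        $dbh->disconnect;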

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)