In reply to Suggestions re parsing one large file with elements of another large file

While the files seem pretty big if you go by line count, you can take some comfort in the arithmetic: even at 80 characters per line plus 50% overhead, that ten-thousand-line file will only consume about 1.2MB of RAM if you slurp the whole thing into memory. Add the 6,000-line file at the same 80 characters and 50% overhead (another 720KB), and you're consuming a whopping 1.92MB of RAM if you slurp both. If you need the ultimate in performance, and changing the data sources to a database isn't possible, go ahead and slurp, but do so knowing that you'll eventually run out of memory if those files grow too large.
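For what it's worth, here's a minimal sketch of the slurp-and-hash approach. The filenames and record layout are assumptions for illustration: it treats the 6,000-line file as a list of keys, one per line, and the big file as tab-delimited records keyed on the first field.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Build a lookup hash from the smaller file. 'keys.txt' and
    # 'data.txt' are placeholder names; adjust the split to match
    # your actual record layout.
    my %wanted;
    open my $keys_fh, '<', 'keys.txt' or die "Can't open keys.txt: $!";
    while ( my $key = <$keys_fh> ) {
        chomp $key;
        $wanted{$key} = 1;
    }
    close $keys_fh;

    # Slurp the big file into an array (~1.2MB by the estimate above),
    # then print the records whose first field matches a key.
    my @lines = do {
        open my $data_fh, '<', 'data.txt' or die "Can't open data.txt: $!";
        <$data_fh>;
    };
    for my $line (@lines) {
        my ($field) = split /\t/, $line;
        print $line if defined $field and exists $wanted{$field};
    }

The hash lookup is the important part: it makes the matching O(1) per record instead of rescanning the second file for every line of the first.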

If, on the other hand, the files may end up growing substantially larger, Tie::File is a good way to go. The POD for Tie::File states outright that no slurping takes place (the actual text says, "The file is not loaded into memory, so this will work even for gigantic files").
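The usage is pleasantly simple. Here's a minimal sketch ('data.txt' again being a placeholder name); the tied array acts as a window onto the file, with only a small cache held in memory:

    use strict;
    use warnings;
    use Tie::File;

    # Tie the file to an array; lines are fetched on demand rather
    # than read into memory all at once.
    tie my @lines, 'Tie::File', 'data.txt'
        or die "Can't tie data.txt: $!";

    for my $line (@lines) {
        # process $line here; memory use stays flat even if the
        # file grows to gigabytes
    }

    untie @lines;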

Another possible solution is to change the design of the data sources: if the two files were converted to a couple of database tables, scalability would no longer be a concern.
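As a hedged illustration only (the table names 'big' and 'small', the shared 'key' column, and the SQLite backend are all assumptions), the matching then collapses into a single join, and an index on the key column keeps it fast no matter how large the data grows:

    use strict;
    use warnings;
    use DBI;

    # Assumes the two files have already been loaded into SQLite
    # tables 'big' and 'small', each with a 'key' column.
    my $dbh = DBI->connect( 'dbi:SQLite:dbname=records.db', '', '',
        { RaiseError => 1 } );

    # Let the database do the matching instead of scanning files.
    my $sth = $dbh->prepare(
        'SELECT big.* FROM big JOIN small ON big.key = small.key'
    );
    $sth->execute;
    while ( my $row = $sth->fetchrow_hashref ) {
        print "$row->{key}\n";
    }

    $dbh->disconnect;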


Dave
