in reply to Re^2: Perl Code Runs Extremely Slow
in thread Perl Code Runs Extremely Slow

Well, doesn't everyone have an S390?

Assuming that all his keys are unique, you are right: the OP will need some sort of disk-based storage--he probably doesn't have enough RAM. Perhaps an SQL database (SQLite, anyone?) or a dbm file would do the trick.

Any reason why tying each hash to a DB_File would be a bad idea? The largest datasets I've had to deal with have only been a few tens of megabytes, so I don't know: is DB_File up to the task?
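To make the question concrete, here is roughly what I have in mind -- a minimal sketch of tying a hash to a DB_File on disk (the file and hash names are made up for illustration):

    use strict;
    use warnings;
    use DB_File;

    # Entries live in the Berkeley DB file on disk, not in RAM.
    my %lookup;
    tie %lookup, 'DB_File', 'lookup.db', O_RDWR|O_CREAT, 0666, $DB_HASH
        or die "Cannot open 'lookup.db': $!";

    $lookup{'some key'} = 'some value';    # written through to disk
    print "found it\n" if exists $lookup{'some key'};

    untie %lookup;

Each store and fetch goes through the DB file rather than an in-memory hash, so RAM use stays flat at the cost of disk I/O per lookup.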

Would you care to offer a suggestion?

Edit: Upon reviewing the OP's code, it looks like he is keeping both hashes in RAM at the same time (at least the worst-case memory consumption of my code and the OP's is the same). If his code runs to completion as posted, so will mine. I also noted your suggestion to try DB_File, and so struck my request for suggestions.


TGI says moo

Re^4: Perl Code Runs Extremely Slow
by samtregar (Abbot) on Jun 15, 2006 at 20:57 UTC
    Review again - you are incorrect. The OP loads all of file 2 into memory (each time he reads a line from file 1!) but never loads all of file 1. He makes it look like he does by putting lines from file 1 in a hash, but that hash is local to the while block and thus never contains more than one line. Since file 2 is much smaller than file 1, I think it's quite reasonable to assume that while he might fit file 2 in memory+swap, that's unlikely to work with file 1.
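    To illustrate with a made-up sketch (not the OP's actual code) -- a hash declared with my inside the loop body comes into existence empty on every iteration:

        use strict;
        use warnings;

        # 'file1.txt' is a stand-in name for the OP's first input file.
        open my $fh1, '<', 'file1.txt' or die "Cannot open file1.txt: $!";

        while ( my $line = <$fh1> ) {
            my %file1_lines;              # fresh, empty hash each pass
            $file1_lines{$line} = 1;      # only ever holds the current line
            # ... lookups against the hash built from file 2 go here ...
        }
        close $fh1;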

    DB_File might work well enough, but it's hard to know. A lot depends on his definition of "fast enough" and the actual composition of his data.

    -sam

      Double d'oh! I guess I need to go reread Coping with Scoping.

      I think it is almost certain that this code does not return the information the OP thinks it does.


      TGI says moo