in thread Moving from hashing to tie-ing.
First, create a preprocessing script that parses the huge source file and the supporting data file one time. Its job is to index the position (byte offset) of each ID in each file. Store this information in a database (DBD::SQLite or some such) or in a serialized data structure (Storable or some such). What this buys you is the ability, given an ID, to open the two files and quickly read in just the record associated with that ID. No searching is required, and no unrelated IDs need to be parsed.
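The indexing pass might look something like the sketch below. It is written in Python purely for illustration; the same idea maps onto Perl's `tell` with DBD::SQLite or Storable. Since no file format was given, it assumes one record per line with the ID as the first tab-separated field, and the names `build_index`, `idx`, `id`, and `pos` are made up for the example.

```python
import sqlite3

def build_index(data_path, db_path, sep="\t"):
    """Scan data_path once and store each ID's byte offset in SQLite.

    Assumes (hypothetically) one record per line, ID in the first field.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS idx (id TEXT, pos INTEGER)")
    conn.execute("DELETE FROM idx")  # rebuild from scratch on every run
    # Binary mode so that tell() reports true byte offsets.
    with open(data_path, "rb") as fh:
        while True:
            pos = fh.tell()          # offset of the line about to be read
            line = fh.readline()
            if not line:             # end of file
                break
            record_id = line.split(sep.encode(), 1)[0].strip().decode()
            if not record_id:        # skip blank lines
                continue
            conn.execute("INSERT INTO idx VALUES (?, ?)", (record_id, pos))
    # Index the id column so later lookups do not scan the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_id ON idx (id)")
    conn.commit()
    conn.close()
```

The table deliberately allows the same ID on several rows, in case an ID owns more than one record. Run it once per file (source and supporting data), each with its own index database.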
Second, make a minor modification to the current script so that it uses the preprocessed index to pull in just the record(s) associated with a given ID. Now you can build as complex a data structure as makes sense, and you need not constantly re-split.
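A lookup against such an index could be sketched as follows, again in Python for illustration only (in Perl this would be `seek` plus a DBI query or a Storable hash). It assumes an SQLite table named `idx` with columns `id` and `pos` holding the byte offset of each record, and one record per line; those names and the `fetch_records` helper are hypothetical.

```python
import sqlite3

def fetch_records(data_path, db_path, record_id):
    """Return the raw lines for record_id by seeking to stored offsets.

    Assumes a table idx(id, pos) where pos is the byte offset of a line.
    """
    conn = sqlite3.connect(db_path)
    offsets = [row[0] for row in conn.execute(
        "SELECT pos FROM idx WHERE id = ? ORDER BY pos", (record_id,))]
    conn.close()

    records = []
    # Binary mode so seek() takes the same byte offsets the index stored.
    with open(data_path, "rb") as fh:
        for pos in offsets:
            fh.seek(pos)                     # jump straight to the record
            records.append(fh.readline().decode().rstrip("\n"))
    return records
```

Only the lines belonging to the requested ID are ever read, so each one can be split once and kept in whatever structure suits the rest of the script.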
Ultimately, this is not what I would like to suggest, but given the lack of details, it is the best I can offer.
Cheers - L~R