in reply to sql and hash confusion
It always makes me uncomfortable to see “read the entire file into a ...” because basically what you are doing is using virtual memory as a(nother) disk file. If you've already got the data as an existing flat file, it seems to me that you ought to be reading it and processing it “a record at a time.”
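To make “a record at a time” concrete, here is a minimal sketch. I'm assuming a tab-separated flat file with two fields per record (a key and an integer amount); that layout and the names `read_records` and `total_amount` are made up for illustration and are not taken from your example.

```python
def read_records(lines, sep="\t"):
    """Yield one parsed record at a time from any iterable of lines.

    Assumed layout (hypothetical): key<TAB>amount, one record per line.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line:  # skip blank lines
            yield line.split(sep)


def total_amount(lines):
    """Sum the amount field while holding only one record in memory."""
    total = 0
    for _key, amount in read_records(lines):
        total += int(amount)
    return total
```

An open file handle is itself an iterable of lines, so something like `with open("data.txt") as fh: total_amount(fh)` (file name hypothetical) never holds more than the current record, however large the file is.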
Also, don't overlook any possibilities to take advantage of (disk-based) sorting. It's an uncommonly fast and efficient operation, and when you're dealing with a sorted file you know that all of the records for any particular sort-key value must be adjacent ... or they're not there at all. I have not looked closely at your example so I really don't know if this applies to you. Anyhow, “it was good enough for COBOL and punched cards and magnetic tapes,” and those advantages still apply today.
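Here is a rough sketch of the “adjacent records” idea, assuming the file has already been sorted on its first tab-separated field (for instance with an external sort utility). The field layout and the names `key_of` and `groups_from_sorted` are hypothetical.

```python
from itertools import groupby


def key_of(line, sep="\t"):
    """Return the sort key, assumed here to be the first field."""
    return line.split(sep, 1)[0]


def groups_from_sorted(lines, sep="\t"):
    """Yield (key, records) for each run of adjacent records sharing a key.

    Only correct if the input is already sorted on the key: groupby
    merges adjacent equal keys and does no sorting of its own.
    """
    records = (ln.rstrip("\n") for ln in lines if ln.strip())
    for key, group in groupby(records, key=lambda ln: key_of(ln, sep)):
        # Only the records for the current key are held in memory.
        yield key, list(group)
```

Because the input is sorted, once the key changes you are finished with the previous key for good, so no hash of the whole file is needed.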