in reply to Re: Efficiently parsing a large file
in thread Efficiently parsing a large file

I agree with Neil: Very clever

I am a bit nervous about loading it all into a plain in-memory hash, given how much memory that could take (hundreds of megs for a file that size). I recommend tying the hash to a file (gdbm or similar) so you can still access the data without eating up all the memory on the box.
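A minimal sketch of what I mean, not a drop-in solution. It uses SDBM_File only because it ships with core Perl (note that it limits each key/value pair to roughly 1K); GDBM_File and DB_File are tied in much the same way. The file location and the record names are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # core module; other DBM modules tie similarly
use File::Temp qw(tempdir);

# Hypothetical location for the on-disk hash (an assumption for this sketch).
my $dir    = tempdir(CLEANUP => 1);
my $dbfile = "$dir/state";

# Tie %state to the DBM file so entries are stored on disk rather than in RAM.
tie(my %state, 'SDBM_File', $dbfile, O_RDWR | O_CREAT, 0666)
    or die "Cannot tie $dbfile: $!";

# From here on it behaves like an ordinary hash.
$state{"record$_"} = "value$_" for 1 .. 1000;

my $count  = scalar keys %state;   # walks the on-disk keys
my $sample = $state{record42};

untie %state;                      # flush and close the DBM file
```

The trade-off is speed: every lookup and store goes through the DBM library instead of straight to memory, so this is slower than a plain hash but keeps the footprint small.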

Hope this helps.

Jason L. Froebe

No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neill, Stargate SG-1

Replies are listed 'Best First'.
Re: Re: Re: Efficiently parsing a large file
by Art_XIV (Hermit) on Apr 08, 2004 at 21:14 UTC

    I think the decision to tie the hash or not depends less on the size of the file being read than on what percentage of that file has meaningful entries.

    If the majority of the entries in the file being read are just junk that can be ignored, then the state-hash can probably be kept in memory without tying.

    If the data source is very rich, though, then it would be wise to tie the state-hash to another file and manage the entries.
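    A rough sketch of that idea, under assumptions of my own: the "key=value" record format, the load_state helper, and the state_db file name are all hypothetical. The junk is filtered out before anything is stored, and the loader only sees a hash reference, so the same code works whether the hash is plain or tied.

```perl
use strict;
use warnings;
use Fcntl;                      # only needed if the tie below is enabled
use SDBM_File;                  # likewise

# Hypothetical record format: "key=value"; anything else counts as junk.
sub load_state {
    my ($fh, $state) = @_;
    my ($seen, $kept) = (0, 0);
    while (my $line = <$fh>) {
        $seen++;
        next unless $line =~ /^(\w+)=(.*)$/;   # skip junk lines
        $state->{$1} = $2;                      # keep meaningful entries only
        $kept++;
    }
    return ($seen, $kept);
}

# Small in-memory sample standing in for the large input file.
my $sample = "junk\nalpha=1\n# noise\nbeta=2\n\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my %state;                      # plain hash: fine when most lines are junk
# For a rich data source, tie first (hypothetical file name):
# tie %state, 'SDBM_File', 'state_db', O_RDWR | O_CREAT, 0666 or die $!;

my ($seen, $kept) = load_state($fh, \%state);
printf "kept %d of %d lines\n", $kept, $seen;
```

    Running the loader over a sample of the real file and looking at the kept/seen ratio would give a quick feel for which way to go.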

    Hanlon's Razor - "Never attribute to malice that which can be adequately explained by stupidity"