in reply to Re^3: In-place sort with order assignment
in thread In-place sort with order assignment

I've never really tried to optimize code to minimize memory usage, so my thoughts here might be stupid and/or crazy. I'll go ahead and risk being ridiculed and toss out my ideas in case they might spark a better idea from more experienced programmers.

Marshall, your idea of sorting from a file is close to an idea that I had, but I was very hesitant to put it in a post. However, it seems to me that sorting the file(s) as you suggest could potentially eat up a lot of memory. I admit that I could be dead wrong about that.

Here's my stupid/crazy idea that's close to what Marshall suggested:

In other words, instead of doing the sorting after populating the file with all of the hash keys, do the sorting one element at a time as each hash key is added to the file.

I believe that this would sort the keys with minimal memory usage. However, execution time might suffer, perhaps unacceptably. Since BrowserUK said that "Speed is not a huge priority here", this might be acceptable depending on how long it takes.
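To make the idea concrete, here's a rough sketch of what "sorting one element at a time as each key is added to the file" could look like. The file name and the insert_key() helper are my own inventions, and this rewrites the whole file on every insertion, so it's O(n) disk I/O per key -- memory stays minimal, but it would be slow for a huge hash:

```perl
use strict;
use warnings;

# Merge a single new key into an already-sorted file, keeping memory
# usage at one line at a time (stream old file -> new file).
sub insert_key {
    my ($file, $new) = @_;
    open my $out, '>', "$file.tmp" or die "open $file.tmp: $!";
    my $placed = 0;
    if (-e $file) {
        open my $in, '<', $file or die "open $file: $!";
        while (my $old = <$in>) {
            chomp $old;
            if (!$placed && $new lt $old) {
                print {$out} "$new\n";    # new key slots in here
                $placed = 1;
            }
            print {$out} "$old\n";
        }
        close $in;
    }
    print {$out} "$new\n" unless $placed; # new key sorts last (or file was empty)
    close $out;
    rename "$file.tmp", $file or die "rename: $!";
}

# As each key is "added", the file stays sorted throughout:
insert_key('keys.txt', $_) for qw(walnut apple pear);
```

After the loop, keys.txt would hold apple, pear, walnut in order, without ever holding more than one key in memory at a time.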

As I said, I have no experience optimizing for minimal memory usage, which means that this could be a horrible idea. I'm open to constructive criticism on this idea, which will help me learn more about optimizing.


Replies are listed 'Best First'.
Re^5: In-place sort with order assignment
by Marshall (Canon) on Sep 20, 2010 at 03:46 UTC
    In other words, instead of doing the sorting after populating the file with all of the hash keys, do the sorting one element at a time as each hash key is added to the file.
    From what I understand, a huge hash structure already exists and foreach keys %hash makes a list of the hash keys, which essentially doubles the amount of memory required. My question is how to spew all of the keys into a file without making an intermediate structure that contains all of the keys. I suspect that there is a way to do that. If so, the sort part belongs to another process that will release its memory when done. The Perl hash table assignments of 1,2,3,4 will cause %hash to grow, but only as much as needed and presumably less than 2*storage required for the keys.
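One answer to that question may be each(), which walks the hash one key/value pair at a time instead of building the full key list the way keys() in list context does. The keys can then be streamed straight to a file, and the sorting handed off to a separate process (the external sort(1) below is just one option); the file names and sample data here are illustrative:

```perl
use strict;
use warnings;

my %hash = map { $_ => 1 } qw(pear apple walnut);  # stand-in for the real huge hash

# each() iterates one entry at a time -- no intermediate list of all keys.
open my $fh, '>', 'unsorted_keys.txt' or die "open: $!";
while (defined(my $key = each %hash)) {
    print {$fh} "$key\n";
}
close $fh;

# Hand the sorting to another process, whose memory is released when it exits.
system('sort unsorted_keys.txt -o sorted_keys.txt') == 0
    or die "external sort failed: $?";
```

The defined() test matters if a key could be the string "0"; otherwise the loop would stop early.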

      Ah, I figured there was something that I either missed or didn't understand. I've never really tried looking under the hood to understand fully what Perl is doing behind the scenes. I didn't realize that foreach over keys would make a copy of the key list.

      Now that you've enlightened me about this, I realize how foolish my idea was. And that leads me back to thinking along the lines of what you had said in the post that I responded to. (By the way, thanks for teaching me something new. I sincerely appreciate you kindly pointing out something that I overlooked.)

         My question is how to spew all of the keys into a file without making an intermediate structure that contains all of the keys.

      Well, here's a thought on that. I'm assuming that there had to have been some Perl code that created the keys in the hash. If you have access to modify that code, modify it so that it's printing to a file instead of creating the hash. That leaves the unsorted keys in a file and no initial hash. That in turn frees up more memory for a sorting method, which can be applied as the keys are written to the file and/or after all keys have been written to the file.

      In other words, "extract" the keys before the hash is created and then do the sort. Then after the sorting is complete, create the hash.
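The final step of that plan, building the hash only after the sort, could then assign the 1,2,3,... order values the thread is after while reading the sorted file back, one line at a time. This sketch sets up its own small sorted file; the names are illustrative:

```perl
use strict;
use warnings;

# Stand-in for the sorted file the external sort would have produced:
open my $fh, '>', 'sorted_keys.txt' or die "open: $!";
print {$fh} "$_\n" for qw(apple pear walnut);
close $fh;

# Build the hash once, assigning each key its rank in sorted order.
my %hash;
my $rank = 0;
open my $in, '<', 'sorted_keys.txt' or die "open: $!";
while (my $key = <$in>) {
    chomp $key;
    $hash{$key} = ++$rank;   # 1, 2, 3, ... in sorted order
}
close $in;
```

Only one key is ever held outside the hash itself, so peak memory is roughly the final hash plus one line of the file.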

      Of course, if the code where the hash keys are added cannot be modified for some reason, the above idea can't be implemented.

      Does that sound like a reasonable idea or have I missed something else due to my lack of knowledge and experience?