in reply to sorting an array of hashes by the value of the keys

I'm going to go out on a limb and guess that you're reading a bunch of bioinformatics data. I'm also going to guess that this data is (or can easily be) sorted at your source. And I'm guessing that all you want is to pull this information out of it.

Given all of that, you should probably read only the info you need from the file, then do your calculations. After that, discard the data you've read and go on to the next group. (This will reduce your memory needs by at least 90%, given a typical bioinformatics dataset.)
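To make the read-a-group, calculate, discard loop concrete, here is a minimal sketch. It is in Python purely for illustration, and everything about the input is my assumption rather than your actual format: tab-separated lines of key and value, already sorted by key. The name sum_by_group is made up.

```python
import io
from itertools import groupby

def sum_by_group(lines):
    """Yield (key, total) for each run of consecutive lines sharing a key.

    Assumes the input is already sorted by key, so only the current
    line and a running total need to be held at any time.
    """
    # Assumed layout: "key<TAB>value" per line; blank lines are skipped.
    rows = (line.rstrip("\n").split("\t") for line in lines if line.strip())
    for key, group in groupby(rows, key=lambda row: row[0]):
        yield key, sum(float(row[1]) for row in group)

# Hypothetical sample data standing in for a real file handle.
sample = io.StringIO("geneA\t1.5\ngeneA\t2.5\ngeneB\t4.0\n")
totals = list(sum_by_group(sample))
```

Because groupby only compares neighbouring lines, this gives wrong answers on unsorted input: a key that shows up in two separate runs is reported twice.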

Now, if you do that, you don't need to split your initial key into four keys. In fact, since all the data you're working with at any one time shares that key, you don't need to use it as a key at all. So, all you have is your topic, your index, and your value - a standard 2-D data structure. Try working with that and see how far you get.
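A rough sketch of what that 2-D structure might look like for one group, again in Python for illustration only. The record layout (topic, index, value) and the names build_table, "expr" and "qual" are my assumptions, not anything from your data.

```python
from collections import defaultdict

def build_table(records):
    """Build table[topic][index] = value for the records of ONE group."""
    table = defaultdict(dict)
    for topic, index, value in records:
        table[topic][index] = value
    return table

# Hypothetical records for a single group; the group's own key is not stored.
records = [("expr", 0, 1.5), ("expr", 1, 2.5), ("qual", 0, 30.0)]
table = build_table(records)
expr_total = sum(table["expr"].values())
```

In Perl terms this is just a hash of hashes (or a hash of arrays, if your indexes are dense integers), rebuilt from scratch for each group.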

If your data isn't sorted, I would sort it first and save the result to a file (or set of files). Then use that as your data source for summations and the like.
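The sort-once-and-save step could be sketched like this, in Python for illustration. I'm assuming the same hypothetical tab-separated layout with the key in the first column, and a file small enough to sort in memory. The function name and the file names are invented.

```python
import os
import tempfile

def sort_to_file(src_path, dst_path):
    """Sort the lines of src_path by their first tab-separated field
    and write the result to dst_path."""
    with open(src_path) as src:
        # Normalise line endings so a missing final newline cannot glue lines together.
        lines = [line.rstrip("\n") + "\n" for line in src if line.strip()]
    # list.sort is stable: lines with equal keys keep their original order.
    lines.sort(key=lambda line: line.split("\t", 1)[0])
    with open(dst_path, "w") as dst:
        dst.writelines(lines)

# Demonstration on a throwaway file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "unsorted.tsv")
    dst_path = os.path.join(tmp, "sorted.tsv")
    with open(src_path, "w") as f:
        f.write("geneB\t4.0\ngeneA\t1.5\ngeneA\t2.5")
    sort_to_file(src_path, dst_path)
    with open(dst_path) as f:
        result = f.read()
```

If the file is too big to hold in memory, this sketch won't do. An external sort (the system sort utility, for instance) does the same job on disk.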

------
We are the carpenters and bricklayers of the Information Age.

The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.
