
Thanks for a comprehensive reply! I'll certainly incorporate some ideas as soon as I'm at work!

Your main suggestion is to keep the hash up to date whenever I change the messages array. That is actually my next big problem :)) You see, the messages can be sorted by different criteria. Currently there are only eight. So on each write operation on the messages array I will need to update eight indices. That looks weird.

The main reason for separating the sort order into another array was to be able to save lots of presorted indices (currently they are in memcached) for a big message list and then quickly retrieve the messages in the order I need. So the actual events that take place in the script are these: load the big array, load the indices, and try to order the array in fewer than n*log(n) operations using the indices. I hope this clarifies my intentions. I can probably try to save both $uids and %uid2msg for each criterion.
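To make the "load array, load indices, reorder" sequence concrete, here is a minimal sketch. The message fields, the sample uid lists, and the helper name `ordered_by` are all made up for illustration; in practice the uid lists would come from the cache. It assumes each saved index is simply an array of uids in sorted order, so reordering is one hash-slice lookup with no comparisons.

```perl
use strict;
use warnings;

# Hypothetical messages, in arbitrary load order.
my @messages = (
    { uid => 7, date => '2004-09-03', from => 'carol' },
    { uid => 3, date => '2004-09-01', from => 'bob'   },
    { uid => 9, date => '2004-09-02', from => 'alice' },
);

# Presorted uid lists, one per criterion (assumed to be loaded from the cache).
my %index = (
    date => [ 3, 9, 7 ],
    from => [ 9, 3, 7 ],
);

# One O(n) pass builds the uid => message lookup.
my %uid2msg = map { $_->{uid} => $_ } @messages;

# Return the messages in the saved order for a criterion.
# Uids that are no longer in the message list are skipped.
sub ordered_by {
    my ($criterion) = @_;
    return grep { defined } @uid2msg{ @{ $index{$criterion} } };
}

my @by_date = ordered_by('date');
my @by_from = ordered_by('from');
```

Building %uid2msg and taking the slice are both linear in the number of messages, so only the uid lists need to be saved per criterion; %uid2msg can be rebuilt once per load.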

Is there any other way to presort an array on different criteria and save the order for future reference? It seems like this is my real question :)

Re^3: Ordering objects using external index
by fergal (Chaplain) on Sep 07, 2004 at 15:30 UTC
    I replied to this already, but something seems to have gone wrong and the reply didn't make it. Basically, if you have 8 columns that you need to index, then you need 8 indexes. There is no way around it. If you are only retrieving the sorted list once and then forgetting about it forever, then maintaining the indices only slows you down and is not worth it. However, if you are going to retrieve it even just a few times, then it's probably a win.
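As a rough illustration of what keeping several indexes current on every write could look like, here is a sketch in plain Perl. The criteria (`date`, `size`), the message fields, and the name `insert_message` are assumptions for the example, not part of the original code. Each index is a sorted array of uids, and a new uid is placed by binary search, so a write costs O(log n) comparisons per index plus the splice.

```perl
use strict;
use warnings;

# Hypothetical criteria: each name maps to a comparator over two messages.
my %compare = (
    date => sub { $_[0]{date} cmp $_[1]{date} },
    size => sub { $_[0]{size} <=> $_[1]{size} },
);

my %uid2msg;                                   # uid => message
my %index = map { $_ => [] } keys %compare;    # criterion => sorted uids

# Store the message, then insert its uid into every index
# at the position found by binary search.
sub insert_message {
    my ($msg) = @_;
    $uid2msg{ $msg->{uid} } = $msg;
    for my $criterion (keys %compare) {
        my ($cmp, $uids) = ($compare{$criterion}, $index{$criterion});
        my ($lo, $hi) = (0, scalar @$uids);
        while ($lo < $hi) {
            my $mid = int(($lo + $hi) / 2);
            if ($cmp->($uid2msg{ $uids->[$mid] }, $msg) <= 0) { $lo = $mid + 1 }
            else                                               { $hi = $mid }
        }
        splice @$uids, $lo, 0, $msg->{uid};
    }
    return;
}

insert_message({ uid => 1, date => '2004-09-05', size => 300 });
insert_message({ uid => 2, date => '2004-09-01', size => 900 });
insert_message({ uid => 3, date => '2004-09-03', size => 100 });
```

Adding a ninth criterion is then one more entry in %compare; the write path itself does not change.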

    You could also try DB_File with its DB_BTREE functionality to handle the sorting and storing of the arrays. This effectively gives you a sorted hash that persists on disk between calls to your program. You would maintain 8 of these, and whenever you add a message you would do:

        use DB_File;
        use Fcntl;    # O_RDWR, O_CREAT

        tie %index1, "DB_File", "index1", O_RDWR|O_CREAT, 0666, $DB_BTREE;
        tie %index2, "DB_File", "index2", O_RDWR|O_CREAT, 0666, $DB_BTREE;
        ...

        sub insert {
            my $msg = shift;
            $index1{$msg->key1} = $msg->uid;
            $index2{$msg->key2} = $msg->uid;
            ...
        }

        my @sorted_by_index1 = @uid2msg{values %index1};

    Unlike a normal hash, a hash tied to a DB_BTREE returns its values in the order of their keys (lexical order by default), so values gives you the uids already sorted.
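A small self-contained sketch of the same idea follows. It makes a few assumptions that are not in the snippet above: the sample messages are invented, the B-tree is kept in memory by passing undef as the filename (a real index would use a file name so it persists), and the uid is appended to each key so that two messages with the same subject do not overwrite each other.

```perl
use strict;
use warnings;
use DB_File;                       # exports $DB_BTREE
use Fcntl qw(O_RDWR O_CREAT);

# Hypothetical sample data: uid => message.
my %uid2msg = (
    101 => { uid => 101, subject => 'pears'   },
    102 => { uid => 102, subject => 'apples'  },
    103 => { uid => 103, subject => 'oranges' },
);

# undef as the filename keeps the B-tree in memory for this demo.
tie my %by_subject, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot create B-tree: $!";

# Key is "subject\0uid": still sorts by subject first, but stays unique.
for my $msg (values %uid2msg) {
    $by_subject{ "$msg->{subject}\0$msg->{uid}" } = $msg->{uid};
}

# values walks the tree in key order, so the uids come back sorted.
my @sorted_uids = values %by_subject;
my @sorted_msgs = @uid2msg{@sorted_uids};
```

Because the default comparison is lexical, numeric sort keys would need either zero-padding or a custom comparison function before they sort as numbers.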

    If you go down this route you are basically implementing your own database, and you may want to look at just using DBD::SQLite, which gives you a fast, direct-to-disk database.