dragonchild has asked for the wisdom of the Perl Monks concerning the following question:

I'm building a class that will take a set of iterators and iterate over the combined output, merging records where appropriate. The client passes in a list of iterators and a comparison function for the expected records from the iterators. The iterators are assumed to be returning their values in some sorted fashion.

The problem I'm having is summarized as so:

  1. The iterators are destructive.
  2. Because they're destructive, I need to cache the values that aren't used in a given iteration for use in the next one.
  3. For those iterators that I do use the value this iteration, I need to remove the value in the cache for those iterators. This is so I can get the next value during the next iteration.
  4. The sorting routine (as I've defined it thus far) is expected to accept two records and return -1, 0, or 1 (as appropriate)
  5. But, how do I do something like my @indices = sort $sort_routine 0 .. $#records; when $sort_routine is expecting records, not indices?

The cache is a parallel array of records, each index corresponding to an iterator in @iterators. So, I could do something like:

RECORD: for my $record (@merged_records) { my $index; for $index (0 .. $#cache) { if ($cache[$index] eq $record) { $cache[$index] = 0; next RECORD; } } die "Cannot find record to remove in cache!\n"; }

But, that depends on stringifying the references, and I'm not comfortable with that. Plus, the iterators might not be returning references, so this would fail.

Help!

------
We are the carpenters and bricklayers of the Information Age.

Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Replies are listed 'Best First'.
Re: Sorting for indices and client-defined routines
by Roger (Parson) on Mar 03, 2004 at 23:02 UTC
    Points 1 - 3: Perhaps an Orcish Maneuver might help? Like the example below that caches results as it goes. The concept is similar.
    my %m = (); @sorted = sort { ($m{$a} ||= -M $a) <=> ($m{$b} ||= -M $b) } @files;

    Points 4 & 5: I thought that's called the element ranking?
    my @rank[sort {$x[$a] cmp $x[$b] } 0..$#x] = 0..$#x;

    I would imagine the solution might be a combination of the two algorithms, plus some tweaking.