Using Nested Iterators to find a Cross Product

The code below demonstrates a technique for nesting iterators (specifically, external iterators) by way of a trivial application, iterating over the cross product of a list of arrays.

I am aware that there is a module on cpan for iteratively generating the cross product of a set of arrays: Set-CrossProduct. My purpose here is to illustrate the principle transparently, in code. Also, Set-CrossProduct only operates on actual arrays*, whereas my class, below, operates on black-box iterators. tye's excellent Algorithm::Loops::NestedLoops has a similar limitation. It can give you an iterator for iterating over the cross product, but it only accepts arrays as input.

My class's interface is modeled on that of the Iterator module, implementing methods new, value, and is_exhausted. However, the specific calling semantics differ slightly from Iterator, as follows.

Where Iterator::new() expects a callback subroutine, my new() takes a list of (one or more) iterator objects. These are the sub-iterators, which will be walked to make the cross product.

Where Iterator throws exceptions of class Exception::Class, mine just throws a string message.

And here's the major difference: Iterator::value() can only return a scalar, while my value() returns a list. In fact, it returns a list which is the concatenation of all the values returned by the sub-iterators. The sub-iterators are, in turn, free to return multiple values. Value lists are treated flatly.

For example, suppose you have three iterators, A, B, and C. And suppose they are defined such that A->value returns a list of two values (j,k); B->value returns a single scalar, m; and C->value returns a list of two values (p,q). Then a cross-product iterator of my class will return a list of five values (j,k,m,p,q).

Note that, in terms of the return value of value(), Iterator::value is simply a special case of my value().

The iterators you pass as sub-iterators to my new() should implement value() as my class does, or as Iterator does. They must also support the is_exhausted() method, as my class and Iterator do.

* That is, arrays, or objects implementing the ARRAY interface via tieing.

{
    package Iterator::Product;

    sub new
    {
        my $pkg = shift;
        @_ > 0 or die "$pkg->new() needs at least one iterator!";
        @_ > 1 or return shift;
        if ( @_ > 2 ) # reduce!
        {
            my $car = shift;
            @_ = ( $car, $pkg->new( @_ ) );
        }
        my $self = bless [@_], $pkg;
        $self->reset;
        $self
    }

    sub reset
    {
        my $self = shift;
        $self->[0]->reset;
        $self->_reset_inner;
        $self
    }

    sub _reset_inner
    {
        my $self = shift;
        $self->[1]->reset;
        $self->[2] = [ $self->[0]->value ];
        $self
    }

    sub value
    {
        my $self = shift;
        $self->is_exhausted and die "past end";
        $self->[1]->is_exhausted and $self->_reset_inner;
        ( @{ $self->[2] }, $self->[1]->value, )
    }

    sub is_exhausted
    {
        my $self = shift;
        $self->[0]->is_exhausted && $self->[1]->is_exhausted 
    }
}
[download]

The class implements an algorithm to iterate over two sub-iterators. To support iterating over more than two, it reduces by replacing the "rest" of the list with a nested iterator over it. That is, NestedIterator( A, B, C ) is rewritten as NestedIterator( A, NestedIterator( B, C ) ).
You could achieve the same effect by calling Iterator::Product->new on only two iterators, reducing a list yourself:

use List::Util qw( reduce );
my $nested_iterator = reduce { new Iterator::Product $a, $b } @sub_ite
+rators;
[download]

Caveat: For all but the first sub-iterator passed in, which is used as the outer-most iterator in the nesting, the lists are iterated multiple times. This class makes no attempt to memoize the results of the inner iterators. If that is deemed a necessary thing to do, it should be handled by the implementations of the inner iterators, not by this class. Therefore, it is critical, for sane operation, that each inner iterator return the exact same sequence of values on every pass, for any number of passes.

Note that this class doesn't really handle the trivial case of iterating over a single iterator properly. It just returns the iterator passed. A production-quality implementation would handle this case better.

The object is represented internally as a blessed array with the following elements:

the first iterator
the second iterator
the "most recent" value from the first iterator. This gets updated (by calling ->value on the first iterator) whenever the second iterator is reset().

To illustrate how the class above would be used, let's create a basic array-iterating class, and find the cross product of those arrays.

{
    package Iterator::Array;
    use Scalar::Util qw(reftype);

    sub new
    {
        my $self = bless {}, shift;
        (reftype($_[0])||'') eq 'ARRAY' or die
            "Iterator::Array->new($_[0]) - argument is not an array re
+f!";
        $self->{'ar'} = shift;
        $self->reset;
        $self
    }

    sub reset
    {
        my $self = shift;
        $self->{'i'} = $[;
        $self
    }

    sub value
    {
        my $self = shift;
        $self->is_exhausted and die "past end";
        $self->{'ar'}[ $self->{'i'}++ ]
    }

    sub is_exhausted
    {
        my $self = shift;
        $self->{'i'} > $#{ $self->{'ar'} }
    }
}
[download]

Sample application:

my $it = new Iterator::Product 
    Iterator::Array->new( ['1  ',' 2 ',] ),
    Iterator::Array->new( ['4  ',' 5 ',] ),
    Iterator::Array->new( ['7  ',' 8 ',] );

while ( ! $it->is_exhausted )
{
    my @x = $it->value;
    print "@x\n";
}
[download]

Output:

In conclusion, I reiterate that my purpose here is to demonstrate how one iterator may "nest" other iterators, using a canonical pedagogical example. The essense of an iterator is to return one thing — the "next" thing — on each request, and so must remember where it left off the last time it was called. In this case, since the real work of iteration is being done by sub-iterators, the iterator only needs to manage how it's calling the subiterators and composing the results from each.

Further reading: How To: Make An Iterator; Re: Variation on the "Higher Order Perl" Iterator Pattern.

Between the mind which plans and the hands which build, there must be a mediator... and this mediator must be the heart.

Comment on Using Nested Iterators to find a Cross Product Select or Download Code