l3nz has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I have started playing around with Perl closures lately, following the very good Closure on Closures tutorial by broquaint, and I thought I'd write an iterator to access, one by one, the files recursively contained within a directory path. Written recursively, the problem is straightforward; here is a simple solution (a better one would avoid returning lists, but this is short and it works):

use strict;

sub recursiveReadDir {
    my ($dir) = @_;
    my @res;
    my $f  = '';
    my $ff = '';
    local (*D);
    print "Reading: $dir ...\n";
    opendir( D, $dir ) || die "can't opendir $dir: $!";
    while ( defined( $f = readdir(D) ) ) {
        $ff = $dir . '/' . $f;
        next if $f =~ /^\.\.?$/;    # skip . and ..
        if ( -d $ff ) {
            push @res, recursiveReadDir($ff);
        }
        else {
            push @res, $ff;
        }
    }
    closedir D;
    return @res;
}

print recursiveReadDir("c:/inetpub/wwwroot");
I was thinking about how to convert this code into an iterator built with closures, but frankly I'm clueless about how to do it. Is it possible, feasible and reasonable? Could you explain to me how?

Replies are listed 'Best First'.
Re: Turning a recursive function into an iterator
by broquaint (Abbot) on Jan 19, 2004 at 15:26 UTC
    This is the sort of situation that'd be perfect for coroutines, but we'll have to wait 'til perl6 for native support of that. In the meantime you need to unroll your recursion, which involves building a stack to iterate over, e.g.
    use strict;
    use File::Spec::Functions 'catfile';
    use IO::Dir;

    sub _dircnts {
        my $dir = shift;
        my $do  = IO::Dir->new($dir)
            or die "Couldn't open '$dir': $!";
        return map catfile($dir, $_), grep !/^\.\.?$/, $do->read;
    }

    sub rec_readdir {
        my $dir  = shift;
        my @cnts = _dircnts $dir;
        my $iter;
        $iter = sub {
            my $f = pop @cnts;
            return unless defined $f;
            push @cnts, _dircnts $f and return &$iter
                if -d $f;
            return $f;
        };
        return $iter;
    }

    my $di = rec_readdir(shift);
    while ( my $file = &$di ) {
        # do stuff here
    }
    That basically builds up a listing of the given directory and pushes it onto a stack; the stack is then iterated over, and whenever a directory is found its contents are pushed onto the stack in turn.

    Although if you're actually iteratively walking a directory tree then File::Find::Rule will do a much more reliable job (as it in turn uses File::Find) and is oh-so useful :)
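
    For what it's worth, File::Find::Rule can itself be driven iterator-style through its start/match interface; a minimal sketch along those lines (reusing the example path from the original post) might look like:

    use strict;
    use File::Find::Rule;

    # Ask for plain files only and walk the tree lazily, one match at a time.
    my $rule = File::Find::Rule->file->start("c:/inetpub/wwwroot");
    while ( defined( my $file = $rule->match ) ) {
        print "$file\n";
    }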

    HTH

    _________
    broquaint

•Re: Turning a recursive function into an iterator
by merlyn (Sage) on Jan 19, 2004 at 14:50 UTC
Re: Turning a recursive function into an iterator
by pg (Canon) on Jan 19, 2004 at 15:58 UTC

    In Perl, it is very easy to convert a recursive program into an iterative version.

    The only thing you need, in order to accomplish this conversion easily, is a stack. Perl supports stacks in its core by default, thanks to the ability to push() and pop() on an array.

    To convert:

    • As a starting point, push() the element(s) that need to be processed onto the stack
    • For each iteration:
      • pop() one element off the stack; if there is nothing to pop(), the process is finished; otherwise process this element
      • If there are child element(s) that need to be processed, push() them onto the stack (a sketch of this pattern applied to directory walking follows below)
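
    A minimal sketch of this stack-based pattern, applied to the directory-walking problem from the original post (the sub name walk_tree is just illustrative), might look like:

    use strict;
    use warnings;

    sub walk_tree {
        my @stack = @_;                     # start point: push the initial element(s)
        my @files;
        while (@stack) {                    # nothing left to pop() means we're finished
            my $path = pop @stack;
            if ( -d $path ) {
                opendir my $dh, $path or die "can't opendir $path: $!";
                # child elements get pushed onto the stack for later iterations
                push @stack, map { "$path/$_" } grep { !/^\.\.?$/ } readdir $dh;
                closedir $dh;
            }
            else {
                push @files, $path;         # process this element
            }
        }
        return @files;
    }

    print "$_\n" for walk_tree("c:/inetpub/wwwroot");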

Thank you
by l3nz (Friar) on Jan 19, 2004 at 16:10 UTC
    In the end I was able to do more or less what I had in mind at the beginning. This implementation is recursive (I wanted to mix both things) but behaves like an iterator thanks to the closure: it uses two queues (one for filenames already read and one for directories still to be read), and when a new directory is read it calls itself again until a file is found; still, the interface is that of an iterator.
    {
        my @fq;    # files
        my @dq;    # directories
        my $currentDir;

        sub initDir { push @dq, shift; }

        sub getNextFile {
            my $f;
            if ( $#fq >= 0 ) {
                $f = pop @fq;
            }
            else {
                if ( $#dq >= 0 ) {
                    my $r;    # filename read
                    local *D;
                    $currentDir = pop @dq;
                    opendir( D, $currentDir ) || die "$currentDir: $!";
                    while ( defined( $r = readdir(D) ) ) {
                        next if $r =~ /^\.{1,3}$/;
                        my $ff = $currentDir . '/' . $r;
                        if ( -d $ff ) {
                            push @dq, $ff;
                        }
                        else {
                            push @fq, $ff;
                        }
                    }
                    closedir D;
                    $f = getNextFile();
                }
                else {
                    $f = undef;
                }
            }
            return $f;
        }
    }

    my $dir = "c:/inetpub";    # execution start here
    initDir($dir);

    my $f;
    my $i = 0;
    while ( defined( $f = getNextFile() ) ) {
        $i++;
        print "[$i] $f \n";
    }
    What do you think about it? Is there anything fundamentally flawed with this approach?
      To me, recursion means "let the programming language manage the stack", so the use of either an explicit stack or queue bothers me when I'm trying to do things recursive-like.

      So here's a somewhat more fully iterator-style solution that uses the dirhandle as an iterator, and keeps a single scalar as the memory between invocations. Oh, and this isn't totally necessary, but I convert simple files to iterators that return the file once. I'm doing that because whenever I write iterators, I always use pretty much the same pattern: compute the next value if needed, consume a result from a sub-iterator, then return it. This allows a class hierarchy of iterators that override only the minimum necessary operations. Anyway, here's the code:

      use IO::Dir;

      sub make_reader2 {
          my ($dirname) = @_;
          my $d = new IO::Dir $dirname or die "opendir $dirname: $!";
          my $next;
          return sub {
              while (1) {
                  unless (defined $next) {
                      my $entry = $d->read or return;
                      next if $entry eq '.';
                      next if $entry eq '..';
                      $entry = "$dirname/$entry";
                      if (-d $entry) {
                          $next = make_reader2($entry);
                      } else {
                          $next = sub { my $r = $entry; undef $entry; $r; };
                      }
                  }
                  my $result = $next->();
                  return $result if defined $result;
                  undef $next;
              }
          };
      }

      my $reader = make_reader2($ARGV[0]);
      print "$_\n" while (defined($_ = $reader->()));
        Yes, of course; but this was just a way to play around with both ideas and learn something. I wouldn't spend a couple of hours studying an already-solved problem if it weren't to learn something from it :-)