PerlMonks  

Re: Streaming to Handles (iterator)

by tye (Sage)
on May 05, 2004 at 23:26 UTC


in reply to Streaming to Handles

You don't need a stream; you want an iterator (yes, similar term). To turn this into an iterator in Perl5, you need to keep your own "stack". That is easy to do with an anonymous array (or two) inside your object.

I put file names that I have yet to output into @{ $self->{files} } and output the next one from there the next time the iterator is called. I put directory names that I have yet to read the list of files from into @{ $self->{dirs} } and when there aren't any more file names to return, I read the next directory.

First, here is how you'd use my iterator:

    #!/usr/bin/perl
    use strict;
    use warnings;
    require List;

    my $f= List->new( @ARGV );
    my $file;
    # Test with defined() so that a file named "0" does not end the loop early.
    while( defined( $file= $f->next() ) ) {
        print "$file\n";
    }

And here is the code that implements it:

    package List;    # Terrible name
    use strict;
    use warnings;
    use Cwd qw( cwd );
    require File::Spec;
    use vars qw( $VERSION );
    $VERSION = '0.99';

    sub new {
        my( $class, $path )= @_;
        # Bless first: look_in() is a method call and dies on an unblessed reference.
        my $self= bless { }, $class;
        # Always set up the queues; with no path, look_in() falls back to cwd().
        $self->look_in( defined $path ? $path : () );
        return $self;
    }

    sub look_in {
        my( $self, $path )= @_;
        $path= cwd() unless @_ > 1;
        $path= File::Spec->canonpath($path);
        $self->{path}= $path;
        $self->{dirs}= [$path];     # directories not yet read
        $self->{files}= [];         # file names not yet returned
    }

    sub next {
        my( $self )= @_;
        while( 1 ) {
            # Hand out pending names first, queueing any directories for later.
            if( @{ $self->{files} } ) {
                my $file = shift @{ $self->{files} };
                if( -d $file ) {
                    push @{ $self->{dirs} }, $file;
                }
                return $file;
            }
            # Nothing pending and nothing left to read: the iterator is done.
            if( ! @{ $self->{dirs} } ) {
                return;
            }
            # Read the next directory in one go (lexical handle instead of DIR).
            my $dir= shift @{ $self->{dirs} };
            if( opendir( my $dh, $dir ) ) {
                $self->{files}= [
                    map {
                        File::Spec->catfile( $dir, $_ );
                    } File::Spec->no_upwards( readdir($dh) )
                ];
                closedir $dh;
            } else {
                warn "opendir failed, $dir: $!\n";
            }
        }
    }

    1;

I tested it enough to see that it appears to work just fine.

If you had directories with huge numbers of files directly in them (not in subdirectories), then you might want to make the iterator a bit more complicated such that you don't keep a list of file names and instead return each file name (almost) immediately after you get it back from readdir (but I'm not sure I would recommend that).
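For illustration only, here is a rough sketch of what that more complicated variant might look like. It is not the code from the post above: the package name LazyList and the hash keys dh and cur are made up for this example. It assumes the same breadth-first order as the iterator above, but keeps one directory handle open between calls and asks readdir for a single entry at a time, so no list of file names is held in memory. Like the original, it follows symlinks to directories.

```perl
package LazyList;    # hypothetical name, not from the original post
use strict;
use warnings;
require File::Spec;

sub new {
    my( $class, $path )= @_;
    my $self= bless {
        dirs => [ File::Spec->canonpath( $path ) ],  # directories not yet opened
        dh   => undef,                               # currently open handle, if any
        cur  => undef,                               # directory that dh belongs to
    }, $class;
    return $self;
}

sub next {
    my( $self )= @_;
    while( 1 ) {
        if( my $dh= $self->{dh} ) {
            # Scalar-context readdir returns one entry per call.
            my $name= readdir( $dh );
            if( ! defined $name ) {
                # This directory is exhausted; move on to the next one.
                closedir( $dh );
                $self->{dh}= undef;
            } else {
                # Skip "." and ".." the same way the original does.
                my( $keep )= File::Spec->no_upwards( $name );
                if( defined $keep ) {
                    my $file= File::Spec->catfile( $self->{cur}, $name );
                    push @{ $self->{dirs} }, $file if -d $file;
                    return $file;
                }
            }
        } elsif( @{ $self->{dirs} } ) {
            my $dir= shift @{ $self->{dirs} };
            if( opendir( my $new, $dir ) ) {
                $self->{dh}= $new;
                $self->{cur}= $dir;
            } else {
                warn "opendir failed, $dir: $!\n";
            }
        } else {
            return;    # no open handle and no directories left
        }
    }
}

1;
```

It would be called the same way as the iterator above: construct it with a starting path and call next() until it returns undef. The cost is a directory handle that stays open between calls, and a longer next(), which is one reason the simpler version may still be the better choice.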

- tye        

Replies are listed 'Best First'.
Re: Re: Streaming to Handles (iterator)
by crabbdean (Pilgrim) on May 06, 2004 at 01:25 UTC
    Thanks, I haven't tested this yet, but looking at it, it "makes sense" and appears to be what I'm looking for. BIG GRINS!! ++ I'll test it in the coming days and post back with my findings.

    By the way, the package name "List" was only used for this example although I'm unsure of what to call the module. I'm assuming it will come under the "File::" modules and could call it "list" or "DirList" or something. Do you have any good suggestions for a name?

    Thanks once again. :-)

    Dean
    The Funkster of Mirth
    Programming these days takes more than a lone avenger with a compiler. - sam
    RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
Re: Re: Streaming to Handles (iterator)
by crabbdean (Pilgrim) on May 07, 2004 at 08:04 UTC
    I've written this into my code and it works perfectly! :-) I'll tweak it a bit and get it working with the other features in my module. But that's exactly what I was after. Big ++ !!!

    I'm just running a benchmark now to compare it against the alternative solution of returning the files as arrays.

    I intend to leave both methods in the module so it gives the user the choice of streaming or returning via arrays.

    Here are the benchmark results:
             Rate stream  array
    stream 76.1/s     --   -14%
    array  88.7/s    17%     --

             Rate stream  array
    stream 79.8/s     --    -5%
    array  83.9/s     5%     --

             Rate stream  array
    stream 72.2/s     --   -10%
    array  80.2/s    11%     --
    As you can see, returning via arrays is faster, by between 5% and 17% across these three runs.
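The benchmark code itself was not posted. As a hedged sketch, a comparison of this shape could be run with cmpthese from the core Benchmark module, whose output has the same layout as the tables above. The helpers list_dir, all_files and make_iter below are hypothetical stand-ins written for this example, not the module's real methods, and unlike the iterator above they skip symlinked directories so that a link cycle cannot loop forever.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );
use File::Spec;

# Hypothetical helper: full paths of the entries directly inside $dir.
sub list_dir {
    my( $dir )= @_;
    opendir( my $dh, $dir ) or return;
    my @names= map { File::Spec->catfile( $dir, $_ ) }
        File::Spec->no_upwards( readdir($dh) );
    closedir $dh;
    return @names;
}

# "array" style: walk everything first, then return one big list.
sub all_files {
    my @dirs= @_;
    my @out;
    while( @dirs ) {
        for my $f ( list_dir( shift @dirs ) ) {
            push @out, $f;
            push @dirs, $f if -d $f && ! -l $f;
        }
    }
    return @out;
}

# "stream" style: a closure that returns one name per call, undef when done.
sub make_iter {
    my @dirs= @_;
    my @files;
    return sub {
        while( 1 ) {
            if( @files ) {
                my $f= shift @files;
                push @dirs, $f if -d $f && ! -l $f;
                return $f;
            }
            return if ! @dirs;
            @files= list_dir( shift @dirs );
        }
    };
}

my $root= @ARGV ? $ARGV[0] : '.';

# A negative count asks for at least that many CPU seconds per case.
cmpthese( -1, {
    stream => sub { my $it= make_iter( $root ); 1 while defined $it->(); },
    array  => sub { my @all= all_files( $root ); },
} );
```

Both cases do the same directory reads, so any difference mostly reflects the per-call overhead of the iterator. Results will vary with the directory tree and with filesystem caching, as the spread between the three runs above suggests.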

    Once again a big thank you!

    Dean
