pawansharma01 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I need to recursively grab all files from a directory tree. The directories will be updated with new files over time, and on each later run I need to grab only the new files, not the ones I have already grabbed. How should I go about this? I'm looking for suggestions. Thanks.


Re: getting files recursively from a directory structure
by Laurent_R (Canon) on Nov 13, 2014 at 22:40 UTC
    Hi, keep somewhere (say, in a configuration file) the time of your last run (or, more precisely, the last time stamp you used). Any file newer than that time stamp should be processed; anything older can be skipped. Update the config file with the new time stamp when you are done.
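
    A minimal sketch of that bookkeeping, using core Perl only. The state file format (a single epoch value) and the names read_last_run, write_last_run and run_since_last are assumptions made for illustration; the scan itself is left to the caller's callback.

```perl
use strict;
use warnings;

# Return the epoch time saved by the previous run, or 0 if there is none.
sub read_last_run {
    my ($file) = @_;
    return 0 unless -e $file;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my $line = <$fh>;
    close $fh;
    return 0 unless defined $line && $line =~ /^(\d+)/;
    return $1;
}

# Overwrite the state file with a new epoch time.
sub write_last_run {
    my ( $file, $epoch ) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "$epoch\n";
    close $fh or die "Cannot close $file: $!";
    return;
}

# Call $job->($since), then advance the time stamp.
# If the job dies, the old time stamp is left in place.
sub run_since_last {
    my ( $file, $job ) = @_;
    my $since = read_last_run($file);
    my $now   = time;    # taken before the job starts, not after
    $job->($since);
    write_last_run( $file, $now );
    return $since;
}
```

    The new time stamp is taken before the job starts, so a file that arrives while the job is running is picked up next time rather than lost. With one-second time stamps, a file written in the same second as the snapshot can still be missed or seen twice, so this is only safe if that edge case is acceptable.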

    Update: Loops was faster than me by a couple of minutes. Before answering the OP, I also considered the same solution as Loops's, but decided the one I suggested above was simpler. Now choosing between the two methods would probably require a bit more knowledge about the process.

Re: getting files recursively from a directory structure
by Loops (Curate) on Nov 13, 2014 at 22:38 UTC

    Keep a list of the files you've already got, and only download files that are not yet on the list? (Or compare against your downloads directory.)
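
    One way this could look with core Perl only, assuming the list is kept as a plain text manifest with one path per line. The manifest format and the names load_seen, unseen and record_seen are hypothetical, chosen for this sketch.

```perl
use strict;
use warnings;

# Load the manifest (one path per line) into a hash for fast lookups.
sub load_seen {
    my ($manifest) = @_;
    my %seen;
    return \%seen unless -e $manifest;
    open my $fh, '<', $manifest or die "Cannot read $manifest: $!";
    while ( my $path = <$fh> ) {
        chomp $path;
        $seen{$path} = 1;
    }
    close $fh;
    return \%seen;
}

# Return the paths from @$current that are not yet in %$seen.
sub unseen {
    my ( $current, $seen ) = @_;
    return grep { !$seen->{$_} } @$current;
}

# Append newly grabbed paths to the manifest.
sub record_seen {
    my ( $manifest, @paths ) = @_;
    open my $fh, '>>', $manifest or die "Cannot append to $manifest: $!";
    print {$fh} "$_\n" for @paths;
    close $fh or die "Cannot close $manifest: $!";
    return;
}
```

    Unlike a time stamp, the manifest does not depend on modification times being trustworthy, but it grows with every file ever grabbed. A path containing a newline would break the one-path-per-line format.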

Re: getting files recursively from a directory structure
by RonW (Parson) on Nov 13, 2014 at 23:50 UTC

    Expanding on both suggestions above, you can use File::Find to search for the files. If you go with Laurent_R's suggestion to track the files by time stamp, the File::Find callback can also filter the files down to only those newer than the time stamp.
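
    A rough sketch of that filter, assuming "newer" means the file's modification time (mtime) is later than a saved epoch value. The function name files_newer_than is made up for this example; find, its wanted callback and the no_chdir option are part of File::Find.

```perl
use strict;
use warnings;
use File::Find;

# Return the plain files under $dir whose mtime is later than $since
# (both in epoch seconds), as a sorted list of paths.
sub files_newer_than {
    my ( $dir, $since ) = @_;
    my @found;
    find(
        {
            no_chdir => 1,    # $_ holds the full path inside the callback
            wanted   => sub {
                return unless -f $_;       # skip directories and the like
                my $mtime = ( stat _ )[9]; # reuse the stat done by -f
                push @found, $_ if $mtime > $since;
            },
        },
        $dir,
    );
    my @sorted = sort @found;
    return @sorted;
}

# Usage: perl script.pl DIR [EPOCH]
if (@ARGV) {
    print "$_\n" for files_newer_than( $ARGV[0], $ARGV[1] // 0 );
}
```

    Files copied into the tree with their original time stamps preserved (for example with cp -p or rsync -t) can carry an old mtime and would be skipped by this test.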

Re: getting files recursively from a directory structure
by karlgoethebier (Abbot) on Nov 14, 2014 at 12:42 UTC

    You could make it so:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use IO::All;
    use Storable;
    use Set::Scalar;

    my $dir = shift || die "Usage: $0 DIR\n";
    my $io  = io($dir);

    # all_files(0) recurses to any depth
    my @listing = map { $_->name } $io->all_files(0);
    my $file    = qq(./listing.dat);

    # First run: just record the current listing and exit
    if ( !-e $file ) {
        store \@listing, $file;
        exit;
    }

    my $old_listing = retrieve($file);
    my $new_set     = Set::Scalar->new(@listing);
    my $old_set     = Set::Scalar->new(@$old_listing);

    my $difference = $new_set - $old_set;    # files not seen before
    my $union      = $old_set + $new_set;    # everything seen so far

    @listing = $union->elements;
    store \@listing, $file;

    print join qq(\n), $difference->elements;

    __END__

    Please see also IO::All, Storable, Set::Scalar, map, join and perlreftut.

    Regards, Karl

    Edit: added shebang.

    «The Crux of the Biscuit is the Apostrophe»