Iterate Recursively Through Directories

efaden has asked for the wisdom of the Perl Monks concerning the following question:

Hey All,

So I am trying to write a script to iterate over a set of files and run some commands. The directory/file structure is

Dir1/001.jpg
Dir2/002.jpg
Dir2/001.jpg
Dir2/001.cr2

etc. Basically each folder will have files from 001.ext, ... , n.ext. I would like to iterate over the files and run a program on each of them to make a thumbnail. I, however, only want to run 1 time per file independent of the ext... e.g. I would like to know that 001 has a cr2 and a jpg, but only want to make one thumbnail.

Does this make sense?

Right now I am using File::Find::Rule; and then getting all cr2, jpg, etc.. within the directory.

What is the best way to do this?

This is my current script that I want to modify so it only prints one thumbnail per image (even if there is a JPEG and a RAW)

#!/usr/bin/perl

# Install ffmpeg, ufraw, ImageMagick

print "Thumbnailer, pix2tn\n\n";

my $filename = 'index.html';

$start_dir = shift || '.';

use File::Find::Rule;
  # find all image and video files under $start_dir
  my @files = File::Find::Rule->file()
                              ->name( '*.jpg', '*.avi', '*.raw', '*.cr2', '*.jpeg', '*.nef', '*.mov')
                              ->in( $start_dir );

@nfiles = grep(!/AppleDouble/, @files);

my $tnperrow=4;   # thumbnails per row
my $tnsize=200;    # size of thumbnails
my $tnquality=40; # quality of thumbnails [0..100]
# use small thumbnails with poor quality to speed up your index-page

if (-e $filename) { 
  rename $filename, $filename.".bak"; 
  print "I saved the old $filename as $filename.bak\n"; 
  }
open (PAGE, ">$filename") || die "Problem: Can't write to $filename\n";

# create a directory for the thumbnails
system ("mkdir tn") if (!-d "tn");
#system ("mkdir med") if (!-d "med");

# create the index page
print PAGE qq*
<html><head><title>$title</title></head>
<body bgcolor=white><h1>$title</h1>
<table cellspacing=10 width="100%">
*;

my $counter=0;

foreach $_ (@nfiles) {
  $in = $_;
  $out = $_;  
  $out =~ s/\//-/g;
  $out =~ s/\.avi$/.jpg/g;
  $out =~ s/\.cr2/.jpg/g;
  $out =~ s/\.nef/.jpg/g;
  $out =~ s/\.mov/.jpg/g;
  print $in;

  if ($in =~ /\.avi$/) {
    system ('convert', '-resize', $tnsize."x".$tnsize, '-quality', $tnquality, $in.'[1]', 'tn/'.$out) == 0
      || die "Problems with convert: $?\n";


      #system ('convert', '-geometry', $medsize."x".$medsize, '-quality', $medquality, $in.'[1]', 'med/'.$out) == 0
      #|| die "Problems with convert: $?\n";
  
  } elsif ($in =~ /\.mov/) {
     system ('convert', '-resize', $tnsize."x".$tnsize, '-quality', $tnquality, $in.'[1]', 'tn/'.$out) == 0
      || die "Problems with convert: $?\n";
  } else {
      system ('convert', '-resize', $tnsize."x".$tnsize, '-quality', $tnquality, $in, 'tn/'.$out) == 0
      || die "Problems with convert: $?\n";
  }

print PAGE "<tr valign=bottom>" if (!($counter++%$tnperrow));
  print PAGE "<td>";


    #<a href="med/$out"><img src="tn/$out" alt="click to enlarge"></a><br>

  @stat = stat $_;
  print PAGE qq*<center>
    <img src="tn/$out" alt="click to enlarge"><br>*;

  if ($in =~ /\.avi$/) {
    print PAGE "<b>AVI</b><br>";
  } elsif ($in =~ /\.mov/) {
    print PAGE "<b>MOV</b><br>";
  } elsif ($in =~ /\.cr2/) {
    print PAGE "<b>CR2 (RAW)</b><br>";
  } elsif ($in =~ /\.jpg/) { 
    print PAGE "<b>JPG</b><br>";
  }

  print PAGE qq*
    <small><b>$_</b><br>*.
    localtime($stat[9])."<br>".
    qq*</small>\n*;
  print " ... done\n";
}

print PAGE qq*
  </table><hr>
  Index created on *. localtime(time) .qq*
  </body></html>
*;

close PAGE;

exit;


Replies are listed 'Best First'.
Re: Iterate Recursively Through Directories by moritz (Cardinal) on Feb 09, 2014 at 18:17 UTC
FWIW if the directory structure is always `DIR$N/001.$EXT`, it has a fixed depth and you don't even need File::Find(::Rule); a simple glob is enough:

    use strict;
    use warnings;
    my %seen;
    while (my $file = glob 'Dir*/*') {
        (my $base = $file) =~ s/\.\w+$//;
        next if $seen{$base}++;
        # work with $file here
    }

Perl 6 - the future is here, just unevenly distributed
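A minimal sketch of the `%seen` idiom on an in-memory list, so it can be tried without any files on disk. The sample paths are hypothetical stand-ins for what a glob would return. The point to note is that the hash has to be updated for the skip to ever fire, which the post-increment does here.

```perl
use strict;
use warnings;

# Hypothetical sample paths standing in for the glob results.
my @paths = qw(Dir1/001.jpg Dir2/002.jpg Dir2/001.jpg Dir2/001.cr2);

my %seen;
my @first_per_image;
for my $file (@paths) {
    (my $base = $file) =~ s/\.\w+$//;   # strip the extension
    next if $seen{$base}++;             # true from the second sighting onwards
    push @first_per_image, $file;
}

print "$_\n" for @first_per_image;
```

Whichever extension is listed first for a base name is the one that gets processed, so this alone does not tell you whether a RAW file also exists.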
Re^2: Iterate Recursively Through Directories by tbone654 (Beadle) on Feb 11, 2014 at 20:18 UTC
I don't know if this helps, and it doesn't iterate recursively as shown, but I use readdir in the following code to build an array @dots of filenames from a directory, excluding any name with a "." prefix on Unix. Then I can grep out anything I don't want to operate on, etc. Or, in this case, put the contents of all the files into a single array @foo, then grep out or split out whatever I'm looking for. I think it can be modified to do something useful without a module.

    $sd = "../data/data_forthis";
    opendir( DIR, $sd ) || die "Can't open $sd: $!";
    while ( defined( $filename = readdir(DIR) ) ) {
        next if $filename =~ /^\./;    # skip ".", ".." and hidden files
        push @dots, "$sd/$filename";
    } ## end of while
    @dots = sort @dots;
    closedir(DIR);

    for ( my $i = 0; $i < @dots; $i++ ) {
        open( FILE, $dots[$i] ) || die "Can't open $dots[$i]: $!";
        my @lines = <FILE>;
        close FILE;
        push @foo, @lines;
        push @foo2, @lines if $i == $#dots;    # last file only
    } ## end of for
    ### At this point @foo has everything and @foo2 has just the last file in the directory...
Re: Iterate Recursively Through Directories by Kenosis (Priest) on Feb 09, 2014 at 18:04 UTC
Consider using a hash of arrays (HoA) to track the partial paths (everything up to the file extension) and their associated types. For example, given your dataset, the hash would be:

    'Dir1/001' => ['jpg'],
    'Dir2/002' => ['jpg'],
    'Dir2/001' => ['jpg', 'cr2']

This way you can track 'uniqueness' and the file types of the images:

    use strict;
    use warnings;
    use File::Find::Rule;
    use File::Basename;

    my %hash;
    my $dir   = '.';
    my @files = File::Find::Rule->file()->in($dir);

    for my $file (@files) {
        if ( my ( $partialPath, $type ) = $file =~ /(.+)\.([^.]+)/ ) {
            push @{ $hash{$partialPath} }, $type;
        }
    }

    for my $partialPath ( keys %hash ) {
        #my @fileTypes = @{ $hash{$partialPath} };
        my $fileTypes = $hash{$partialPath}->[0] ? "@{ $hash{$partialPath} }" : 'None';
        my $baseFilename = basename $partialPath;

        # make thumbnail
        # $partialPath contains all up to trailing dot and extension
        # @fileTypes contains the file type(s)
        # $baseFilename contains the base file name w/o the extension

        print "Partial Path: $partialPath\n";
        print "File Type(s): $fileTypes\nBasename: $baseFilename\n\n";
    }

Hope this helps!
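One way the hash of arrays could drive the "one thumbnail, but remember the types" requirement is sketched below. The sample hash, the `pick_source` helper and the preference order (JPEG before RAW before video) are all assumptions made for illustration, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data in the same shape as the hash of arrays.
my %types_for = (
    'Dir1/001' => ['jpg'],
    'Dir2/002' => ['jpg'],
    'Dir2/001' => ['jpg', 'cr2'],
);

# Assumed preference order for the thumbnail source.
my @preferred = qw(jpg jpeg cr2 nef raw mov avi);

# Hypothetical helper: choose the one file to thumbnail for a base path.
sub pick_source {
    my ($base, $types) = @_;
    my %have = map { ( lc($_) => $_ ) } @$types;   # lowercased ext => ext as found
    for my $ext (@preferred) {
        return "$base.$have{$ext}" if exists $have{$ext};
    }
    return "$base.$types->[0]";   # nothing preferred: take the first one found
}

for my $base (sort keys %types_for) {
    my $source = pick_source($base, $types_for{$base});
    my $label  = join ', ', map { uc $_ } sort @{ $types_for{$base} };
    print "$source -> one thumbnail, labelled: $label\n";
}
```

The label string carries every extension that was found, so the index page can show that an image has a RAW companion even though only one thumbnail is made.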
Re^2: Iterate Recursively Through Directories by efaden (Initiate) on Feb 09, 2014 at 18:06 UTC
I was thinking about something like that... but I want to do one other thing that doing it that way makes it hard to do... I want to mark the image as having a CR2, JPEG, etc so I know that the image had a RAW... I just don't need two thumbnails. Make sense? -Eric
Re^3: Iterate Recursively Through Directories by Kenosis (Priest) on Feb 09, 2014 at 18:28 UTC
Have updated my answer.
Re^4: Iterate Recursively Through Directories by efaden (Initiate) on Feb 09, 2014 at 22:10 UTC
Re^5: Iterate Recursively Through Directories by Kenosis (Priest) on Feb 09, 2014 at 22:13 UTC
Re: Iterate Recursively Through Directories by kevbot (Vicar) on Feb 09, 2014 at 18:18 UTC
I threw together a solution that is very similar to what Kenosis has suggested. This code creates a %file_info hash. In the code below, the last file name encountered for a given prefix gets stored in the hash. If you want to do other things (like keep track of file type, etc.) then you could modify the %file_info hash. Basically, you want to create a data structure that contains all the information you want to track.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    use Path::Tiny;
    use File::Find::Rule;

    my $dir = shift or die 'No directory given.';
    my $dir_path = Path::Tiny->new($dir);
    my @files = File::Find::Rule->file()->in( $dir_path );

    my %file_info;
    foreach my $file (@files) {
        my $file_name = path($file)->basename;
        my $fn_without_ext = $file_name;
        $fn_without_ext =~ s/\.[^.]+$//;    # strip the extension, whatever its length
        $file_info{ path($file)->dirname }{ $fn_without_ext } = $file_name;
    }

    foreach my $found_dir (sort keys %file_info) {
        # one entry per file-name prefix, so one file per image
        foreach my $file_name (sort values %{ $file_info{$found_dir} }) {
            my $file_to_process = path($found_dir, $file_name);
            #process the file here (e.g. use the 'system' command)
            print "Processing this file: $file_to_process\n";
        }
    }
    exit;
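For the "process the file here" step, a rough sketch of building the thumbnail name and the argument list for ImageMagick's `convert`, using the same options as the script in the question. The helper names `thumb_name` and `convert_args` are hypothetical, and the command is only printed here, not run.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a source path into a thumbnail file name.
sub thumb_name {
    my ($source) = @_;
    (my $name = $source) =~ s{/}{-}g;   # Dir2/001.cr2 -> Dir2-001.cr2
    $name =~ s/\.[^.]+$/.jpg/;          # thumbnails are always written as JPEG
    return "tn/$name";
}

# Hypothetical helper: argument list for convert, in list form for system().
sub convert_args {
    my ($source, $size, $quality) = @_;
    # For video files, pick a single frame with the [1] index, as the question's script does.
    my $input = $source =~ /\.(?:avi|mov)$/i ? $source . '[1]' : $source;
    return ( 'convert', '-resize', "${size}x${size}", '-quality', $quality,
             $input, thumb_name($source) );
}

my @cmd = convert_args( 'Dir2/001.cr2', 200, 40 );
print "@cmd\n";
# To actually run it: system(@cmd) == 0 or die "Problems with convert: $?\n";
```

Anchoring the extension match at the end of the name avoids rewriting a ".cr2" that happens to appear in the middle of a path.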

