I wrote an (undocumented) Directory parsing module recently - it does what you need and probably a lot more.

Feel free to plagiarize, or use as-is.

Here is how I have used it...

my $dir = new Directory(<Path to directory>); ... if ($ShowDir and my $d = $ShowDir->MatchFileReturnDir($title)){ # This file already downloaded ... $ShowDir->Create_Intermediate_Directories_If_Necessary( $_->{URL} ); } $dir->print ({HEADERONLY => 1}); $dir->print({RECURSE=>1, DETAILS=> 0, HEADER=> 0}); # How much Disk ar +e we eating
The module..
{ # Start of Package Directory --- #----------------------------- package Directory; sub new { my $class = shift; my $curdir = shift; my $self = {DIRPATH => $curdir, SUBDIRS =>[], FILES =>[]}; AddSubdir($self, $curdir); return bless $self,$class; } sub AddSubdir{ # Adds one or more DIR object to $self->{SUBDIRS} my $self = shift; my $curdir = shift; # Full Path my @subdirs; # Only one-level deep opendir(DIR, $curdir) || die "can't opendir $curdir: $!"; local $_; # Should not be necessary, but we are recursive while ( $_ = readdir(DIR)){ next if /^\./; if ( -f "$curdir/$_"){ push @{$self->{FILES}}, $_ ; $self->{BYTES} += -s _; my ($extension) = m/\.(\w+)$/; $extension ||= "<NONE>"; $self->{EXTENSIONS}{uc $extension}++; next; } push @subdirs, "$curdir/$_"; # Just collect them for now.. } closedir DIR; # Recurse descend for (@subdirs){ # Converted collected PATHs into OBJECTS push @{$self->{SUBDIRS}}, new Directory ($_); } } sub MatchFileReturnDir{ # Return the Dir object that contains the passed file name my ($self, $lookfor, $optref) = @_; $optref ||= {RECURSE=>1}; # Recurse=true, by default $lookfor = CompressForMatching( $lookfor ); # Just the File name, + no path for my $orig_filename (@{ $self->{FILES} }){ my $f = CompressForMatching( $orig_filename ); $f=~/^$lookfor/i and return $self; } # Did not find in my own files .. recurse in .. return undef unless $optref->{RECURSE}; for my $subdir( @{ $self->{SUBDIRS} } ){ my $found = MatchFileReturnDir($subdir, $lookfor); $found && return $found; } return undef; # Not Found } sub MatchDirReturnDir{ # Return the Dir object that contains the passed Dir name my $self = shift; my $lookfor = shift; # Just the Dir Name my $optref = shift || {RECURSE=>1}; # Recurse=true, by default $lookfor = CompressForMatching( $lookfor, $optref ); for my $d (@{ $self->{SUBDIRS} }){ my $f = $d->{DIRPATH}; $f =~s|.+/||; # Delete all before last slash $f = CompressForMatching( $f, $optref ); $f=~/^$lookfor/i and return $d; } # Did not find a match in My Subdirs - dig deeper return undef unless $optref->{RECURSE}; for my $d (@{ $self->{SUBDIRS} }){ my $subdirmatch = $d->MatchDirReturnDir($lookfor, $optref); return $subdirmatch if $subdirmatch; } return undef; # Not Found } sub CompressForMatching{ my $lookfor = shift; my $optref = shift || {}; $optref->{EXACTMATCH} and return $lookfor; $lookfor =~s/^(?:The|A)[\s\.]//i; ## Remove Leading "The " or "A + " =~s/^The\s//i; $lookfor =~s/[^\w\s\.\-].*//; # Trailing crap, starts with "non- +word" $lookfor =~s/[\.\-\s]//g; # Compress spaces, dots return $lookfor; } sub Create_Intermediate_Directories_If_Necessary{ my ($self,$path) = @_; my @path_piece = split "/",$path; pop @path_piece unless chop($path) eq "/"; my $curdir = $self; for my $piece(@path_piece){ next unless length($piece) > 0; # SPLIT leaves 0-length ar +tifacts my $newdir = $curdir->MatchDirReturnDir($piece, {RECURSE=>0, EXACTMATCH=>1}); $curdir = $newdir, next if $newdir; # it already exists # We need to CREATE the physical dir + the subdir obj my $newpath = $curdir->{DIRPATH} . "/" . $piece; #print "Creating dir $newpath;\n"; # Debugging only.. mkdir $newpath or die "Cant create dir $newpath: $!"; push @{$curdir->{SUBDIRS}}, new Directory ($newpath); redo; # This time, it will set $curdir to the newly added +subdir } # End for $piece } sub print{ my $self = shift; my $local_option = shift; # A hashref of options $local_option->{HEADERONLY} and $local_option->{HEADER} = 1; my @fields = qw[FILES BYTES EXTENSIONS SUBDIRS DIRPATH]; print join("\t",@fields),"\n" if $local_option->{HEADER}; return if $local_option->{HEADERONLY}; for (@fields){ my $v = $self->{$_} || 0; ref($v) eq "ARRAY" and $v = scalar @$v; ref($v) eq "HASH" and $v = scalar keys(%$v); ($v =~ /^-?\d+$/) and $v = format_kilo($v, 6); # Number f +mt print "$v\t"; } print "\n"; if ( $local_option->{DETAILS}){ my $to_print; # Optimize, so we don't print unless there i +s info. for (sort keys %{ $self->{EXTENSIONS} }) { $to_print .= $self->{EXTENSIONS}{$_} . " $_;\t" } $to_print and print "\tExtension file counts: $to_print\n" +; $to_print = undef; for (sort @{ $self->{FILES} }) { $to_print .= "$_;\t" } $to_print and print "\tFILES: $to_print\n"; } return unless $local_option->{RECURSE}; $local_option->{HEADER} = 0; # Dont want to reprint header for + subdirs $_->print ($local_option) for @{$self->{SUBDIRS}}; } ############################################################### sub format_kilo # Kilo, mega and gig { my $number = shift; my $fixwidth = shift; my $suffix = " "; if ($number > 0x40000000) { $number /= 0x40000000; $suffix = 'G'; } elsif ($number > 0x100000) { $number /= 0x100000; $suffix = 'M'; } elsif ($number > 0x400) { $number /= 0x400; $suffix = 'K'; } # Split integer and decimal parts of the number and add commas my $integer = int($number); my $decimal = int($number * 10 % 10); $suffix ne " " and $integer .= "." . $decimal; # Add Leading spaces if fixed width $fixwidth and $integer = ' ' x ($fixwidth - length($integer) +- length($suffix)) . $integer; # Combine it all back together and return it. return $integer.$suffix; } ########################### 1; } #End of Package Directory --- ###############################################################

     Have you been high today? I see the nuns are gay! My brother yelled to me...I love you inside Ed - Benny Lava, by Buffalax


In reply to Re: Parsing a directory by NetWallah
in thread Parsing a directory by gw1500se

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.