in reply to Re^5: Directory Tree Structure
in thread Directory Tree Structure

To-do list for the following code, in any order. I have tried doing these, but my efforts have led to bigger messes.

#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;

use Cwd;
use File::Find;
use File::Basename;

my $filename = basename($0);
my $dirname  = getcwd;

my %directories = (
    '/ftp/pub/www/fantasy' => {
        link => 'http://www.xecu.net/fantasy',
        user => 'Fantasy',
        name => "Fantasy's Realm",
    },
    '/home/lady_aleena/var/www' => {
        link => 'http://lady_aleena.perlmonk.org',
        user => 'Lady Aleena',
        name => "Lady Aleena's Home",
    },
    'C:/Documents and Settings/<my name>/My Documents/fantasy' => {
        link => 'file:///C:/Documents and Settings/<my name>/My Documents/fantasy',
        user => '<my name>',
        name => "<my name>'s Place",
    },
);

sub get_rootdir {
    for my $dir (keys %directories) {
        return $dir if $dirname =~ /^\Q$dir/;
    }
}

my $rootdir = get_rootdir;
if (not exists $directories{$rootdir}) { die "You really screwed up." }

my $rootlink = $directories{$rootdir}{link};
my $rootuser = $directories{$rootdir}{user};
my $rootname = $directories{$rootdir}{name};

#@ARGV == 1 and -d $ARGV[0] or die "Usage: $0 path_name\n";
#( $rootdir = shift ) =~ s{(?<!/)$}{/};  # make sure path ends with "/"

my %tree;
$tree{$rootdir} = {};
find(\&wanted, $rootdir);
print_tree( \%tree, 0 );

sub wanted {
    local $_ = $File::Find::name;
    if ( -f ) {   # only work on data files (skip directories)
        s{\Q$rootdir\E}{};  # remove the rootdir string from the path name
        load_tree( $tree{$rootdir}, fileparse( $_ ));
    }
}

# recursively load the hash structure
# (first call gets top-level hashref, and file name, path from File::Basename::fileparse)
sub load_tree {
    my ( $href, $name, $path ) = @_;
    my @dirs = split /\//, $path;
    push @dirs, '.' if ( $dirs[$#dirs] ne '.' );
    my $key = shift @dirs;
    while ( @dirs and $key ne '.' and exists( $$href{"$key/"} )) {
        $href = $$href{"$key/"};
        $key = shift @dirs;
    }
    if ( $key ne '.' and ! exists( $$href{"$key/"} )) {
        $$href{"$key/"} = {};
        load_tree( $$href{"$key/"}, $name, join( '/', @dirs, '' ));
    }
    elsif ( $key eq '.' ) {
        push @{$$href{"$key/"}}, $name;
    }
}

# recursively print embedded lists
sub print_tree {
    my ( $href, $indent ) = @_;
    printf( "%s<ul>\n", ' ' x $indent );
    $indent++;
    if ( exists( $$href{'./'} )) {
        printf( "%s<li>%s</li>\n", ' ' x $indent, $_ ) for ( @{$$href{'./'}} );
        delete $$href{'./'};
    }
    if ( keys %$href ) {
        for my $subdir ( sort keys %$href ) {
            printf( "%s<li>%s\n", ' ' x $indent, $subdir);
            $indent++;
            print_tree( $$href{$subdir}, $indent );
            $indent--;
            printf( "%s</li>\n", ' ' x $indent );
        }
    }
    $indent--;
    printf( "%s</ul>\n", ' ' x $indent );
}

sub transform

sub transform {
    my ($text) = @_;
    $text =~ tr/_/ /;
    $text =~ s/.*\/+//;
    $text =~ s/\.[^.]*\z//;
    return $text;
}

Have a nice day!
Lady Aleena

Re^7: Directory Tree Structure
by graff (Chancellor) on Nov 03, 2009 at 06:00 UTC
    Hi -- Sorry about the slowness to respond to this. I think you're really close here, but I'm not sure how to explain the issue.

    First off, I gather that a bunch of the stuff at the top of this latest version of yours is not relevant to the problem: your "%directories" hash, the "get_rootdir" sub and the $root(link|user|name) variables are all no-ops. You are just using File::Find on whatever the current working directory happens to be.

    So, putting all that extra stuff aside, the only change I think you need in the operative code is this:

    sub wanted {
        local $_ = $File::Find::name;
        if ( -f ) {   # only work on data files (skip directories)
            s{\Q$rootdir\E[\\/]}{};  # remove the rootdir string from the path name
                                     # ... ALONG WITH THE SUBSEQUENT "\" OR "/" CHARACTER
            load_tree( $tree{$rootdir}, fileparse( $_ ));
        }
    }

    I think the point here is: once you get into the first subdirectory, removing just the "$rootdir" string leaves $_ with the path separator character at the start. That somehow causes fileparse() to not play nice with the recursive "load_tree" function. Anyway, try that out and see if it helps.
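
    To make the leading-separator effect concrete, here is a minimal sketch. It assumes Unix-style paths, and the root and file name are hypothetical; "dir_keys" is a helper invented for this illustration that repeats the fileparse-then-split step at the top of "load_tree".

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Split the directory part of a relative path the same way load_tree does.
# (dir_keys is a hypothetical helper, not part of the original script.)
sub dir_keys {
    my ($rel) = @_;
    my ( $name, $path ) = fileparse($rel);
    return [ split /\//, $path ];
}

# Hypothetical root and file name, for illustration only.
my $rootdir = '/ftp/pub/www/fantasy';
my $full    = "$rootdir/subdir/file.ext";

# Strip only the root: the separator stays at the front of the path.
( my $root_only = $full ) =~ s{\Q$rootdir\E}{};
# Strip the root together with the separator that follows it.
( my $root_and_sep = $full ) =~ s{\Q$rootdir\E[\\/]}{};

for my $rel ( $root_only, $root_and_sep ) {
    my @keys = map { "'$_'" } @{ dir_keys($rel) };
    print "$rel => (", join( ', ', @keys ), ")\n";
}
```

    With the separator left in place, the split produces an empty string as the first directory name, and "load_tree" would store that under the key "/", which looks like a nameless directory in the output. Stripping the separator as well leaves just "subdir".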

      That got rid of the warning, but did not fix the problem. I think I will need to see the basics, meaning no HTML, so that HTML can be added back in as we go.

      What I need on each line is the file with the full path.

      C:/Documents and Settings/XXXX/My Documents/fantasy/<subdir>
      C:/Documents and Settings/XXXX/My Documents/fantasy/<subdir>/<filename_1>.ext
      C:/Documents and Settings/XXXX/My Documents/fantasy/<subdir>/<filename_2>.ext
      C:/Documents and Settings/XXXX/My Documents/fantasy/<subdir>/<filename_3>.ext
      C:/Documents and Settings/XXXX/My Documents/fantasy/<subdir_2>
      C:/Documents and Settings/XXXX/My Documents/fantasy/<subdir_2>/<filename>.ext
      -or-
      /ftp/pub/www/fantasy/<subdir>
      /ftp/pub/www/fantasy/<subdir>/<filename_1>.ext
      /ftp/pub/www/fantasy/<subdir>/<filename_2>.ext
      /ftp/pub/www/fantasy/<subdir>/<filename_3>.ext
      /ftp/pub/www/fantasy/<subdir_2>
      /ftp/pub/www/fantasy/<subdir_2>/<filename>.ext

      After we get the files listed properly, then the HTML can be added back in.
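
      A plain-text listing of that kind could be sketched roughly as follows. This is only an illustration, not code from the thread: "list_paths" is a hypothetical name, and it assumes that a simple string sort gives an acceptable order (each directory sorts ahead of the files inside it because its path is a prefix of theirs).

```perl
use strict;
use warnings;
use File::Find;

# Return every directory and file under $root as a sorted list of full paths.
# (list_paths is a hypothetical helper, not part of the original script.)
sub list_paths {
    my ($root) = @_;
    my @paths;
    find(
        {
            no_chdir => 1,    # keep $_ equal to $File::Find::name
            wanted   => sub {
                return if $File::Find::name eq $root;    # skip the root itself
                push @paths, $File::Find::name if ( -d $_ or -f $_ );
            },
        },
        $root,
    );
    my @sorted = sort @paths;
    return @sorted;
}

# Print one full path per line when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for list_paths( $ARGV[0] );
}
```

      Run it with the root directory as the only argument. Once the plain listing looks right, the same list can be fed to whatever adds the HTML.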

      Have a nice day!
      Lady Aleena
        When you say "got rid of the warning, but didn't fix the problem", do you mean that you are still seeing "empty" lines in the output? I'm not seeing that when I run the version below (sanitized to make it non-dependent on your specific paths).

        I added the minimal amount of stuff around the "print_tree()" function so that I could load it in a browser and even paste it into this validation tool that almut cited elsewhere in this thread; I see no blank lines in the raw or rendered html, and it passed validation (with just a couple warnings about unrelated stuff).

        If you're still seeing a problem with your directory tree, maybe there's something strange in the file names that you are dealing with (e.g. html-reserved characters in a file name, or something like that). In that case, it'll be worthwhile to get a plain-text dump of the tree to see if that's the problem -- or better yet, add HTML::Entities to your program, and use its "encode_entities" method on all the file names.
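
        As a rough illustration of the "encode_entities" idea (a sketch only, with made-up file names; it assumes HTML::Entities is installed from CPAN, and "list_item" is a hypothetical helper that mirrors the printf format used in "print_tree"):

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Build one <li> line for a file name, escaping HTML-reserved characters.
# (list_item is a hypothetical helper, not part of the original script.)
sub list_item {
    my ( $name, $indent ) = @_;
    return sprintf( "%s<li>%s</li>\n", ' ' x $indent, encode_entities($name) );
}

# Hypothetical file names, for illustration only.
print list_item( $_, 1 ) for ( 'Tom & Jerry.txt', 'notes <draft>.html' );
```

        Encoding at the point where the name is written into the HTML keeps the tree hash itself holding the real file names, so they can still be used for links or file tests.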

        In fact, I've gone ahead and done that in the version below, just because it's a good idea anyway.

        (updated to fix validation web site link)