http://qs1969.pair.com?node_id=125156
Category: Miscellaneous
Author/Contact Info Jonathan Clover
jclovs
Description: I didn't like how the File::Find module worked for what I wanted to do, so I wrote my own recursive program to print out the directory structure of my web directory, for planning a new design. So here it is; I hope someone finds it useful, and feel free to critique.
#!/usr/bin/perl -w

#####################################################
#####Web Site Directory Print                   #####
#####Copyright 2001, Jonathan Clover            #####
#####Feel Free to Redistribute                  #####
#####                                           #####
#####Description: Program to Print out in plain #####
#####text tab-delimited representation of a     #####
#####directory structure. Has the ability to not#####
#####include any directory, as well as allows   #####
#####for a specific web directory to be         #####
#####specified within the query string formatted#####
#####like "?dir=clovs.com/about".               #####
#####################################################

use strict;
use CGI qw(:standard);

my ($web_url, $web_dir, $default_dir, $tab, $header, @non_include);

####################
###Configurations###
####################

#Home Page URL
$web_url = "clovs.com";

#Home Web Directory
$web_dir = "/home/clovs/www";

#Default Directory to Start in if you 
#wish it not to be the Home Web Directory
#$default_dir = "/home/clovs/www";

#Tab String to use for print out
$tab = "\&nbsp;\&nbsp;\&nbsp;\&nbsp;";

#CGI Header
$header = "text/html";

#Files not to include
#Those that start with . are never included
#for obvious reasons(aka infinite recursion)
#Regex accepted as values in list
@non_include = ('_',          #Front Page Hidden Folders
        'Merchant2',  #The Online Store Data Folder
        'webstats');  #The Web Statistics Folder

###########################
###End of Configurations###
###########################

my $cur = CGI->new();
my $start;
my $non_include = '^(\.|'.join('|', @non_include).')';

###Allow for Param's from a web interface###
if ( $cur->param("dir") ){
    my ($nothing, $temp) = split($web_url,$cur->param("dir"));
    $start = $web_dir.$temp;
}
else { $start = defined($default_dir) ? $default_dir : $web_dir; }

###Calculate the number of tabs to be
###used when printing out results
my @start = split('/', $start);
my $tabs = $#start;

###Start the program and print out as plain text
print header( $header );
print start_html();

dir_tree($start);

print end_html();

###Subroutines Below###
sub dir_tree {
    my $dir = shift;
    my @dir = split('/', $dir);
    print $tab x ($#dir - $tabs),
          $dir[$#dir]."<br>\n";
    if(-d $dir){
        foreach (op_dir($dir)){
            dir_tree($_);
        }
    }
}

sub op_dir {
    my $dir = shift;
    my @dir;
    opendir(DIR, $dir) || die("Couldn't open dir: $!");
    foreach (sort by_lc readdir(DIR)){
        if ($_ !~ m/$non_include/){
            push(@dir, "$dir\/$_");
        }
    }
    return @dir;
}

sub by_lc {
    lc($a) cmp lc($b);
}
Replies are listed 'Best First'.
(crazyinsomniac) Re: Dir Structure Print out
by crazyinsomniac (Prior) on Nov 14, 2001 at 11:58 UTC
    Guy above me points out security as an issue, and I, being who I am, point you to perlsec, and urge you to add -T (right next to that -w, or like -wT) to the list of switches (see perlrun).

    I'd also like to point out that when you die like you're doing now, the user will get a 500 error, possibly embarrassing whoever decides to use this (the user has no idea that's what it's supposed to do when it can't read) ;D. A friendly error message might be in order (see CGI::Carp).
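
    A minimal sketch of that friendlier failure path, under a few assumptions: read_dir_or_message is a hypothetical helper, not part of the posted script, and CGI::Carp is loaded conditionally only so the sketch still runs on a perl where the CGI distribution is not installed. In a real CGI script you would most likely just write use CGI::Carp qw(fatalsToBrowser); at the top.

```perl
#!/usr/bin/perl -w
use strict;

# CGI::Carp ships with the CGI distribution. When it is available,
# fatalsToBrowser turns any die() into a readable error page instead
# of a bare 500. Loaded conditionally here (an assumption of this sketch).
BEGIN { eval { require CGI::Carp; CGI::Carp->import('fatalsToBrowser'); 1 } }

# Hypothetical helper: returns (\@entries, undef) on success,
# or (undef, $message) when the directory cannot be read.
sub read_dir_or_message {
    my ($dir) = @_;
    opendir(my $dh, $dir)
        or return (undef, "Sorry, that directory could not be read.");
    # Skip dot entries and sort case-insensitively, as the posted script does.
    my @entries = sort { lc($a) cmp lc($b) } grep { !/^\./ } readdir($dh);
    closedir($dh);
    return (\@entries, undef);
}

my ($entries, $error) = read_dir_or_message("/no/such/directory");
print defined $error ? "$error\n" : join("\n", @$entries) . "\n";
```

    Returning a message instead of dying lets the caller print it inside the normal page, so the visitor sees an explanation rather than a server error.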

    One more thing, you store "text/html" in $header, which is not necessary when you use &CGI::header, because that is the default, and there really is no need to keep it in a "separate" variable (you prolly just got a little carried away with the configurating ;D)

    Also, you might wanna add files that begin with . to the list of stuff not ok to see, as well as the actual script that's displaying the directory structure (unless you want it to show up if its there)

    And, you also ought to look into the other parameters for the header method (you might wanna specify an expiration time, like print header(-type=>'text/html', -expires => '+5m');)

    And, look into Ovids cgi intro course, and look into

    $CGI::DISABLE_UPLOADS = 1;  # Disable uploads
    $CGI::POST_MAX = -1;        # Maximum number of bytes per post (a negative value means no limit)
    cause you never know, somebody might decide to mess with you ;D

    Also, since you're going to be using CGI to generate the html, you might as well generate "valid" html, check http://validator.w3.org/ to see about errors, a good starting point is specifying '-dtd'   => "-//W3C//DTD HTML 4.0 Transitional//EN" in start_html.

    I think that's plenty to ponder, but I suggest you go and check out perlsec first, cause it's the most important.

     
    ___crazyinsomniac_______________________________________
    Disclaimer: Don't blame. It came from inside the void

    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

Re: Dir Structure Print out
by chip (Curate) on Nov 14, 2001 at 07:13 UTC
    Serious security errors:
    • Using form input in a pathname without sanitizing it; specifically:
    • Allowing sneaky examination of the system by user input of paths starting with lots of dotdot entries.
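
    One way to close both holes, sketched under stated assumptions: safe_start_dir is a hypothetical helper (not in the posted script), the whitelist of characters is only an example, and the temporary directory stands in for the real web root.

```perl
#!/usr/bin/perl -w
use strict;
use Cwd qw(realpath);
use File::Temp qw(tempdir);

# Hypothetical helper: map a user-supplied "dir" value onto a path under
# $root, or return nothing if the value looks suspicious or escapes $root.
sub safe_start_dir {
    my ($root, $input) = @_;
    return unless defined $input;

    # Whitelist: word characters, dot, dash and slash only.
    my ($clean) = $input =~ m{\A([\w./-]*)\z};
    return unless defined $clean;

    # Refuse any ".." path component outright.
    return if grep { $_ eq '..' } split m{/}, $clean;

    # Resolve symlinks, then insist the result is still inside the root.
    my $real_root = realpath($root);
    my $real_path = realpath("$root/$clean");
    return unless defined $real_root && defined $real_path;
    return unless $real_path eq $real_root
               || index($real_path, "$real_root/") == 0;
    return $real_path;
}

# Demonstration against a throwaway directory standing in for the web root.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/about" or die "mkdir failed: $!";
for my $input ('about', '../../etc', 'about/../..', 'a;b') {
    my $path = safe_start_dir($root, $input);
    print "$input => ", (defined $path ? "allowed" : "rejected"), "\n";
}
```

    Of the four sample inputs only about should be allowed. The regex capture also untaints the value, which matters once the script runs under -T.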

        -- Chip Salzenberg, Free-Floating Agent of Chaos

Re: Dir Structure Print out
by rob_au (Abbot) on Nov 14, 2001 at 18:51 UTC
    In addition to the other comments above, I would like to make the following two small additional comments:

    1. There is a great potential for an endless recursive loop to be established if your program processes a symbolic link to a directory higher in the directory tree - This is a common trap that File::Find-like subroutines get caught by. To correct this, simply add a negative test for symbolic link with your -d test. ie.

      if (-d $dir && !-l $dir) { ... };
    2. And less importantly, the sort by_lc readdir(DIR) block in the op_dir subroutine could be written more simply as:

      foreach (sort { lc($a) cmp lc($b) } readdir(DIR)) { push (@dir, "$dir\/$_") unless (m/$non_include/); };
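
    Putting the two points together, a sketch of a traversal that lists symbolic links but never follows them. walk_tree is a hypothetical replacement for dir_tree/op_dir, it returns [depth, name] pairs instead of printing HTML, and the temporary tree exists only for the demonstration.

```perl
#!/usr/bin/perl -w
use strict;
use File::Temp qw(tempdir);

# Hypothetical variant of dir_tree: returns the tree as a list of
# [depth, name] pairs and never descends into symbolic links.
sub walk_tree {
    my ($dir, $depth) = @_;
    $depth = 0 unless defined $depth;
    my ($name) = $dir =~ m{([^/]+)/*\z};
    my @rows = ([$depth, defined $name ? $name : $dir]);

    # -l does an lstat on the name itself, so a link that points at a
    # directory is listed but not followed.
    return @rows if -l $dir || !-d $dir;

    opendir(my $dh, $dir) or return @rows;
    my @children = sort { lc($a) cmp lc($b) } grep { !/^\./ } readdir($dh);
    closedir($dh);

    push @rows, walk_tree("$dir/$_", $depth + 1) for @children;
    return @rows;
}

# Demonstration: a small tree containing a link back to its own root.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/about" or die "mkdir failed: $!";
symlink($root, "$root/about/loop") or warn "symlink not available: $!";
print "    " x $_->[0], $_->[1], "\n" for walk_tree($root);
```

    Using a lexical directory handle and closing it before recursing also avoids sharing one global DIR handle across every level of the recursion.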

     

    Ooohhh, Rob no beer function well without!

One more time
by jclovs (Sexton) on Nov 15, 2001 at 00:11 UTC
    So I added and subtracted a little and came up with a new version. I took most of the suggestions to heart and hopefully wrote some better code. I also made a couple of changes of my own, such as testing for existence first. Also, everyone who seemed to think that I allowed files that start with a . to be seen is sorely mistaken. As per the comments in the code (I tend to use them widely), files that start with a . are never shown; that pattern is automatically added to the list in the join call later in the script (check it out, I speak the truth). The End!
    #!/usr/bin/perl -wT

    #####################################################
    #####Web Site Directory Print                   #####
    #####Copyright 2001, Jonathan Clover            #####
    #####                                           #####
    #####Description: Program to Print out in plain #####
    #####text tab-delimited representation of a     #####
    #####directory structure. Has the ability to not#####
    #####include any directory, as well as allows   #####
    #####for a specific web directory to be         #####
    #####specified within the query string formatted#####
    #####like "?dir=/about".                        #####
    #####################################################

    use strict;
    use CGI;

    $CGI::DISABLE_UPLOADS = 1;
    $CGI::POST_MAX = 51_200;

    my ($web_dir, $default_dir, $tab, @non_include);

    ####################
    ###Configurations###
    ####################

    #Home Web Directory
    $web_dir = "/www2/nati";

    #Default Directory to Start in if you
    #wish it not to be the Home Web Directory
    #$default_dir = "/www2/nati";

    #Tab String to use for print out
    $tab = "\&nbsp;\&nbsp;\&nbsp;\&nbsp;";

    #Files not to include
    #Those that start with . are never included
    #for obvious reasons(aka infinite recursion
    #and people trying to %#&@ with the script)
    #Regex accepted as values in list
    @non_include = ('_',          #Front Page Hidden Folders
            'Merchant2',  #The Online Store Data Folder
            'webstats');  #The Web Statistics Folder

    ###########################
    ###End of Configurations###
    ###########################

    my $cur = CGI->new();
    my $start;
    my $non_include = '^(\.|'.join('|', @non_include).')';

    ###Allow for Param's from a web interface###
    if ( $cur->param("dir") ){
        my ( $temp ) = ( $cur->param("dir") =~ /([\w\/]+)$/ );
        $start = $web_dir.$temp;
    }
    else { $start = defined($default_dir)? $default_dir : $web_dir; }

    ###Calculate the number of tabs to be
    ###used when printing out results
    my @start = split('/', $start);
    my $tabs = $#start;

    ###Start the program and print out as plain text
    print "Content-Type: text/html\n\n",
          "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Strict//EN\">\n",
          "<html>\n",
          "<head>\n",
          "\t<title>Dir Structure Print</title>\n",
          "</head>\n",
          "<body>\n";

    dir_tree($start);

    print "</body>\n</html>";

    ###Subroutines Below###
    sub dir_tree {
        my ($dir) = @_;
        my @dir = split('/', $dir);
        if(-e $dir){
            print $tab x ($#dir - $tabs),
                  $dir[$#dir]."<br>\n";
        }
        else { print $dir[$#dir]." does not exist"; }
        if(-d $dir && !(-l $dir)){
            foreach (op_dir($dir)){
                dir_tree($_);
            }
        }
    }

    sub op_dir {
        my $dir = shift;
        my @dir;
        opendir(DIR, $dir) || die "Couldn't open dir: $!";
        foreach (sort {lc($a) cmp lc($b)} readdir(DIR)){
            if ($_ !~ m/$non_include/){
                push(@dir, "$dir\/$_");
            }
        }
        return @dir;
    }
    Clovs aka jclovs
    $_=crypt("hssq","cr");m-[funki.g.jim.bed.wax]-i;$_=eval"$`\(\"czEW\",\ +"pr\"\).$`(\"CCSBD\",\"Cl\")";s+ltO8f+ +;s=kt|g|m.|YA=\"=g;s|[ej]|\"\ +.\"|g;eval;