skywalker has asked for the wisdom of the Perl Monks concerning the following question:

I have written the following code to recreate a given directory structure (it works fine).

How much more efficient can you make it?

#!/usr/local/bin/perl
use warnings;
use strict;

my $recreate_to = 'C:/Documents and Settings/My Documents/TEST';
my $topdir      = 'H:/';

&dirs($topdir);

sub dirs {
    my $dir = shift;
    opendir (DIR, $dir) || die "unable to open dir $dir $!";
    my @files = grep(/\w+/, readdir(DIR));
    close (DIR);
    foreach my $subdir (@files) {
        if (-d $dir.'/'.$subdir) {
            my $level     = $dir . '/' . $subdir;
            my $newfolder = $level;
            $newfolder =~ s/$topdir/$recreate_to/;
            print "New folder: $newfolder\n";
            # mkdir ($newfolder);
            &dirs($level);
        }
    }
}

skywalker

Replies are listed 'Best First'.
Re: Recursive sub efficiency
by ikegami (Patriarch) on Jun 10, 2010 at 19:28 UTC
    • Speed efficiency is moot, since IO will limit speed.
    • Memory efficiency is about as good as you can get.

    Bugs should be a bigger concern.

    • You're treating text as a regex pattern in s/$topdir/$recreate_to/.

      s/\Q$topdir/$recreate_to/
    • You're using an unlocalised global (*DIR) in a recursive function. You're not using it in a way that would cause a problem, but why?!

      opendir (local *DIR, $dir) || die "unable to open dir $dir $!";
      my @files = grep(/\w+/, readdir(DIR));
      closedir (DIR);

      Better:

      opendir (my $dh, $dir) || die "unable to open dir $dir $!";
      my @files = grep(/\w+/, readdir($dh));
      closedir ($dh);

    Update: s/close/closedir/

      The \Q within the pattern match is something I've not noticed before (I've just taken a look at http://www.perl.com/doc/manual/html/pod/perlre.html). Thanks for that; I have had issues before with substituting literal text within a regular expression, so that will help me out a lot.

      The only reason I used an unlocalised global for the directory handle was the small size of the script; I know it's not going to be used anywhere else. However, I'll take note of your comment for future scripts.

        The only reason I used an unlocalised global for the directory handle was the small size of the script

        If that's true, why did you use "my" for every other variable except that one?

Re: Recursive sub efficiency
by BrowserUk (Patriarch) on Jun 10, 2010 at 19:54 UTC

    Try:

    xcopy /t /e h:\* "C:\Documents and Settings\My Documents\TEST"

    It usually runs significantly faster than Perl. (You might want to add /o to copy the permissions)


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Man, that's quick. I've just run it on a directory with 205 subfolders and it's a lot faster than Perl.

      However, I wanted to brush up on my Perl skills (or lack thereof).
        But you asked to make it more efficient. And calling other programs is very Perlish. Larry didn't call Perl a "glue language" without having a reason. Code reuse is a good thing, and code reuse doesn't stop at Perl code. IMO, a programmer who opts for a Perl solution for the sake of having a Perl solution is still at the newbie stage. system is a very important tool in the Perl toolkit.
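        For instance, a minimal sketch of driving xcopy from Perl (untested; the drive letter and target path are just the ones from the original post):

        #!/usr/bin/perl
        use strict;
        use warnings;

        # /t /e copies the directory tree (including empty dirs) without files.
        # The list form of system avoids shell-quoting the space-laden path.
        my @cmd = ('xcopy', '/t', '/e', 'h:\*',
                   'C:\Documents and Settings\My Documents\TEST');
        system(@cmd) == 0 or die "xcopy failed: $?";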
Re: Recursive sub efficiency
by SuicideJunkie (Vicar) on Jun 10, 2010 at 19:32 UTC

    Unless your algorithm is so broken that you somehow visit each node more than once, any code improvement will be insignificant compared to the disk access time.

    If you want it to be faster, try using a ram disk to build your pruned tree and then xcopy /s it to the hard drive in one go when you're done.

    Not worth rewriting it in this case, but in general situations you should consider using an iterative loop instead of recursive calls.
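    For example, here is one iterative version of the walk (a sketch, not benchmarked; it keeps a work list of directories instead of recursing):

    use strict;
    use warnings;

    my $topdir      = 'H:/';
    my $recreate_to = 'C:/Documents and Settings/My Documents/TEST';

    my @queue = ($topdir);    # the work list replaces the call stack
    while (@queue) {
        my $dir = shift @queue;
        opendir(my $dh, $dir) or die "unable to open dir $dir: $!";
        my @subdirs = grep { !/^\.\.?\z/ && -d "$dir/$_" } readdir($dh);
        closedir($dh);
        for my $subdir (@subdirs) {
            my $newfolder = "$dir/$subdir";
            $newfolder =~ s/\Q$topdir/$recreate_to/;
            print "New folder: $newfolder\n";
            # mkdir($newfolder);
            push @queue, "$dir/$subdir";
        }
    }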

Re: Recursive sub efficiency
by toolic (Bishop) on Jun 10, 2010 at 19:42 UTC
    my @files = grep(/\w+/,readdir(DIR));
    close (DIR);
    foreach my $subdir (@files){
        if(-d $dir.'/'.$subdir){
    It looks like you only operate on sub-directories, and you just discard files. I'm not sure if it is more efficient (speed and/or memory) without Benchmarking, but the code would be more straightforward if you filtered out what you didn't need in the grep:
    my @dirs = grep { /\w/ and -d "$dir/$_" } readdir(DIR);
    closedir (DIR);
    foreach my $subdir (@dirs){
    Update: Fixed -d and closedir
      Yep, you're right. I'm only interested in the folders, nothing else.

      However, the inline grep and directory check does not return anything. I've updated the directory handle to a lexical scalar as per ikegami's advice.

      my @files = grep(/\w/, readdir($fh));          # returns folders & files
      my @dirs  = grep { /\w/ and -d } readdir($fh); # returns blank
        He meant
        my @dirs = grep { /\w/ && -d "$dir/$_" } readdir($dh);
        which should probably be
        my @dirs = grep { !/^\.\.?\z/ && -d "$dir/$_" } readdir($dh);
        My mistake. I updated my code.

        Comment out your @files line; then @dirs should contain something. You cannot call readdir twice like that: the first list-context call exhausts the directory handle.
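        If you really need both lists, read the handle once into an array and filter that (a sketch; rewinddir would be the alternative):

        opendir(my $dh, $dir) or die "unable to open dir $dir: $!";
        my @entries = grep { /\w/ } readdir($dh);   # read the handle once
        closedir($dh);

        my @dirs  = grep { -d "$dir/$_" } @entries;
        my @files = grep { -f "$dir/$_" } @entries;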

Re: Recursive sub efficiency
by Khen1950fx (Canon) on Jun 11, 2010 at 01:22 UTC
    I can't say that it's more efficient, but it's another way of doing it. I found a script by mirod:
    #!/usr/bin/perl
    use strict;

    my $fileName = 'Documents and Settings/My Documents/TEST';

    create_dirs($fileName);
    open(List, '>', $fileName);

    sub create_dirs {
        my $filename = shift;
        my $dir;
        while ($filename =~ m{(/?[^/]+)(?=/)}g) {
            $dir .= $1;
            next if -d $dir;
            mkdir($dir) or die "cannot create $dir: $!";
        }
    }
      That's cool: the same thing as what I did, just in fewer lines. I like it. Can we use even fewer lines of code (notwithstanding the one-liner system call to xcopy)?
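      One possibility (a sketch, untested; File::Find is core, and make_path needs a reasonably recent File::Path) is to let File::Find do the walking:

      use strict;
      use warnings;
      use File::Find;
      use File::Path qw(make_path);

      my $topdir      = 'H:/';
      my $recreate_to = 'C:/Documents and Settings/My Documents/TEST';

      # Visit every directory under $topdir and recreate it under $recreate_to.
      find(sub {
          return unless -d;
          (my $new = $File::Find::name) =~ s/\Q$topdir/$recreate_to/;
          make_path($new);    # creates intermediate directories as needed
      }, $topdir);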