in reply to Quick file search

It would be interesting to see how much faster the win/dos "tree" utility is than the File::Find module (or any of its relatives on CPAN). I know that in unix, File::Find takes much longer than the standard "find" utility for a directory tree of any considerable size.

The only problem with the OP code is that you have to remember to run "tree" yourself -- and save its output in a specific file whose name is hard-coded in your perl script -- before running the script, so that you know your file list is up-to-date. But that's unnecessary -- just run "tree" inside the perl script (it'll still run just as fast):

my $music_path = ".\\music";  # probably should make this an absolute path...
open my $fh, "D:\\tree /f /a $music_path |"
    or die "cannot run tree on $music_path: $!";
while ( <$fh> ) {
    # everything else stays the same...
}
(not tested -- I'm not a windows user)

Actually, it might also be interesting to see whether a win/dos port of the unix "find" utility is slower or faster than "tree"...

Re^2: Quick file search
by zentara (Cardinal) on Jan 05, 2006 at 11:46 UTC
    I know that in unix, File::Find takes much longer than the standard "find" utility for a directory tree of any considerable size.

    This was discussed at considerable length in Myth busted: Shell isn't always faster than Perl, and I was wondering whether you can back up the claim of "much longer". My tests show differences of only a fraction of a second on a 100 MB tree.


    I'm not really a human, but I play one on earth. flash japh
      The evidence I've gathered about unix "find" vs. File::Find is at least a couple of years old; I haven't checked whether File::Find has been updated since then, nor retested it to see whether it has improved. (I assume unix "find" remains about the same.)

      In any case, I posted some benchmarks here and here. Looking at those again just now, my statement above about "5- or 6-to-1" should have simply been "5-to-1".

Re^2: Quick file search
by sh1tn (Priest) on Jan 05, 2006 at 05:36 UTC
    The lack of unix "find" utility is enormous but "tree" filtered results solve to some extent this problem:
    # rough comparison
    D:\music>perl -MFile::Find -e "find(sub{$File::Find::name}, '.');print time-$^T"
    6
    D:\music>perl -e "`tree /f /a`;print time-$^T"
    1
    Where "music" contains 5547 files (without the directory inodes).
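Since `time - $^T` only resolves whole seconds, a rough comparison like the one above could be repeated with sub-second timing. The following is a minimal sketch using Time::HiRes; the helper names `count_with_file_find` and `timed` are made up for illustration, and the commented-out `tree` call assumes a Windows shell.

```perl
use strict;
use warnings;
use File::Find ();
use Time::HiRes qw(time);    # replaces time() with a floating-point version

# Count plain files under $dir with File::Find.
sub count_with_file_find {
    my ($dir) = @_;
    my $count = 0;
    # find() chdirs into each directory, so $_ is the bare file name.
    File::Find::find( sub { $count++ if -f $_ }, $dir );
    return $count;
}

# Run $code once and return (elapsed seconds, result of $code).
sub timed {
    my ($code) = @_;
    my $start  = time;
    my $result = $code->();
    return ( time - $start, $result );
}

# Hypothetical usage on Windows (not run here):
# my ( $t_ff,   $files ) = timed( sub { count_with_file_find('.') } );
# my ( $t_tree, $lines ) = timed( sub { my $n = () = `tree /f /a`; $n } );
# printf "File::Find: %.3fs (%d files)  tree: %.3fs (%d lines)\n",
#     $t_ff, $files, $t_tree, $lines;
```

Running each variant several times and discarding the first run would reduce the effect of the filesystem cache on the numbers.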


      The lack of unix "find" utility is enormous

      I'm wondering what you mean by that. There are windows ports of "find" available (google "unix tools for windows"); the AT&T Research Labs version and the cygwin version are both authentically "unix-like" (or maybe "gnu-ish" is the better term).

      "tree" filtered results solve to some extent this problem

      Oh yes. That seems consistent with what I've seen in the unix domain -- about a 5- or 6-to-1 speed ratio comparing File::Find to "find". For really big directory trees, that multiplier becomes devastatingly significant.

        One thing to note is that you need to put the Cygwin directory before the Windows directories in the PATH for this to work, since there already is a "find" on the system (although it's really a rather weak "grep").

        /J
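To check which "find" would win, one could list every candidate along the PATH in search order. This is only a sketch: the helper name `candidates_in_path` is invented here, and it looks only for the bare name and a `.exe` variant, whereas a real Windows shell also consults PATHEXT and the current directory.

```perl
use strict;
use warnings;
use File::Spec;

# Return every file named $name (or "$name.exe") found in @dirs,
# in the same order as the directories were given.
sub candidates_in_path {
    my ( $name, @dirs ) = @_;
    my @hits;
    for my $dir (@dirs) {
        for my $file ( $name, "$name.exe" ) {
            my $full = File::Spec->catfile( $dir, $file );
            push @hits, $full if -f $full;
        }
    }
    return @hits;
}

# Usage: the first line printed is the one found first along the PATH.
# print "$_\n" for candidates_in_path( 'find', File::Spec->path );
```

If the first hit is under the Windows system directory rather than the Cygwin one, the PATH order is the thing to change.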

        No need for a tool from the "outside" world; maybe just:
        my $search_what  = shift || die "no search criteria provided\n";
        my $search_where = shift;
        # fall back to the current directory if no valid directory was given
        $search_where = '.' unless defined $search_where && -d $search_where;
        my $clean = qr/^\W+(.+?)\W+$/;  # strips tree's line-drawing characters and the newline
        my $dir   = '.';                # directory that tree is currently listing
        for (`tree /a /f $search_where`) {
            # a line containing "+---" or "\---" names a new directory
            /(\\|\+)-{3}/ and $dir = $_ and next;
            if (/$search_what/) {
                s/$clean/$1/ for $_, $dir;
                print "dir: $dir$/", "file: $_$/$/";
            }
        }
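The snippet above reports only the name of the innermost directory. If full paths are wanted, the indentation of the "tree /a /f" listing can be used to rebuild them. The sketch below assumes that each nesting level is drawn as a four-character group ("|" or a space, then three spaces) and that directory lines carry a "+---" or "\---" marker. That matches the output I am aware of, but it is an assumption rather than documented behaviour; `tree_paths` is a made-up name, and names that begin with a space are not handled.

```perl
use strict;
use warnings;

# Rebuild full file paths from the lines of a "tree /a /f" listing.
# $root is the directory the listing was made for.
sub tree_paths {
    my ( $root, @lines ) = @_;
    my @stack = ($root);    # directories from $root down to the current one
    my @paths;
    for my $line (@lines) {
        $line =~ s/\s+\z//;    # drop the newline and trailing blanks
        if ( $line =~ /^((?:[| ] {3})*)[+\\]---(.+)$/ ) {
            # directory line: nesting depth = number of 4-character groups
            my $depth = length($1) / 4;
            splice @stack, $depth + 1;    # forget deeper directories
            push @stack, $2;
        }
        elsif ( $line =~ /^((?:[| ] {3})+)([^|\s].*)$/ ) {
            # file line: one group per directory level, including $root
            my $groups = length($1) / 4;
            push @paths, join '\\', @stack[ 0 .. $groups - 1 ], $2;
        }
        # anything else (header lines, spacer lines) is ignored
    }
    return @paths;
}

# Hypothetical usage on Windows (not run here):
# print "$_\n" for grep { /\.mp3$/i } tree_paths( '.\music', `tree /a /f .\\music` );
```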