thewalledcity has asked for the wisdom of the Perl Monks concerning the following question:

I "inherited" the maintenance of a certain program that is run weekly to generate some usage statistics. The original program used a qx/find `cat $html_dir/$good_file` -type d -name class -print | wc -l/ to count the instances of a directory named 'class' in all the subdirectories in a list.

Well, now the directories can contain spaces in their names, which drives the find command nuts, so I decided to rewrite that functionality in Perl. After getting the list of directories to look through, I am trying to use File::Find to count the 'class' subdirs. Unfortunately, this code is not descending into subdirectories correctly. When I run the program the counter only gets to 3. I know for a fact that there are more than 3 of these subdirectories.

Anyone have any ideas on things to try?

my $ltotal = 0;
foreach my $dir (@good_dirs) {
    chomp($dir);
    unless ($dir eq "") {
        File::Find::find({wanted => \&wanted}, "/mnt/$dir");
    }
}

sub wanted {
    my ($dev,$ino,$mode,$nlink,$uid,$gid);
    (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
        -d _ &&
        /^class\z/s &&
        $ltotal++;
}

Re: Problems using File::Find
by graff (Chancellor) on May 29, 2003 at 03:04 UTC
    The original cause of the problem was not so much with "find", but rather with `cat $html_dir/$good_file` -- when a shell command is run this way, the shell will always tokenize the output from cat (or any other backtick command) on whitespace, treating all whitespace the same (each token becomes a separate argument string) without paying any attention to any sort of escape mechanism (quotes, backslash, etc).

    I don't want to discourage you from figuring out the proper way to use File::Find, but in my own experience, this module has normally been significantly slower in any task when compared to the standard "find" shell utility (and let's face it, File::Find's usage has a strange, counter-intuitive feel to it, which has tripped up a lot of folks).

    Given your situation, that suddenly the list in "$html_dir/$good_file" happens to contain items with internal spaces, I would be inclined to figure out how to keep using the shell "find" approach -- e.g.:

    open(LIST, "<$html_dir/$good_file") or die "can't open $html_dir/$good_file: $!";
    my @paths = <LIST>;
    close LIST;
    chomp @paths;
    my $total = 0;
    for my $p (@paths) {
        my $count = qx/find "$p" -type d -name class | wc -l/;
        chomp $count;
        $total += $count;
    }
    Being able to put double quotes around the path argument in the find command line makes all the difference.

    You might think you spend a lot of overhead running this many subshells with "qx//", but I wouldn't be surprised if this still ends up running faster than the equivalent File::Find solution (once you get that working -- good luck with that).
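    If quoting inside qx// feels fragile (a path containing a double quote or a dollar sign would still confuse the shell), one alternative is to skip the shell entirely. The following is only a sketch, not part of the original reply: it assumes a Unix-like system with find on the PATH and a Perl recent enough to support the list form of a pipe open, and the name count_class_dirs is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count directories named 'class' under each path by
# running find(1) without a shell. The list form of open hands every
# argument to find verbatim, so spaces or quotes in a path need no escaping.
sub count_class_dirs {
    my @paths = @_;
    my $total = 0;
    for my $p (@paths) {
        open(my $fh, '-|', 'find', $p, '-type', 'd', '-name', 'class')
            or die "cannot run find on '$p': $!";
        # find prints one line per matching directory
        while (my $line = <$fh>) {
            $total++;
        }
        close $fh;
    }
    return $total;
}
```

    Two caveats with this sketch: a path that begins with "-" would be read by find as an option, and a directory name containing a newline would be counted more than once.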

    update: added "$total" to the code sample.

Re: Problems using File::Find
by TomDLux (Vicar) on May 29, 2003 at 02:30 UTC

    You don't have to follow perldoc File::Find too literally, you can make it look like an ordinary subroutine.

    You don't need your foreach, you can feed find() an array. I searched for something that exists on my system ...
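    The code that accompanied this reply is not preserved here, so the following is only a sketch of the two points above, not the original sample: find() accepts a list of starting directories, and the wanted routine can be an anonymous sub that updates a lexical counter. The helper name count_named_dirs is hypothetical.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: count directories called $name under all of @roots.
sub count_named_dirs {
    my ($name, @roots) = @_;
    my $count = 0;
    find(
        sub {
            # $_ holds the entry's basename; by default find() has already
            # chdir'ed into the directory that contains it.
            # -l does an lstat, and -d _ reuses that result, so symlinks
            # to directories are not counted (like find's -type d).
            $count++ if $_ eq $name && !-l $_ && -d _;
        },
        @roots,    # several starting points in one call, no foreach needed
    );
    return $count;
}
```

    For the original question this might be called as count_named_dirs('class', map { "/mnt/$_" } @good_dirs), once the list has been chomped and empty entries removed.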

    Is it possible you only have three directories named 'class'?

      I have counted at least 10 by hand, and there should be on the order of a few hundred.
Re: Problems using File::Find
by BrowserUk (Patriarch) on May 29, 2003 at 03:03 UTC

    This isn't a case thing by any chance? Try adding /i to your regex. As you have it, the regex won't match 'Class' or 'CLASS' or ...

    Also, not that it will affect what you are finding, but it would be better to do your -d check before calling lstat (**if you need to call it at all?), as currently you are needlessly running lstat on every file and directory in the tree(s).

    ** File::Find guarantees to have run lstat on every file/dir it gives you (per its documentation, only when the follow or follow_fast option is in effect) so that you can use the magic _ in your -X tests. Unless you need the results of lstat, it is just wasted effort.
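    A minimal sketch of what a slimmed-down wanted routine could look like with both suggestions applied: a case-insensitive name match, and no explicit lstat call. It is an illustration rather than a diagnosis of the original problem, and the helper name count_class_dirs_nocase is made up.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: count directories named 'class' in any letter case.
sub count_class_dirs_nocase {
    my @roots = @_;
    my $count = 0;
    my $wanted = sub {
        return unless /^class\z/i;   # cheap name test first, any case
        return unless -d $_;         # stat only entries whose name matched
        $count++;
    };
    find($wanted, @roots);
    return $count;
}
```

    Note that -d $_ uses stat rather than lstat, so a symlink named 'class' that points at a directory is counted as well; whether that is wanted depends on the tree being scanned.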


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Re: Problems using File::Find
by broquaint (Abbot) on May 29, 2003 at 09:35 UTC
    Sounds like a job for ... File::Find::Rule!
    use File::Find::Rule;
    my $cnt = scalar find(
        directory => name => 'class',
        in        => \@good_dirs,   # pass the list of directories as an array reference
    );
    That should roughly do what you want, but if not, tweak as you see fit. See the File::Find::Rule docs for more info on this most marvellous of modules.
    HTH

    _________
    broquaint

Re: Problems using File::Find
by VSarkiss (Monsignor) on May 29, 2003 at 02:52 UTC

    The problem may be with symbolic links. You may be counting the same linked-to directory twice (easy to do when they're all named "class" ;-). Try setting the follow option to see if it makes a difference: find({wanted => \&wanted, follow => 1}, "/mnt/$dir");

Obligatory reference to File::Find guide
by data64 (Chaplain) on May 31, 2003 at 01:35 UTC

    Beginners guide to File::Find
    Also, see the excellent replies to the original post which give even more information.


    Just a tongue-tied, twisted, earth-bound misfit. -- Pink Floyd