in reply to Why does File::Find chdir?

It's more efficient to chdir into a directory and then name the files within that directory without having to specify a multiple step path. You can probably even see the difference in a carefully constructed benchmark.

Hence, using chdir is the default. It's the most efficient.
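
As a minimal illustration (a sketch, not part of the original post), the snippet below shows what the callback sees in each mode. The temporary tree and the names `a/b/file.txt` are assumptions made up for the example. With the default, File::Find changes into each directory and sets `$_` to the bare file name; with `no_chdir => 1`, `$_` is the same full path as `$File::Find::name`.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a tiny throwaway tree: $root/a/b/file.txt (illustrative names only).
my $root = tempdir(CLEANUP => 1);
make_path("$root/a/b");
open my $fh, '>', "$root/a/b/file.txt" or die "open: $!";
close $fh;

# Record what $_ holds for file.txt in each mode.
my %seen;
for my $mode (qw(chdir no_chdir)) {
    find({
        wanted => sub {
            return unless $File::Find::name =~ /file\.txt\z/;
            $seen{$mode} = $_;    # bare name with chdir, full path without
        },
        no_chdir => ($mode eq 'no_chdir' ? 1 : 0),
    }, $root);
}

print "chdir:    \$_ = $seen{chdir}\n";
print "no_chdir: \$_ = $seen{no_chdir}\n";
```

Any file test or `stat` inside the callback that uses `$_` therefore resolves a one-component path in the default mode and the whole path in `no_chdir` mode.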

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.

Re^2: Why does File::Find chdir?
by hossman (Prior) on Jul 13, 2004 at 06:14 UTC

    You lost me there. When might it be less efficient to "specify a multiple step path"?

    I don't really use File::Find much, so I Super Searched for some example code to benchmark, and from what I can tell, all other factors being equal or irrelevant, no_chdir seems to be faster.

    Here's an example from one of your snippets...

    #!/usr/local/bin/perl
    use Benchmark qw(cmpthese);
    use File::Find;

    my %results;
    my $wanted = sub {
        if (-l) {    # it's a symlink
            my ($dev, $ino) = lstat _;    # reuse info from -l
            push @{$results{"$dev $ino"}}, $File::Find::name;
            if (-e) {    # that points somewhere else
                my ($dev, $ino) = stat _;    # reuse info from -e
                push @{$results{"$dev $ino"}}, "symlink:$File::Find::name";
            }
        } else {
            my ($dev, $ino) = stat;
            push @{$results{"$dev $ino"}}, $File::Find::name;
        }
    };

    my @dirs = qw(/bin /usr/bin /usr/sbin); # change this to "/" to do the

    cmpthese(1000, {
        chdir    => sub { %results=(); find { wanted=>$wanted }, @dirs; },
        no_chdir => sub { %results=(); find { wanted=>$wanted, no_chdir=>1}, @dirs; }
    });

    __END__
    laptop:~> monk.pl
    Benchmark: timing 1000 iterations of chdir, no_chdir...
         chdir: 79 wallclock secs (61.68 usr 13.90 sys +  1.23 cusr  1.03 csys = 77.84 CPU) @ 13.23/s (n=1000)
      no_chdir: 82 wallclock secs (61.94 usr 17.99 sys +  1.07 cusr  1.09 csys = 82.09 CPU) @ 12.51/s (n=1000)
               Rate no_chdir    chdir
    no_chdir 12.5/s       --      -5%
    chdir    13.2/s       6%       --

      That's not very deep, to go to /usr/bin/foo instead of using chdir first. What I meant was a deep hierarchy, like /usr/local/lib/X11/app_defaults/foo. Every path that has all those steps has to be looked up step-by-step, repeating the same work for the kernel over and over.

      Admittedly, the cost is fairly cheap these days, since modern kernels can cache a lot of the intermediate directories. But it's still a non-zero cost, and while that might not make a difference for a dozen lookups, it will for a thousand lookups.
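
To make the lookup cost concrete, here is a rough sketch (not from the original thread) that times `stat` on the same file through a long absolute path and through a bare name after `chdir`. The depth of 40 and the names `d` and `foo` are arbitrary assumptions. No result is claimed here; the gap depends on the kernel, the filesystem, and how much of the path is cached.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Cwd qw(getcwd);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a deliberately deep throwaway hierarchy: $root/d/d/.../d/foo
my $root = tempdir(CLEANUP => 1);
my $deep = join '/', $root, ('d') x 40;    # 40 nested directories (arbitrary)
make_path($deep);
open my $fh, '>', "$deep/foo" or die "open: $!";
close $fh;

my $start = getcwd();
chdir $deep or die "chdir: $!";

# Same file both times; only the number of path components differs.
cmpthese(-1, {
    full_path  => sub { my @s = stat "$deep/foo" },
    short_name => sub { my @s = stat 'foo' },
});

# Step back out so the temporary tree can be cleaned up at exit.
chdir $start or die "chdir: $!";
```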

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        Ah, yeah .. good point.

        For those who want to prove this to themselves, here's the script I used to convince myself...
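
The script itself is not reproduced here. Judging only from the invocations in the results, a negative argument builds a test tree in the current directory and a positive argument runs that many benchmark iterations. The following is a hypothetical sketch along those lines, not the original script: the helper names `build_tree` and `run_benchmark`, the directory name `d`, and the one-file-per-level layout are all assumptions, so its numbers will not match the results posted in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Find;

# Hypothetical helper: create a chain of $depth nested directories under the
# current directory, with one empty file at each level.
sub build_tree {
    my ($depth) = @_;
    my $path = '.';
    for my $level (1 .. $depth) {
        $path .= '/d';
        mkdir $path or die "mkdir $path: $!";
        open my $fh, '>', "$path/f$level" or die "open $path/f$level: $!";
        close $fh;
    }
    return $path;
}

# Hypothetical helper: time find() over '.' with and without chdir.
sub run_benchmark {
    my ($count) = @_;
    my $wanted = sub { my @s = lstat };    # lstat on $_ works in both modes
    cmpthese($count, {
        chdir    => sub { find({ wanted => $wanted }, '.') },
        no_chdir => sub { find({ wanted => $wanted, no_chdir => 1 }, '.') },
    });
}

# Negative argument: build a tree that deep. Positive: run that many iterations.
my $arg = shift @ARGV;
if    (!defined $arg) { print "usage: $0 -DEPTH | COUNT\n" }
elsif ($arg < 0)      { build_tree(-$arg) }
else                  { run_benchmark($arg) }
```

Run it as `monk.pl -50` in an empty scratch directory to build the tree, then `monk.pl 100` in the same directory to compare the two modes.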

        And the result...

        laptop:~/tmp/monk-test> ~/monk.pl -50
        laptop:~/tmp/monk-test> ~/monk.pl 100
        Benchmark: timing 100 iterations of chdir, no_chdir...
             chdir:  2 wallclock secs ( 1.39 usr  0.40 sys +  0.12 cusr  0.10 csys =  2.01 CPU) @ 55.87/s (n=100)
          no_chdir:  3 wallclock secs ( 1.50 usr  1.08 sys +  0.09 cusr  0.17 csys =  2.84 CPU) @ 38.76/s (n=100)
                   Rate no_chdir    chdir
        no_chdir 38.8/s       --     -31%
        chdir    55.9/s      44%       --
        laptop:~/tmp/monk-test> ls
        0  1  a  aa
        laptop:~/tmp/monk-test> rm -rf *
        laptop:~/tmp/monk-test> ~/monk.pl -100
        laptop:~/tmp/monk-test> ~/monk.pl 100
        Benchmark: timing 100 iterations of chdir, no_chdir...
             chdir:  4 wallclock secs ( 2.98 usr  0.71 sys +  0.13 cusr  0.12 csys =  3.94 CPU) @ 27.10/s (n=100)
          no_chdir:  8 wallclock secs ( 3.03 usr  3.98 sys +  0.13 cusr  0.21 csys =  7.35 CPU) @ 14.27/s (n=100)
                   Rate no_chdir    chdir
        no_chdir 14.3/s       --     -47%
        chdir    27.1/s      90%       --