Re: Execution hangs on File::Find
by Eily (Monsignor) on Dec 04, 2017 at 11:34 UTC
It sounds like execution just takes a while, probably because there are a lot of files on your mapped drive and access is slower than on a local drive (I'm guessing this is a network directory?). Maybe you can log progress (e.g. report the current position every 100 files) to show that something is still happening.
One obvious optimization is to only traverse the tree once. And since this would make the "wanted" function a little more complex, it's easier to read if you don't define it directly in the call to find:
# Assumes "use File::Find;" and "use File::stat;" plus $logger and $indir
# from the original script.
{
    my $lastepoch = time;
    my @files;
    my @todel;
    my $counter = 0;
    sub wanted
    {
        # Report progress on every 100th entry
        $logger->info("Reached $File::Find::name") if ($counter++ % 100) == 0;
        push @files, $File::Find::name
            if (/\.*$/ and -f $File::Find::name
                and stat($File::Find::name)->mtime > $lastepoch);
        push @todel, $File::Find::name if (-d $File::Find::name);
    }
    sub Get_FileList
    {
        @files = ();
        @todel = ();
        find(\&wanted, $_[0]);
        $lastepoch = time;
        ...
        return (\@files, \@todel);
    }
}
my ($files, $todel) = Get_FileList($indir);
I've also put the test on the file name first, it's a guess that it might be a little faster, because the name is already in memory and doesn't require drive/network access. And I'm not sure but it kind of looks like you have global variables, so I've shown you how you can have variables shared between functions without making them available everywhere.
sub wanted
{
    $logger->debug("Reached $File::Find::name");
    $logger->info(Carp::longmess("TRACE2"));
    if (-f $File::Find::name)
    {
        $logger->debug("This is a file: checking time");
        stat($File::Find::name)->mtime > $lastepoch or return;
        push @files,$File::Find::name;
    }
    else
    {
        $logger->debug("Not a file");
        if (-d $File::Find::name)
        {
            $logger->debug("This is a directory");
            push @todel,$File::Find::name;
        }
        else
        {
            $logger->debug("Not a directory");
        }
    }
    $logger->debug("Exiting");
}
BTW, /\.*$/ means that the name must have zero or more dots at the end. This is true of every string, so the test is useless.
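To make the point about the pattern concrete, here is a small sketch that runs the original test against a few sample names. The `has_extension` pattern is only a hypothetical stricter alternative, on the assumption that the intent was "the name has an extension"; the thread does not say what the original author actually wanted.

```perl
use strict;
use warnings;

# The test from the original code: zero or more dots before end of string.
sub matches_original {
    my ($name) = @_;
    return $name =~ /\.*$/ ? 1 : 0;
}

# Hypothetical alternative (an assumption, not from the thread):
# require a dot followed by at least one non-dot, non-slash character.
sub has_extension {
    my ($name) = @_;
    return $name =~ /\.[^.\/\\]+$/ ? 1 : 0;
}

for my $name ('report.txt', 'README', 'archive.tar.gz', '') {
    printf "%-18s original=%d extension=%d\n",
        "'$name'", matches_original($name), has_extension($name);
}
```

The original test accepts every sample, including the empty string, while the stricter pattern rejects 'README' and ''.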
Re: Execution hangs on File::Find
by jahero (Pilgrim) on Dec 04, 2017 at 11:51 UTC
I have encountered similar behaviour when iterating over a rather large directory tree over the network, and I believe this *might* help a bit, although it is hard to say whether your problem is the same as mine was.
sub Get_FileList{
    @files=();
    @todel=();
    ####### WORKAROUND START
    my $sloppy = ${^WIN32_SLOPPY_STAT};
    ${^WIN32_SLOPPY_STAT} = 1;
    ####### WORKAROUND END
    find(sub {push @files,$File::Find::name if (-f $File::Find::name and /\.*$/ and stat($File::Find::name)->mtime > $lastepoch);}, $indir);
    find(sub {push @todel,$File::Find::name if (-d $File::Find::name); }, $indir);
    $lastepoch = time; #Update last execution time for the next run
    $logger->info("New execution time update: $lastepoch.");
    $proccount = scalar(@files);
    $logger->info("Found $proccount new files since last run.");
    ####### REVERT WORKAROUND
    ${^WIN32_SLOPPY_STAT} = $sloppy;
    ####### REVERT WORKAROUND END
}
See ${^WIN32_SLOPPY_STAT}
++ed the workaround, but I'm wondering whether it's possible to use local ${^WIN32_SLOPPY_STAT} = 1; instead of saving/restoring the value manually?
It should be possible (might test it in my code later on).
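A minimal sketch of the `local` variant discussed above. It has not been tested against a real mapped drive, and `list_files_sloppy` is a hypothetical helper name, not something from the original script. The variable only changes stat behaviour on Windows builds of Perl that support it; elsewhere the assignment is harmless.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect plain files under $dir with sloppy stat on.
sub list_files_sloppy {
    my ($dir) = @_;
    my @found;

    # local restores the previous value automatically when this sub
    # returns, even if find() dies part-way through, so no manual
    # save/restore is needed.
    local ${^WIN32_SLOPPY_STAT} = 1;

    # find() chdirs into each directory by default, so -f $_ works here.
    find(sub { push @found, $File::Find::name if -f $_ }, $dir);

    return \@found;
}
```

Because the scope of `local` ends with the sub, the workaround cannot leak into the rest of the program the way a forgotten manual restore could.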
Then I guess the next step should be testing your code over a smaller directory tree, to see whether it hangs and/or where it spends most of the time.
I find it hard to believe that File::Find would freeze, although I can hardly back up this claim with concrete evidence, so I would probably profile the run (Devel::NYTProf) on something small first.
Good luck, perhaps someone with more knowledge than humble me shall help.
What does "dir maped" return from cmd.exe?
Re: Execution hangs on File::Find
by Dallaylaen (Chaplain) on Dec 04, 2017 at 11:26 UTC
use Carp;
$SIG{ALRM} = sub { Carp::cluck("HERE") };
alarm 10;
# ... now proceed with files
This would print a stacktrace after 10 seconds of execution, providing pointers about where the code is.
Alternatively you can bind to $SIG{INT} and press Ctrl-C to see where you are.
Or you can use $logger->info(Carp::longmess("HERE")); instead of cluck.
Also I'm wondering whether the script has a use strict; in it. Because if it doesn't, debugging becomes twice as hard.
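The $SIG{INT} alternative mentioned above could look like the sketch below. It assumes the script runs from a console where STDERR is visible. Note that with this handler installed, Ctrl-C prints a trace and the script keeps running, so it has to be stopped some other way.

```perl
use strict;
use warnings;
use Carp ();

# Print a stack trace on Ctrl-C instead of terminating the script.
$SIG{INT} = sub { Carp::cluck("Interrupted here") };

# ... now proceed with files; each Ctrl-C shows where the code currently is
```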
sub Get_FileList{
    @files=();
    @todel=();
    $logger->info(Carp::longmess("TRACE1"));
    find (sub {
        $logger->debug("Reached $File::Find::name");
        $logger->info(Carp::longmess("TRACE2"));
        push @files,$File::Find::name if (-f $File::Find::name and /\.*$/ and stat($File::Find::name)->mtime > $lastepoch);
        push @todel,$File::Find::name if (-d $File::Find::name);}, $indir);
    $logger->info(Carp::longmess("TRACE3"));
    $lastepoch = time; #Update last execution time for the next run
    $logger->info("New execution time update: $lastepoch.");
    $proccount = scalar(@files);
    $logger->info("Found $proccount new files since last run.");
}
Logging below. I guess it really is getting stuck when it tries to traverse the mapped drive.
04-12-2017 04:06:56:570 INFO TRACE1 at perl.pl line 85.
04-12-2017 04:06:56:577 DEBUG Reached Z:/
04-12-2017 04:06:56:577 INFO TRACE2 at C:/Strawberry/perl/lib/File/Find.pm line 358.
    File::Find::_find_dir(HASH(0x5174bd8), "Z:/", 1) called at C:/Strawberry/perl/lib/File/Find.pm line 236
    File::Find::_find_opt(HASH(0x5174bd8), "Z:/") called at C:/Strawberry/perl/lib/File/Find.pm line 760
    File::Find::find(CODE(0x4e8b7b8), "Z:/") called at perl.pl line 118
    main::Get_FileList() called at perl.pl line 85
Re: Execution hangs on File::Find
by colox (Sexton) on Dec 04, 2017 at 15:34 UTC
Dear monks,
As a follow-up, this indicates that the overhead is caused by File::Find's recursive search.
I'm wondering whether there is also a recursive readdir. If so, could someone share the syntax? I would like to do a comparative study of the performance.