Speedy directory searching

ironpaw has asked for the wisdom of the Perl Monks concerning the following question:

Lo Monks, I have written (in my usual terrible Perl) a script to search for certain files and append some entries to the end of each. It works fine, but it is slower than Windows 98 with 10,000 temp files. Any ideas on speeding it up? The purpose is to add some TNSNAMES entries to our Oracle client systems.

NAMES HAVE BEEN CHANGED TO PROTECT THE INNOCENT

Thanks, Ironpaw

use File::Find;            

$feeddir     = "c:/";

chdir("$feeddir");        
find(\&wanted, "$feeddir");    

sub wanted     
{                
        /\TNSNAMES.ORA/i or return;        
    $filename = $File::Find::name;        
    &Details($filename);            
}                        

sub Details 
{
$file = shift;

open(CSV, ">>$file");

print "$file\n";

print CSV "\n";
print CSV "SERVER01.dbdev1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)) )(CONNECT_DATA=(SID=dbDEV1)))\n";
print CSV "SERVER01.dbdev2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)) )(CONNECT_DATA=(SID=dbDEV2)))\n";
print CSV "SERVER01.dbintst=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)) )(CONNECT_DATA=(SID=dbINTST)))\n";
print CSV "SERVER01.dbmstr=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)) )(CONNECT_DATA=(SID=dbmstr)))\n";

Replies are listed 'Best First'.
Re: Speedy directory searching by BrowserUk (Patriarch) on Jul 15, 2003 at 05:24 UTC
Is this running locally, or across a network? You show "c:/" but you also say that "names were changed". It's a trivial point in this case, but why do you assign `$filename = $File::Find::name;` before passing it to `&Details($filename);`? If you really need to speed this up, there are probably things that can be done, but you'd need to supply a little more information. Just how slow is it? How many files are being checked? How many match your criteria and are being appended to? How long is this taking? Which OS? What filesystem type? What version of perl?
Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
Re: Re: Speedy directory searching by ironpaw (Novice) on Jul 17, 2003 at 02:06 UTC
Answer: 65,000 files in 12,000 directories. It takes 2 1/2 mins, reduced to 2 mins 10 sec by the better way of calling subs suggested in one of the replies. OS is Win2000, NTFS; Perl is version 5.005_02. Windows takes about 2 mins to search all files with indexing on, so I am guessing that with so many files this is as fast as I can expect. I was wondering if there was something obvious I was missing, but lots of files simply take time to search... Thanks, Paw
Re: Speedy directory searching by bobn (Chaplain) on Jul 15, 2003 at 04:48 UTC
The '\' in your regular expression is useless twice over: 1) it is not escaped with another '\', so it never matches a literal backslash, and 2) $_ will contain only the filename, not any of the separators. Your print can be done with:
print CSV "
SERVER01.dbdev1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)))(CONNECT_DATA=(SID=dbDEV1)))
SERVER01.dbdev2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)))(CONNECT_DATA=(SID=dbDEV2)))
SERVER01.dbintst=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)))(CONNECT_DATA=(SID=dbINTST)))
SERVER01.dbmstr=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)))(CONNECT_DATA=(SID=dbmstr)))
";
Probably. This would definitely be OK on *nix, but with the odd newline sequence in Win32, it may not act as expected... But I really don't know why it's so slow. Maybe you should count how many times wanted() gets called - perhaps you just have a god-awful number of files on your system. --Bob Niederman, http://bob-n.com
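One way to act on the "count how many times wanted() gets called" suggestion is sketched below. It is only an illustration, not the poster's script: the helper names (`count_entries`, `make_file`) and the throwaway directory layout are made up for the demo, and it assumes a Perl new enough to ship Time::HiRes and File::Temp (5.8 or later), which is newer than the 5.005_02 mentioned in this thread. Point `count_entries` at the real start directory to see how much of the run time is pure tree walking.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use Time::HiRes qw(time);

# Walk $root and count how often the wanted callback fires,
# how many of those entries are directories, and how many match.
sub count_entries {
    my ($root) = @_;
    my %count = (visited => 0, dirs => 0, matched => 0);
    find(sub {
        $count{visited}++;
        $count{dirs}++    if -d $_;
        $count{matched}++ if /^tnsnames\.ora$/i;
    }, $root);
    return \%count;
}

# Hypothetical helper: create an empty file for the demo tree.
sub make_file {
    my ($path) = @_;
    open(my $fh, '>', $path) or die "Cannot create $path: $!";
    close($fh);
}

# Build a tiny throwaway tree so the sketch touches no real files.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/a"   or die "mkdir failed: $!";
mkdir "$root/a/b" or die "mkdir failed: $!";
make_file("$root/a/tnsnames.ora");
make_file("$root/a/b/readme.txt");
make_file("$root/top.log");

my $start   = time;
my $count   = count_entries($root);
my $elapsed = time - $start;

printf "wanted() called %d times (%d directories, %d matches) in %.3f s\n",
    $count->{visited}, $count->{dirs}, $count->{matched}, $elapsed;
```

If the walk alone accounts for most of the elapsed time, the appending is not the bottleneck and only searching fewer directories will help.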
Re: Re: Speedy directory searching by waswas-fng (Curate) on Jul 15, 2003 at 05:23 UTC
It is slower on Win98 for many reasons, one of the largest being that FAT32 is slow and does not get cached well when walking the directory tree. Also, use strict and heredocs, and don't use & to call a sub (unless you have good reason). For example, your wanted sub can look like this: `sub wanted { /TNSNAMES.ORA/i or return; Details($File::Find::name); }` -Waswas
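Putting the suggestions from this thread together (strict, a heredoc instead of repeated prints, plain sub calls, and the missing close), the whole script might look roughly like the sketch below. Several things here are assumptions rather than anything from the original post: the sub names `append_entries` and `find_and_append`, the tightened pattern `/^tnsnames\.ora$/i` (the original matches the name anywhere), the shortened two-entry list, and the use of lexical filehandles with three-argument open, which need Perl 5.6 or later. The demo runs against a temporary directory so that nothing real is modified.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Entries to append, written once as a heredoc (only two shown here).
my $entries = <<'END_ENTRIES';

SERVER01.dbdev1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)))(CONNECT_DATA=(SID=dbDEV1)))
SERVER01.dbdev2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER01)(PORT=1527)))(CONNECT_DATA=(SID=dbDEV2)))
END_ENTRIES

# Append the entries to one file; returns 1 on success, 0 on failure.
sub append_entries {
    my ($file) = @_;
    my $fh;
    unless (open($fh, '>>', $file)) {
        warn "Cannot append to $file: $!\n";
        return 0;
    }
    print {$fh} $entries;
    close($fh) or warn "Cannot close $file: $!\n";
    return 1;
}

# Walk $root and append to every file named tnsnames.ora (any case).
sub find_and_append {
    my ($root) = @_;
    my @touched;
    find(sub {
        return unless /^tnsnames\.ora$/i;
        my $path = $File::Find::name;
        push @touched, $path if append_entries($path);
    }, $root);
    return @touched;
}

# Demo on a throwaway tree instead of c:/.
my $demo_root = tempdir(CLEANUP => 1);
mkdir "$demo_root/network" or die "mkdir failed: $!";
for my $file ("$demo_root/network/TNSNAMES.ORA", "$demo_root/other.txt") {
    open(my $fh, '>', $file) or die "Cannot create $file: $!";
    print {$fh} "# existing\n";
    close($fh);
}

my @touched = find_and_append($demo_root);
print "$_\n" for @touched;
```

None of this is likely to change the run time much, since the directory walk dominates; the gain is that failures to open a file are reported instead of silently ignored.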
Re: Re: Re: Speedy directory searching by ironpaw (Novice) on Jul 16, 2003 at 04:09 UTC
I'm using 2k NTFS locally. The better use of subs took the time from 2 mins 30 secs to 2 mins 10 secs, which is well worthwhile (I will remember this, so thanks). use strict is confusing and seems only to make my badly written but working scripts stop working. I have yet to see an explanation of use strict that makes sense to me (I'll read it again now that I have a little better understanding in general, so thanks for the reminder). I know I should use it; I don't know how or why, so I'll RTFM and try to remember ;) Thanks again, the sub stuff is great.
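For what it's worth, the main thing use strict buys is that a misspelled variable name becomes a compile-time error instead of a silently created new global. The small sketch below shows that with a deliberately misspelled, made-up variable (`$filenmae`); the snippets are compiled through string eval only so that the error can be captured and printed rather than killing the demo.

```perl
use strict;
use warnings;

# Same snippet twice: once without strict, once with it.
my $without_strict = q{
    no strict 'vars';
    my $filename = 'tnsnames.ora';
    $filenmae = 'other.ora';    # typo quietly creates a new global
    $filename;                  # still 'tnsnames.ora'
};

my $with_strict = q{
    use strict;
    my $filename = 'tnsnames.ora';
    $filenmae = 'other.ora';    # typo is now a compile-time error
    $filename;
};

my $loose_result  = eval $without_strict;
my $loose_error   = $@;
my $strict_result = eval $with_strict;
my $strict_error  = $@;

print "without strict: ",
    (defined $loose_result ? "ran, filename is still $loose_result\n" : "failed: $loose_error");
print "with strict:    ",
    (defined $strict_result ? "ran\n" : "refused to compile: $strict_error");
```

Declaring each variable with my where it is first used is usually all it takes to make a working script pass under strict.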
Re: Re: Speedy directory searching by ironpaw (Novice) on Jul 16, 2003 at 03:51 UTC
Yes, the regular expression has a useless \ (a legacy of using bits of other scripts and not cleaning them up). Your print suggestion is much better, thanks (I'll remember that). It does work on my Win2000 (perl 5.005_02 built for MSWin32-x86-object). It took about 2 mins 30 seconds and the system has 66,150 files in 7,850 directories, so I guess that is pretty reasonable (Windows takes 2 mins to search for all files and it has the indexing service). My only option would be to limit the places it searches, I guess. Thanks
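Limiting the places searched can be done inside File::Find itself by setting `$File::Find::prune` when the callback sees a directory that should be skipped. The sketch below assumes a skip list of directory names (`temp` and `recycler` are placeholders, not taken from the thread) and a hypothetical sub name `find_tnsnames`; the other obvious option is simply to pass find() a short list of likely start directories instead of the whole drive.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(mkpath);
use File::Temp qw(tempdir);

# Return every tnsnames.ora under $root, never descending into
# directories whose (lower-cased) name appears in the %$skip hash.
sub find_tnsnames {
    my ($root, $skip) = @_;
    my @found;
    find(sub {
        if (-d $_ && $skip->{lc $_}) {
            $File::Find::prune = 1;    # do not descend into this directory
            return;
        }
        push @found, $File::Find::name if /^tnsnames\.ora$/i;
    }, $root);
    return @found;
}

# Placeholder skip list; real candidates depend on the machine.
my %skip = map { lc($_) => 1 } ('temp', 'recycler');

# Demo tree: one file in a wanted location, one inside a skipped directory.
my $root = tempdir(CLEANUP => 1);
mkpath("$root/oracle/network/admin");
mkpath("$root/temp/backup");
for my $file ("$root/oracle/network/admin/tnsnames.ora",
              "$root/temp/backup/tnsnames.ora") {
    open(my $fh, '>', $file) or die "Cannot create $file: $!";
    close($fh);
}

my @found = find_tnsnames($root, \%skip);
print "$_\n" for @found;
```

Pruning only works in the default top-down traversal; it has no effect if find() is asked to process directories depth-first.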
Re: Speedy directory searching by waswas-fng (Curate) on Jul 15, 2003 at 04:40 UTC
Your sub is missing a close(CSV); and a closing }. Is this a bad paste? -Waswas
Re: Re: Speedy directory searching by ironpaw (Novice) on Jul 16, 2003 at 03:37 UTC
Yes, it was a bad paste. The list of entries carried on quite a bit and I deleted the end of the file... oops.