Fuism has asked for the wisdom of the Perl Monks concerning the following question:

What would be the best way to search for files in subdirectories nested inside other subdirectories? I'm writing a script to find "*.sas" files and open them to search for keywords (@words). Some home directories have several levels of subdirectories, and the depth varies from one to the next. I hope I'm not confusing anyone; if I am, please let me know. Below is the code I'm using, which only looks one subdirectory deep within a directory. How would I make it descend into further subdirectories if they exist? Oh, and I can't use File::Find because I'm running it on the server and don't have SA permissions! Thanks in advance for all your help!!! Fue
#!/usr/bin/perl

# Open Directory to scan
die("Cannot open /cdw/home_dir/.") unless(opendir(DIR, "/cdw/home_dir/"));

# Open file to print files
die("Connot open /cdw/home_dir/s006258/CHECK_SAS/cant_open_file.txt.")
    unless(open(OUTFILE, ">/cdw/home_dir/s006258/CHECK_SAS/cant_open_file.txt"));

# Open file to print words that exists in file
die("Connot open /cdw/home_dir/s006258/CHECK_SAS/word_exist_list.txt.")
    unless(open(EXIST, ">/cdw/home_dir/s006258/CHECK_SAS/word_exist_list.txt"));

# List of words to search for
@words = (
    "\$BIDI", "\%ARMCONV", "\%CENTROID", "\%MAPLABEL", "ALLPERM", "ANYALNUM",
    "ANYALPHA", "ANYCNTRL", "ANYDIGIT", "ANYFIRST", "ANYGRAPH", "ANYLOWER",
    "ANYNAME", "ANYPRINT", "ANYPUNCT", "ANYSPACE", "ANYUPPER", "ANYXDIGIT",
    "ARMAGENT", "ARMLOC", "ARMSUBSYS", "BETA", "BOUNDS", "BYSORTED", "CAT",
    "CATS", "CATT", "CATX", "CEILZ", "CHTML", "CLONE", "COLNAME", "COLUMNS",
    "COMPARE", "COMPAREREG1", "COMPCOST", "COMPGED", "COMPLEV", "COUNTC",
    "CPUCOUNT", "CSVALL", "CUMIPMT", "CUMPRINC", "DATECOPY", "DISTANCE",
    "DMSSYNCHK", "DOCBOOK", "DTRESET", "ECTREND", "EMAILAUTHPROTOCOL",
    "EMAILID", "EMAILPW", "ENCODING", "ENTROPY", "ERRORBYABEND", "FINDC",
    "FLOORZ", "FONTSLOC", "FORCE", "FRONTREF", "GAREABAR", "GBARLINE",
    "HELPENCMD", "HTMLCSS", "IBUFSIZE", "IMODE", "INTCINDEX", "INTCYCLE",
    "INTFMT", "INTINDEX", "INTSEA", "INTZ", "IPMT", "IQR", "ITPRINT", "KCVT",
    "LENGTHC", "LENGTHM", "LENGTHN", "LISTREG", "LOGBETA", "LOGPARM", "MAD",
    "MAPIMPORT", "MARKUP", "MAXITER", "METAID", "METAPASS", "METAPORT",
    "METAPROTOCOL", "METAREPOSITORY", "METASERVER", "METAUSER", "MODZ",
    "NLITERAL", "NOPROMAXNORM", "NOTALNUM", "NOTALPHA", "NOTCNTRL",
    "NOTDIGIT", "NOTFIRST", "NOTGRAPH", "NOTHREADS", "NOTLOWER", "NOTNAME",
    "NOTOP", "NOTPRINT", "NOTPUNCT", "NOTSPACE", "NOUPPER", "OUTLIER",
    "PAGEBREAKINITIAL", "PCTL", "PMT", "PPMT", "PRINTERR", "PRXCHANGE",
    "PRXDEBUG", "PRXFREE", "PRXMATCH", "PRXNEXT", "PRXPAREN", "PRXPARSE",
    "PRXPOSN", "PRXSUBSTR", "PUTLOG", "QUOTELENMAX", "RANPERK", "RANPERM",
    "ROBUSTREG", "ROLE", "ROUNDE", "ROUNDZ", "SCANQ", "SMM", "SORTEQUALS",
    "SORTSEQ", "SORTSIZE", "STDIZE", "STREAMINIT", "SURVEYFREQ", "SURVIVAL",
    "SWFONTRENDER", "SYLK", "TAGSET", "TERMSTMT", "TEXTURELOC", "THREADS",
    "TOL", "TOOLSMENU", "UCM", "USERINPUT", "UTILLOC", "UTOMDL",
    "V6CREATEUPDATE", "VALIDFMTNAME", "VIEWMENU", "VNEXT", "VVALUE",
    "VVALUEX", "WML"
);
print "@words";

# Reads in each Directory
for my $dir (readdir(DIR)) {
    #print "$dir\n";
    if ($dir eq "s006258") {
        # Open each Directory to scan for sas files
        print("Cannot open /cdw/home_dir/$dir \n")
            unless(opendir(HOME, "/cdw/home_dir/$dir"));
        for my $file (readdir(HOME)) {
            #Scans for .sas files &sas_scan()
            if ($file =~ /.sas$/) {
                print "$file SAS\n";
                #Prints .sas files that cannot be opened into OUTFILE
                print OUTFILE ("Cannot open the file /cdw/home_dir/$dir/$file \n")
                    unless(open(INFILE, "/cdw/home_dir/$dir/$file"));
                while ($read = <INFILE>) {
                    for ($i = 0; $i < @words; $i++) {
                        if ($read =~ /$word[$i]/) {
                            print EXIST "The $word[$i] exists in /cdw/home_dir/$dir/$file \n";
                            print EXIST "on line: $read\n\n";
                        }
                    }
                }
            } else {
                # Open each Directory to scan for sas files
                print OUTFILE ("Cannot open dir /cdw/home_dir/$file \n")
                    unless(opendir(DIR, "/cdw/home_dir/$file"));
                for my $file (readdir(HOME)) {
                    #Scans for .sas files &sas_scan()
                    if ($file =~ /.sas$/) {
                        print "$file SAS\n";
                        #Prints .sas files that cannot be opened into OUTFILE
                        print OUTFILE ("Cannot open the file /cdw/home_dir/$dir/$file \n")
                            unless(open(INFILE, "/cdw/home_dir/$dir/$file"));
                        while ($read = <INFILE>) {
                            for ($i = 0; $i < @words; $i++) {
                                if ($read =~ /$word[$i]/) {
                                    print EXIST "The $word[$i] exists in /cdw/home_dir/$dir/$file \n";
                                    print EXIST "on line: $read\n\n";
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Re: Searching Sub Dir for Files
by Fletch (Bishop) on Jun 01, 2005 at 18:56 UTC
Re: Searching Sub Dir for Files
by davidrw (Prior) on Jun 01, 2005 at 18:56 UTC
    You can use File::Find to do the traversing for you -- it will do exactly what you want (recursively find *.sas files and do something to them).
    As to modifying your code, you would need to toss what you have into a function and then have it call itself (recursion) on any subdirs found. This is all the work that File::Find will do automagically for you.
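    The recursion described above might look roughly like the sketch below, which uses only Perl built-ins. The name find_sas_files is hypothetical, and skipping symbolic links is an assumption made here to avoid endless loops, not something the original script does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: recursively collect every *.sas file below $dir.
sub find_sas_files {
    my ($dir) = @_;
    my @found;

    # A lexical handle matters here: each level of recursion needs its own,
    # whereas a bareword handle like DIR would be shared and clobbered.
    opendir(my $dh, $dir) or do {
        warn "Cannot open directory $dir: $!\n";
        return;
    };
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);

    for my $entry (sort @entries) {
        my $path = "$dir/$entry";
        if (-l $path) {
            next;    # assumption: skip symlinks (files too) so link loops cannot recurse forever
        }
        elsif (-d $path) {
            push @found, find_sas_files($path);    # descend one level
        }
        elsif (-f $path && $entry =~ /\.sas$/) {
            push @found, $path;
        }
    }
    return @found;
}

# Print the .sas files under each directory named on the command line,
# e.g. perl thisscript.pl /cdw/home_dir
for my $root (@ARGV) {
    print "$_\n" for find_sas_files($root);
}
```

    Each returned path can then be opened and scanned for the keywords exactly as the existing inner loop does.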
      File::Find is a module. I'm running this on a Unix server and do not have SA permissions, and I do not want to install any modules on the server. I basically have 2 options: 1) Install the module on my PC and run it from my PC (would this work if I have to telnet into the Unix box)? 2) Write my script to do what File::Find would do? Any suggestions? Thanks again everyone... Fue
        Is installing it locally (like in /home/yourname/local) an option?
        1) is not an option--if it's running on the server, it can't read your modules from your pc.
        2) you can, but it will take time, and not be as robust as the hardened and proven File::Find module.

        You can always look at File::Find's source, but i would strongly recommend using the module itself (are you sure it's not installed already?). If you're going to write it yourself, i'm sure you can find a recursion example of doing just this...
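
        File::Find ships with the standard Perl distribution, so it is usually present without installing anything. A quick way to check, sketched here with a hypothetical helper named module_location, is to try loading the module and report where it was found:

```perl
use strict;
use warnings;

# Hypothetical helper: return the path a module was loaded from,
# or undef if this perl cannot load it.
sub module_location {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # File::Find -> File/Find.pm
    my $ok = eval { require $file; 1 };
    return $ok ? $INC{$file} : undef;
}

my $where = module_location('File::Find');
print defined $where
    ? "File::Find is available: $where\n"
    : "File::Find could not be loaded\n";
```

        If a module really is missing, a copy installed under your own home directory can be made visible with the `use lib` pragma, pointing at whatever directory you installed into.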

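        A workaround would be to use Unix tools: construct a command line with something like this (I'm sure you'll want to tweak the egrep settings; note the xargs, without which egrep would search the list of file names rather than the files themselves):
        my $cmd = 'find /some/path -name \*.sas | xargs egrep -l ' . "'(" . join('|', @words) . ")'";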
Re: Searching Sub Dir for Files
by cmeyer (Pilgrim) on Jun 01, 2005 at 19:04 UTC
    It sounds like you are asking for a recursive grep. Fortunately the gnu egrep already has a recursive switch (-r -- see the manpage). If you really wish to do this from Perl, then you'll find the module File::Find useful. For example:
    use File::Find;

    find(\&wanted, @directories_to_search);

    # untested
    sub wanted {
        # File::Find chdirs into each directory by default, so test and open
        # $_ (the bare file name); $File::Find::name is used for messages.
        return unless -f $_;
        my $f;
        unless ( open $f, '<', $_ ) {
            warn "couldn't open $File::Find::name: $!\n";
            return;
        }
        my $matched;
        while (<$f>) {
            $matched = 1 if /word_to_look_for/;
        }
        close $f;
        print "found word_to_look_for in $File::Find::name\n" if $matched;
    }
Re: Searching Sub Dir for Files
by crashtest (Curate) on Jun 01, 2005 at 19:19 UTC
    If you're on a *NIX system (and it looks like you are), I like using the find program to do the heavy lifting of locating all the files for you. It's pretty powerful and will handle additional complexity if your requirements ever expand beyond "files that end in .sas". You then feed the output of find to your Perl program, letting it concentrate on the actual searching.
    find /cdw/home_dir/ -name "*.sas" | xargs perl yourprog.pl
    ... where yourprog.pl looks something like:
    use strict;
    use warnings;

    my @words = qw(foo bar baz etc.);

    while (<>) {
        for my $word (@words) {
            print "Found $word in $ARGV\n" if (/\Q$word/);
        }
    }
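    With a word list as long as the one in the question, looping over every word for every line gets slow. One possible variation, sketched below, joins the words into a single precompiled regex and records line numbers. The names build_word_regex and scan_handle are hypothetical; quotemeta is used so that entries such as $BIDI and %ARMCONV are matched literally instead of being read as regex syntax.

```perl
use strict;
use warnings;

# Hypothetical helper: one regex that matches any of the literal words.
sub build_word_regex {
    my @words = @_;
    # Longest first, so "CATS" is tried before "CAT" at the same position.
    my $alternation = join '|',
        map { quotemeta($_) }
        sort { length($b) <=> length($a) } @words;
    return qr/($alternation)/;
}

# Hypothetical helper: return [line_number, word] pairs for every hit
# found while reading an already opened filehandle.
sub scan_handle {
    my ($fh, $regex) = @_;
    my @hits;
    my $line_no = 0;
    while (my $line = <$fh>) {
        $line_no++;
        while ($line =~ /$regex/g) {
            push @hits, [ $line_no, $1 ];
        }
    }
    return @hits;
}

# Usage: find /cdw/home_dir/ -name "*.sas" | xargs perl thisscript.pl
my $regex = build_word_regex('$BIDI', '%ARMCONV', 'CAT', 'CATS');
for my $file (@ARGV) {
    my $fh;
    unless (open($fh, '<', $file)) {
        warn "Cannot open $file: $!\n";
        next;
    }
    print "$file:$_->[0]: $_->[1]\n" for scan_handle($fh, $regex);
    close $fh;
}
```

    Like the original loop, this matches substrings, so CAT is also reported inside a longer identifier; wrapping the alternation in word-boundary checks would be a further tweak.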
      Nice... I will try this... I will give u and everyone who helped out on this thread votes... Thanks all for all ur help... Fue
Re: Searching Sub Dir for Files
by cool_jr256 (Acolyte) on Jun 01, 2005 at 19:01 UTC
    When you run through the results from "readdir" (after "opendir"), you can perform this test.
    Let's say the path of the directory you are searching is:
    /home/user/searchdir
    then you can use the following "if" statement in your loop:
    if(-d "/home/user/searchdir/$dir")

    That statement will tell you whether it's a directory. Hope that helps.....
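
    A small sketch of that test inside a readdir loop follows. The name list_subdirs is hypothetical. Two details are easy to miss: readdir also returns "." and "..", and -d must be given the full path, since readdir returns bare names. Ignoring symlinked directories is an extra assumption here.

```perl
use strict;
use warnings;

# Hypothetical helper: names of the real subdirectories directly inside $parent.
sub list_subdirs {
    my ($parent) = @_;
    opendir(my $dh, $parent) or die "Cannot open $parent: $!\n";
    my @subdirs = grep {
        $_ ne '.' && $_ ne '..'      # readdir also returns these two entries
            && -d "$parent/$_"       # test the full path, not the bare name
            && !-l "$parent/$_"      # assumption: ignore symlinks to directories
    } readdir($dh);
    closedir($dh);
    return sort @subdirs;
}

# Usage: perl thisscript.pl /home/user/searchdir
for my $parent (@ARGV) {
    print "$parent/$_\n" for list_subdirs($parent);
}
```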
      Cool, that will help a lot. I would still need to know how you would write something similar to File::Find if you can't use the module. I'm guessing this wouldn't be something that is fairly easy to write?!? Thanks again all... Fue
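
      For a rough idea of what a File::Find look-alike involves, here is a simplified sketch, not File::Find's actual implementation. The hypothetical walk_tree takes a starting directory and a callback, and uses an explicit stack of pending directories instead of recursion. It does not follow symlinks and has none of File::Find's options.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback->($path) for every plain file below $root.
# A stack of directories still to visit replaces recursive calls.
sub walk_tree {
    my ($root, $callback) = @_;
    my @pending = ($root);
    while (defined(my $dir = pop @pending)) {
        my $dh;
        unless (opendir($dh, $dir)) {
            warn "Cannot open directory $dir: $!\n";
            next;
        }
        for my $entry (readdir($dh)) {
            next if $entry eq '.' || $entry eq '..';
            my $path = "$dir/$entry";
            next if -l $path;                  # assumption: do not follow symlinks
            if    (-d $path) { push @pending, $path }
            elsif (-f $path) { $callback->($path) }
        }
        closedir($dh);
    }
    return;
}

# Usage: print every .sas file under the directories named on the command line.
for my $root (@ARGV) {
    walk_tree($root, sub { print "$_[0]\n" if $_[0] =~ /\.sas$/ });
}
```

      The callback is where the keyword search would go, much like the wanted sub that File::Find expects.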