Ro has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

This is what I am trying to do: search all .js files in a directory, recursively, for lines that start with the string //**. For each such line, I want to write the characters that come right after the string to a specified .html file.

My code works when grabbing one file, but looping through a directory is causing me issues. I think I need File::Find, but I am stuck and don't know how to go about it.

If anyone could offer any help or ideas that would let me do this across a whole set of files, that would be great.
My code (which works for one specified file):
    # i have all the correct headers use filefind, /opt/bin/perl bla bla
    $jscriptdir = "/home/javascript/";
    $output = "/home/web/foo/index.htm";
    $reading = "/home/javascript/jumper.js";

    # Open file, match comment string, write to html
    open(JS,"<$reading") or die "Can't Open $reading: $!";
    open(NEW,">$output") or die "Can't Open $output: $!";
    print NEW "<html><title>Comment Extractor</title></html>\n";
    print NEW "<body>\n";
    print NEW "<h1>Comment Extractor</h1>\n";
    print NEW "<hr>\n";
    $script_title = rindex($reading, "/");
    print NEW substr($reading, $script_title+1);
    while (<JS>) {
        if ($_ =~ /\*\*/) {
            print NEW substr($_,4) . "<br>";
        }
    }
    print NEW "</body></html>\n";
    close NEW;
    close JS;

Replies are listed 'Best First'.
Re: Find, read, write out contents of a certain file type recursively...
by gjb (Vicar) on Nov 14, 2002 at 19:44 UTC

    If all .js files are in a single directory, File::Find isn't the only option.

        opendir(DIR, $jscriptdir) or die("Can't open $jscriptdir");
        my @jsFiles = grep(/\.js$/, readdir(DIR));
        closedir(DIR);
        foreach my $jsFile (@jsFiles) {
            # do whatever
        }

    If you really want to recurse through a directory hierarchy, you can use the code below:

        use File::Find;
        use IO::File;

        find(\&wanted, $jscriptdir);

        sub wanted {
            if (/\.js$/) {
                my $fh = new IO::File($_) or die("Can't open $File::Find::dir/$_");
                while (<$fh>) {
                    # your while here
                }
                $fh->close();
            }
        }

    Hope this helps, -gjb-

Re: Find, read, write out contents of a certain file type recursively...
by jdporter (Paladin) on Nov 14, 2002 at 20:01 UTC
    Here's one way to do it.

        use File::Find;

        my $jscriptdir = "/home/javascript/";
        my $pat = quotemeta '//**';

        find( \&proc, $jscriptdir );

        sub proc {
            /\.js$/ or return;
            my $n = $File::Find::name;
            open F, "< $n" or die "read $n: $!\n";
            my @l = map { s/^$pat// ? $_ : () } <F>;
            close F;
            @l or return;
            $n =~ s/\Q$jscriptdir\E.//;
            print "$n $_" for @l;
        }

    Adding the HTML is left as an exercise. :-)
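
For anyone attempting that exercise, here is one possible sketch rather than jdporter's own solution. It keeps the anchored `quotemeta '//**'` match from the code above and adds an HTML wrapper. The sub names `html_escape`, `extract_comments`, and `write_report` are made up for illustration, and it assumes you are happy to call `find` with the `no_chdir` option so that `$_` holds the full path.

```perl
use strict;
use warnings;
use File::Find;

# Escape the characters that are special in HTML text.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Return the text that follows a leading "//**" on each matching line of $path.
sub extract_comments {
    my ($path) = @_;
    my $pat = quotemeta '//**';
    open my $fh, '<', $path or die "read $path: $!\n";
    my @found;
    while (my $line = <$fh>) {
        chomp $line;
        push @found, $line if $line =~ s/^$pat//;
    }
    close $fh;
    return @found;
}

# Walk $dir and write one HTML section per .js file that contains comments.
# (Hypothetical helper: the layout of the page is an assumption.)
sub write_report {
    my ($dir, $out_path) = @_;
    open my $out, '>', $out_path or die "write $out_path: $!\n";
    print {$out} "<html><head><title>Comment Extractor</title></head>\n<body>\n";
    find({
        no_chdir => 1,    # keep $_ equal to the full path of each entry
        wanted   => sub {
            return unless -f $_ && /\.js$/;
            my @comments = extract_comments($_) or return;
            print {$out} "<h2>", html_escape($_), "</h2>\n";
            print {$out} html_escape($_), "<br>\n" for @comments;
        },
    }, $dir);
    print {$out} "</body></html>\n";
    close $out or die "close $out_path: $!\n";
}

# Only runs when given a directory and an output file on the command line.
write_report(@ARGV) if @ARGV == 2;
```

Saved under a name of your choosing, it would be run with two arguments: the directory to search and the HTML file to write. Escaping the extracted text matters here because JavaScript comments often contain `<` or `&`.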

    jdporter
    ...porque es dificil estar guapo y blanco.

Re: Find, read, write out contents of a certain file type recursively...
by Ro (Initiate) on Nov 14, 2002 at 20:21 UTC
    Thanks for the help, guys. I got this thing working. I will post my modified code. Since I am new, I will look around and see if I can post the source so someone else can benefit. Thanks!

        # Variables
        $output = "/home/web/path/to/foo/foo.htm";
        $jscriptdir = "/home/web/foo/javascript/";
        $javadir = "/home/web/foo/servlet/";
        $cgidir = "/home/web/cgi-bin/";

        # Open the output file
        open(NEW,">$output") or die "Can't Open $output: $!";
        print NEW "<html><head><title>Comment Extractor</title></head>\n";
        print NEW "<body>\n";
        print NEW "<h1>Comment Extractor</h1>\n";
        print NEW "<hr>\n";

        # Let's find files. Call jsfind for the dirty work
        find (\&jsfind, $jscriptdir);

        print NEW "</body></html>\n";
        close NEW;
        exit;

        sub jsfind {
            # get all file names that have last 2 characters "js"
            if($File::Find::name=~/\.js$/) {
                # ignore js files in those stupid Frontpage _vti* directories
                if ($File::Find::name=~/\_vti/) {}
                else {
                    # get name of script file and print it in red
                    $script_title = $_;
                    print NEW "<b><font style=\"color:red; size:16px; text-transform: uppercase\">" . $script_title . "</font></b>";
                    # placeholder for file

                    # start opening files to read
                    open(FILE, $File::Find::name) or die "could not open file - $_ - : $!";

                    # iterate thru files to find match //**
                    while (<FILE>) {
                        if ($_ =~ /\*\*/) {
                            # print files to $output
                            print NEW substr($_,4) . "<br>";
                        }
                    }
                    close FILE;
                }
            }
            else {
                return;
            }
        }

      I will post my modified code

      Thanks! (Um, be sure to post the whole thing... -- I didn't see a line saying "use File::Find" when I read this post.)

      Just one concept you should consider: doing a recursive directory search for files of a given type (e.g. *.js) is a very common facility that is handy for a wide range of particular needs -- that's why the GNU "find" utility (and the decades-old Unix "find" that it's modeled on) is such a basic, essential component on so many systems (it's been ported to Windows, etc).

      Even if you want to stick with File::Find (which happens to be a few times slower than GNU "find"), you should consider making it a separate utility by itself, and keep just the editing function in a simpler app that works on a single file, or on a list of files read from stdin -- e.g.:

      my_find_utility '*.js' | my_comment_extractor
      This way, when you come up with some other particular edit or summarization process to be done on all files of a given type in a directory tree, you don't have to re-write the part that recurses through the directories. Just write a script that will apply the new process to any list of file names on stdin, and use the same front-end program (or GNU "find") to feed it.
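
A minimal sketch of what the back half of that pipeline might look like is below. It is an illustration under assumptions, not a finished tool: the sub name `extract_from_list` and the `--stdin` switch are invented here, and the output format (path, colon, comment text) is arbitrary.

```perl
use strict;
use warnings;

# Read file names from $in, one per line, and print the text after a
# leading "//**" from each of those files to $out. Returns the number
# of comment lines printed.
sub extract_from_list {
    my ($in, $out) = @_;
    my $count = 0;
    while (my $path = <$in>) {
        chomp $path;
        next unless length $path;          # skip blank lines in the list
        my $fh;
        unless (open $fh, '<', $path) {
            warn "skipping $path: $!\n";   # keep going if one file is unreadable
            next;
        }
        while (my $line = <$fh>) {
            if ($line =~ m{^//\*\*(.*)}) {
                print {$out} "$path:$1\n";
                $count++;
            }
        }
        close $fh;
    }
    return $count;
}

# Hypothetical switch: act as a filter only when asked to read stdin.
extract_from_list(\*STDIN, \*STDOUT) if @ARGV && $ARGV[0] eq '--stdin';
```

With the script saved as, say, my_comment_extractor, something like `find /home/javascript -name '*.js' | perl my_comment_extractor --stdin` would feed it. Because it takes its file list from a handle, the same sub works unchanged whichever program produced the list.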
Re: Find, read, write out contents of a certain file type recursively...
by dingus (Friar) on Nov 15, 2002 at 09:20 UTC
    This is my way of recursively looking at all files in a tree and processing those that match a pattern. I'm using DosGlob because I do this on Windows PCs. I'm pretty sure that using regular glob works too, but I haven't tried it.

        use File::DosGlob;

        $dir= '/home/javascript/*'; # note trailing * !!!
        my @m = File::DosGlob::doglob(1,$dir);
        for (@m) {
            push @m, File::DosGlob::doglob(1,$_.'/*') if -d($_);
            next unless (/\.js$/);
            # put your code here to
            # open file ($_) and search
            open(JS,"<$_") or die "Can't Open $_: $!";
            # etc.
        }

    PS you probably want to open your output file outside the search LOOP!
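
To make that PS concrete, here is a rough sketch of the same "grow a work list as you go" idea with the output file opened once, before the loop. Note that it swaps File::DosGlob for plain `opendir`/`readdir`, so it is a variation on the approach above, not the same code. The sub names `find_js_files` and `report` and the output format are assumptions.

```perl
use strict;
use warnings;

# Return every .js file under $top, using an explicit queue of
# directories instead of recursion or File::Find.
sub find_js_files {
    my ($top) = @_;
    my @queue = ($top);
    my @found;
    while (defined(my $dir = shift @queue)) {
        opendir my $dh, $dir or die "Can't open $dir: $!";
        my @entries = readdir $dh;
        closedir $dh;
        for my $entry (sort @entries) {
            next if $entry eq '.' or $entry eq '..';
            my $path = "$dir/$entry";
            if (-d $path && !-l $path) {
                push @queue, $path;        # visit subdirectories later
            }
            elsif (-f $path && $entry =~ /\.js$/) {
                push @found, $path;
            }
        }
    }
    return @found;
}

# Open the output once, outside the loop, then write to it for each file.
sub report {
    my ($top, $out_path) = @_;
    open my $out, '>', $out_path or die "Can't open $out_path: $!";
    for my $js (find_js_files($top)) {
        open my $in, '<', $js or die "Can't open $js: $!";
        while (my $line = <$in>) {
            print {$out} "$js: $1\n" if $line =~ m{^//\*\*\s*(.*)};
        }
        close $in;
    }
    close $out or die "Can't close $out_path: $!";
}

# Only runs when given a directory and an output file on the command line.
report(@ARGV) if @ARGV == 2;
```

Opening the output with `>` inside the loop would truncate it for every .js file, leaving only the last file's comments, which is why the `open` sits above the `for`.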

    Dingus


    Enter any 47-digit prime number to continue.