Re: Tutorial on File::Find even more basic than "Beginners Guide"

A first quick shot. This can be fine-tuned to better fit your needs:

#always
use strict;

#load modules
use File::Find;
require HTML::LinkExtor;

#create a HTML::LinkExtor-instance for later use
my $links = HTML::LinkExtor->new
(
    # first argument is a subroutine that will
    # be called for every link in the html
    # the object parses
    sub
    {
        # $tag is the element name, e.g. "a" or "img"
        # %links contains the "attributes" of the link
        my ($tag, %links) = @_;

        #print if we have an "a"-link that is not
        #page internal (no "#")
        print "$links{href}\n"
            if $tag eq "a" && $links{href} =~ /^[^#]/ ;
    }
);

#find all html-files in a tree
find
(
    #first argument is the sub that will be called
    #for every file AND directory found
    sub
    {
        # check if we have a file with an .htm or .html suffix
        if ( -f $File::Find::name && $File::Find::name =~ /\.html?$/i )
        {
            #if so, parse it for links
            print "$File::Find::name contains:\n";
            $links->parse_file($File::Find::name);
        }
    }
    , "c:/perl"
);

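Learn by examining the code. You should change "c:/perl" to the path you need. Keep it an absolute path: File::Find changes into each directory before calling the sub, so the -f test on $File::Find::name would fail for a relative start path.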
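If it helps to see what File::Find actually hands to the sub, here is a minimal sketch (not part of the script above). It builds a throwaway tree with File::Temp and records the variables File::Find sets for every entry. The names docs, index.html and notes.txt are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throwaway tree (hypothetical names, illustration only).
my $root = tempdir( CLEANUP => 1 );
mkdir "$root/docs" or die "mkdir failed: $!";
for my $rel ( 'index.html', 'docs/notes.txt' ) {
    open my $fh, '>', "$root/$rel" or die "cannot create $rel: $!";
    close $fh;
}

my @seen;
find(
    sub {
        # By default File::Find has already chdir'ed into
        # $File::Find::dir, so $_ is only the base name and
        # file tests on $_ look at the right entry.
        my $type = -d $_ ? 'dir' : -f $_ ? 'file' : 'other';
        push @seen, {
            name => $_,                  # base name ("." for the start dir)
            path => $File::Find::name,   # start dir plus relative path
            dir  => $File::Find::dir,    # directory currently being visited
            type => $type,
        };
    },
    $root
);

printf "%-5s %-12s %s\n", $_->{type}, $_->{name}, $_->{path} for @seen;
```

The sub is called for the start directory itself as well as for every file and directory below it, which is why the script above needs the -f test.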

P.S. What is wrong with the docs of "File::Find"? They are among the better ones.

Update:
Added comments

holli, regexed monk


Replies are listed 'Best First'.

Re^2: Tutorial on File::Find even more basic than "Beginners Guide"
by ww (Archbishop) on Jan 20, 2005 at 23:24 UTC

Holli: I suspect the issue is NOT with the docs, but rather with this noob's overreaching. For example, both the POD and the referenced tutorial seem (to me) to say that, given a start_dir from the CLI, F:F iterates through all subdirs, identifying those whose names match a regex in my processing sub. But stepping through my code with -d tells me I've misunderstood something. It comes back to the skill level of this reader.

For the rest, thank you very much. I will study and understand... soon, I hope. <G>
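For readers following along, the task described above (start directory from the command line, report the subdirectories whose names match a regex) could be sketched roughly like this. It is only one possible reading of that description, not the poster's code; matching_dirs and the two-argument command line are hypothetical names chosen for the example.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every directory under $start whose
# base name matches $pattern.
sub matching_dirs {
    my ( $start, $pattern ) = @_;
    my @hits;
    find(
        sub {
            # The sub is called for files as well, so skip anything
            # that is not a directory. $_ is the base name only.
            return unless -d $_;
            push @hits, $File::Find::name if $_ =~ $pattern;
        },
        $start
    );
    return @hits;
}

# Assumed usage: perl finddirs.pl START_DIR PATTERN
if ( @ARGV == 2 ) {
    my ( $start, $pattern ) = @ARGV;
    print "$_\n" for matching_dirs( $start, qr/$pattern/ );
}
```

Two things that commonly surprise people here: the regex is tested against $_, which holds only the base name (matching against $File::Find::name would test the whole path instead), and for the start directory itself $_ is ".".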
