web link errors

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Please advise why I keep getting these errors when using this WEB LINK CHECK script:

     
use strict;

use LWP::Simple;
use HTML::TokeParser;
use HTML::Entities;

my @newspages = qw(
    http://osis.nima.mil
    http://osis.nima.mil/myhot.html
    http://osis.nima.mil/myoffices.html
    http://osis.nima.mil/mytraining.html
    http://osis.nima.mil/mygeospatial.html
);
for (@newspages) {
    my $html = $_;
    my ($junk,$short) = split(/\./,$html); # get domain name
    my $body .= "<td valign=top>$short<br>";
    my $get = get("$html");
    my $p = HTML::TokeParser->new(\$get);
    while (my $token = $p->get_tag("a")) {
        my $url = $token->[1]{href} || "-";
        my $text = $p->get_trimmed_text("/a");
        unless ($url =~ /^mailto|^javascript/) {  # don't grab javascript or mailto's
            $body .= "<a href=\"$url\" target=\"new\">$text</a><br>\n";
        }
    }
    $body .= "</td>"
}
my $body .= "</tr></table>";

open(OUT,">news.txt"); # send to an html file
print OUT "$body";

My error messages on my NT workstation:

Use of uninitialized value in substr at C:/Perl/site/lib/HTML/PullParser.pm line 82.
Use of uninitialized value in length at C:/Perl/site/lib/HTML/PullParser.pm line 85.
Use of uninitialized value in substr at C:/Perl/site/lib/HTML/PullParser.pm line 82.
Use of uninitialized value in length at C:/Perl/site/lib/HTML/PullParser.pm line 85.


Replies are listed 'Best First'.
Re: web link errors by Stegalex (Chaplain) on Mar 12, 2002 at 14:05 UTC
The W3C has a public domain Perl script that will check your site for dead links. Why not use it instead? It's here. I like chicken.
Re: web link errors by silent11 (Vicar) on Mar 12, 2002 at 13:59 UTC
This looks like a mod of some code I posted here. On the surface, I don't see anything wrong with your code, except that those URLs don't exist (at least not at the moment for me). Do you have all the modules installed? `LWP::Simple; HTML::TokeParser; HTML::Entities;` Also, when I posted this code, I was running the script against domains without subdomains, unlike the ones you have here. I split on /\./ to get the domain; you will only get the subdomain. -Silent11
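If the URLs really cannot be fetched, that would plausibly account for the warnings: `get()` from `LWP::Simple` returns undef on failure, and the script hands a reference to that undef value straight to `HTML::TokeParser`. A minimal sketch of a guard follows. The helper name `fetch_or_warn` and the stand-in fetcher with its example.com URLs are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a page with the supplied fetcher and report
# a failure instead of passing undef on to the parser.
sub fetch_or_warn {
    my ($url, $fetch) = @_;
    my $content = $fetch->($url);
    unless (defined $content) {
        warn "Could not fetch $url, skipping\n";
        return;
    }
    return $content;
}

# In the real script the fetcher would be LWP::Simple's get(), e.g.
#   my $get = fetch_or_warn($html, \&get);
#   next unless defined $get;
# A stand-in fetcher is used here so the sketch runs without network access.
my %fake_web = ('http://example.com/' => '<a href="/x">x</a>');
my $fetch = sub { $fake_web{ $_[0] } };

for my $url ('http://example.com/', 'http://example.com/missing') {
    my $content = fetch_or_warn($url, $fetch);
    print "$url: ", (defined $content ? "fetched" : "skipped"), "\n";
}
```

With a guard like this, a page that cannot be retrieved is reported by name and skipped, rather than surfacing as warnings from inside the parser module.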
Re: Re: web link errors by Anonymous Monk on Mar 12, 2002 at 14:23 UTC
Thanks for your reply! Here are my ppm listings: `Archive-Tar Compress-Zlib Digest-MD5 File-CounterFil Font-AFM HTML-Parser HTML-Tagset HTML-Tree MIME-Base64 PPM SOAP-Lite Storable Tk URI XML-Parser XML-Simple libnet libwin32 libwww-perl` I thought some were the same as what you had listed, or close to it? Also, should I list ALL the links in my HTML page in the newspages array? I like your script because it is not long and complex for a beginner like me, so I can learn from it.
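The ppm listing shows distribution names, which differ from module names: `LWP::Simple` ships in libwww-perl, while `HTML::TokeParser` and `HTML::Entities` both ship in HTML-Parser. One way to confirm that the modules themselves load is a small check like the one below. It is only a sketch, and the helper name `module_available` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub module_available {
    my ($module) = @_;
    return eval("require $module; 1") ? 1 : 0;
}

# The three modules the link-check script depends on.
for my $module (qw(LWP::Simple HTML::TokeParser HTML::Entities)) {
    printf "%-18s %s\n", $module,
        module_available($module) ? "installed" : "MISSING";
}
```

Any module reported as MISSING would need its distribution installed through ppm before the script can run.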