Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Can someone tell me about Linkbot, the tool that checks web links? Can it report how many pages are on my web server? I need to find out how many web pages are on my NT web server. Should I do this in Perl instead of using Linkbot? Please advise.

Re: web page count
by Abigail-II (Bishop) on Jul 11, 2003 at 14:02 UTC
    find $DOCUMENTROOT -type f -name '*.html' | wc -l

    You might want to tweak this, depending on what you consider a webpage.

    Nothing fancy, not even perl, needed.
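
    As one possible tweak, here is a minimal sketch that also counts .htm and .shtml files and ignores case. The extension list and the count_pages function name are assumptions for illustration, not part of the original one-liner. It also assumes a Unix-like shell (on NT that would mean something like Cygwin) and a find that supports -iname, which GNU and BSD find do but which is not in POSIX.

```shell
#!/bin/sh
# count_pages DIR
# Count regular files under DIR whose names look like web pages.
# The extension list below is an assumption; edit it to match
# whatever you consider a "web page" on your site.
count_pages() {
    find "$1" -type f \
        \( -iname '*.html' -o -iname '*.htm' -o -iname '*.shtml' \) \
        -print | wc -l | tr -d ' '   # tr strips the padding some wc versions add
}

# Example call (DOCUMENTROOT is whatever your server's document root is):
#   count_pages "$DOCUMENTROOT"
```

    Note that this counts files on disk, not pages reachable by following links, so orphaned files are included and dynamically generated pages are not.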

    Abigail

Re: web page count
by nysus (Parson) on Jul 11, 2003 at 14:03 UTC
    I can't name a specific module that does this, but I can tell you one probably exists. Do a search on cpan.org, which is the worldwide repository for Perl modules. As to whether you should be using Linkbot, I don't know. Is it free?

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop Pontiff";
    $nysus = $PM . $MCF;

      Linkbot comes with Cold Fusion Studio, and I do have it, but I was wondering whether it can count how many web pages we have on our server.
      I would assume, as Abigail mentioned, that I can write something in Perl to do the same thing. Do I need a specific module for this, or can I do it without a module?
        I've never used either Cold Fusion or Linkbot, so I can't answer those questions.

        I think your real question is: what is easiest? With all the tools out there, there are 1,001 ways to do just about any web site administration task. You are not traversing roads less traveled. But what is best or easiest for you depends on your skill level. If you are very familiar with Perl, I'd point you to Lincoln Stein's script at http://stein.cshl.org/~lstein/talks/perl_conference/cute_tricks/mirror2.html and tell you that it could be pretty easily modified to count the number of .html documents in a directory tree, whether on the Internet or on your hard drive. But if you are not that familiar with Perl, then you probably want to spend 30 minutes to an hour looking for a pre-existing module that has the precise functionality you need. For instance, I just found http://search.cpan.org/author/AWRIGLEY/sitemapper-1.019/lib/WWW/Sitemap.pm. I don't know if it will do exactly what you want. There are dozens more modules that deal with web document administration, but you need to search cpan.org to find them. And that's where you come into play. Good luck.
