in reply to Re: RadicalMatterDotCom
in thread RadicalMatterDotCom

A spider does not need to be a registered user, so it only sees the nodes, and you can skip the "lastnode_id" part. The bigger problem is the duplication of information, because the display of a node also includes its replies. If the spider can recognize single nodes and sort them out, so that it avoids multiple lookups and does not store everything n times (where n is the level of "Re:" nesting below a node), it could work.
Anyhow, for archiving purposes, and to make perlmonks more easily searchable and easier to categorize, perlmonks.org itself might need to offer something like:
http://www.perlmonks.org/index.pl?node_id=83485&view_mode=plain_txt
so that the spider does not get tangled up in all the dynamic content such as nodelets and menu bars.
And I bet that the Everything engine has such a feature, even if it's hidden somewhere deep and only used for debugging or so.
But, sincere apologies: as long as this means more work for vroom, it is better to write a really good spider.
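
To make the duplication point concrete, here is a minimal sketch in Python of how a spider could avoid repeated lookups. It rests on assumptions: `fetch` is a hypothetical function that returns the node plus every reply shown on its page as (node_id, text) pairs, and the replies are shown in full. It is not how any existing spider works.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical shape of one rendered page: the node itself plus all replies
# displayed under it, each as (node_id, text).
Page = List[Tuple[int, str]]


def crawl(start_ids: Iterable[int], fetch: Callable[[int], Page]) -> Dict[int, str]:
    """Store each node once, keyed by node_id, and skip needless lookups."""
    archive: Dict[int, str] = {}
    queue = deque(start_ids)
    while queue:
        node_id = queue.popleft()
        if node_id in archive:
            # Already captured as a reply on an earlier page: no second lookup.
            continue
        for seen_id, text in fetch(node_id):
            # setdefault keeps the first copy, so nothing is stored n times.
            archive.setdefault(seen_id, text)
    return archive
```

With a thread of one node and two nested replies, fetching the top node already stores all three, so the two replies are never requested on their own.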
Yes, I like your idea a lot, because I believe everything that has been posted so far would make up a Perl Monks bookshelf of tips, traps, tricks and so on. (Well, except meditations and discussions, but you could offer those to mindspring.com or pilosophy.org for linking.) {grin}

Have a nice day
All decision is left to your taste

Replies are listed 'Best First'.
Re: Re: Re: RadicalMatterDotCom
by blakem (Monsignor) on Aug 23, 2001 at 02:53 UTC
    Seems like you've proved my point though. Any spider that successfully indexes perlmonks will have to be specially customized for the site. Or, said another way, perlmonks is not friendly to the general search spider.

    Oh, and I think you might be looking for DisplayType Raw

    -Blake

      Well, at this point I agree. :-)
      BUT, suppose you register:
      http://www.perlmonks.org/index.html
      a document that does not exist yet, and create an alias for that URI in the Apache config, so that you can dynamically generate a page containing only the keywords, as you can see them in one of the nodelets on each node. Make these keywords links to search or Super Search, and you have ONE document listed at Google. From there the user can search perlmonks once he has seen that what he is looking for is possibly here, without even knowing that he starts a search when he clicks one of those links. The spider, on the other hand, will presumably notice that the links point to a script and stop there. Is more than that needed?
      Well, I don't think so :-)
      And OK, that would make it more user-friendly than it is now. :-)
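
A rough sketch of the keyword page described above, in Python. The search URL and the `node` query parameter are assumptions made for illustration, and the keyword list would have to come from wherever the nodelet gets it; nothing here is taken from the Everything engine.

```python
from html import escape
from typing import Iterable
from urllib.parse import urlencode


def keyword_index(
    keywords: Iterable[str],
    search_url: str = "http://www.perlmonks.org/index.pl",  # assumed search script
    param: str = "node",  # assumed query parameter name
) -> str:
    """Build one static-looking HTML page whose keywords link to a search."""
    items = []
    # Drop duplicates and sort case-insensitively so the page is stable.
    for keyword in sorted(set(keywords), key=str.lower):
        href = f"{search_url}?{urlencode({param: keyword})}"
        items.append(f'<li><a href="{escape(href)}">{escape(keyword)}</a></li>')
    return "<html><body><ul>\n" + "\n".join(items) + "\n</ul></body></html>"


if __name__ == "__main__":
    print(keyword_index(["regular expressions", "CGI", "sorting"]))
```

The alias in the Apache config would then only have to serve this output under the index.html URI.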

      Have a nice day
      All decision is left to your taste