I have written a fairly sweet little site robot script that I want to use to build a site content index. That way I can run searches against my own index and set it up any way I want, without paying a search engine company to do it for me and without having to put up stupid little banners. Besides, I have the time to do it ;-)

Anyway, the script works OK, but I want to expand on it. It's been a while since I wrote it and I'm a bit fuzzy on my Perl.

I want to let the user specify which tags within a document get searched and stored in the content index db. I'm looking for two things: a way to find and retrieve the text between the opening and closing tags of any user-specified type (XML tags included), and a nice little loop to go through all the user-specified tags.
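
For the first part, here is a minimal sketch of pulling out the text between an opening and closing tag with a regex. The helper name `extract_tag` and the sample page are made up for illustration. A regex like this does not handle same-named nested tags, comments, or CDATA, so a real parser from CPAN is probably the safer route for messy HTML.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside every <tag>...</tag> pair in $doc.
sub extract_tag {
    my ($doc, $tag) = @_;
    my @found;
    # \Q..\E quotes the tag name, (?:\s[^>]*)? allows attributes,
    # /i ignores case, /s lets "." match across newlines.
    while ($doc =~ m{<\Q$tag\E(?:\s[^>]*)?>(.*?)</\Q$tag\E\s*>}gis) {
        my $text = $1;
        $text =~ s/<[^>]+>//g;      # drop any markup nested inside
        $text =~ s/\s+/ /g;         # collapse runs of whitespace
        $text =~ s/^\s+|\s+$//g;    # trim both ends
        push @found, $text;
    }
    return @found;
}

# Made-up sample document
my $page = '<html><head><title>My Page</title></head>'
         . '<body><p>Hello <b>world</b></p></body></html>';

# The "nice little loop" over whatever tags were asked for
for my $tag (qw(title body)) {
    my @text = extract_tag($page, $tag);
    print "$tag: @text\n";
}
```

Because the pattern requires either `>` or whitespace right after the tag name, asking for `b` will not accidentally match `<body>`.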

I figured I would let the user set the fields and their order like this:

...indexer.pl?tag1=title&tag2=author&tag3=body

tag0 is *always* the complete URL to the page.
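
The tagN scheme above could be read back in order with something like the following. This is only a sketch: `tags_from_query` is a hypothetical helper, the query string is parsed by hand so the example needs no modules, and the tag-name check is my own assumption about what should count as a valid name. In a CGI script the string would normally come from `$ENV{QUERY_STRING}`.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "tag1=title&tag2=author" into an ordered list of tag names.
sub tags_from_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $value && length $value;
        $value =~ tr/+/ /;                               # "+" means space
        $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # decode %XX escapes
        $param{$key} = $value;
    }
    # Keep tag1, tag2, ... in numeric order; tag0 is reserved for the URL.
    my @numbers = sort { $a <=> $b }
                  map  { /^tag([1-9]\d*)$/ ? $1 : () } keys %param;
    # Assumption: only accept values that look like HTML/XML tag names.
    return grep { /^[A-Za-z_][\w:.-]*$/ } map { $param{"tag$_"} } @numbers;
}

my $url    = 'http://www.shrum.net/';   # tag0: the page being indexed
my @tags   = tags_from_query('tag1=title&tag2=author&tag3=body');
my @fields = ('url', @tags);            # field names for one index record

print "tag$_ = $fields[$_]\n" for 0 .. $#fields;
```

Sorting on the number rather than on the key string keeps tag10 after tag9, which a plain `sort keys` would get wrong.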

This way anybody could create their own content index to run searches on. If somebody else has already made one, let me know and I'll stop now; otherwise, let me know if I've got all my marbles in one bag. It seems like a good idea to me, so I figure somebody must have built one before.

TIA

======================
Sean Shrum
http://www.shrum.net


In reply to Help need with code loop by S_Shrum
