*sigh* Yes, I agree. I created a customized version of Perlfect Search for a client (adding PDF indexing, searching for content in different languages and in different parts of the website, and a couple of other things), and the code is not nice to work with, to say the least.

And the worst thing is, working with awful code inspires you to write awful code yourself ;-)

Still, I have to admit it does work nicely.

<ramble>I've long been thinking of creating a modularized pure-Perl search engine: just choose some input modules (local fs, spider, ftp-spider, ...), preprocessors (PDF->text, summarizers, ...) and storage modules (MySQL, dbm, ...) and voilà: your customized search engine is ready. I doubt I will ever have the time, though...</ramble>
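
A minimal sketch of how such a plug-in pipeline might be wired together, using core Perl only. Every name here (Input::Memory, Preprocessor::StripTags, Storage::Memory, build_index) is hypothetical and made up for illustration; none of it comes from Perlfect Search. The in-memory classes are stand-ins for a real filesystem or spider input, a PDF->text converter, and a MySQL or dbm backend.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input module: yields documents from an in-memory list.
# A real one might walk the local filesystem or spider a site.
package Input::Memory;
sub new { my ($class, %docs) = @_; return bless { docs => {%docs} }, $class }
sub documents {
    my ($self) = @_;
    return map { +{ id => $_, text => $self->{docs}{$_} } }
           sort keys %{ $self->{docs} };
}

# Hypothetical preprocessor: replaces tag-like markup with spaces
# (a stand-in for something like a PDF->text converter).
package Preprocessor::StripTags;
sub new     { return bless {}, shift }
sub process { my ($self, $text) = @_; $text =~ s/<[^>]*>/ /g; return $text }

# Hypothetical storage module: an in-memory inverted index
# (a stand-in for MySQL or dbm).
package Storage::Memory;
sub new { return bless { index => {} }, shift }
sub add {
    my ($self, $id, $text) = @_;
    # Record each lowercased word under the document id.
    $self->{index}{ lc $_ }{$id}++ for $text =~ /(\w+)/g;
}
sub search {
    my ($self, $term) = @_;
    return sort keys %{ $self->{index}{ lc $term } || {} };
}

package main;

# Glue: pass every document from every input through every
# preprocessor in order, then hand the result to the storage module.
sub build_index {
    my (%arg) = @_;
    for my $input (@{ $arg{inputs} }) {
        for my $doc ($input->documents) {
            my $text = $doc->{text};
            $text = $_->process($text) for @{ $arg{preprocessors} };
            $arg{storage}->add($doc->{id}, $text);
        }
    }
    return $arg{storage};
}

my $store = build_index(
    inputs => [
        Input::Memory->new(
            'a.html' => '<p>Perl search</p>',
            'b.html' => '<b>Monkey</b> httpd',
        ),
    ],
    preprocessors => [ Preprocessor::StripTags->new ],
    storage       => Storage::Memory->new,
);

# List the ids of all documents containing the word "perl".
print join(', ', $store->search('perl')), "\n";
```

The only contract between the pieces is duck typing: inputs provide documents, preprocessors provide process, and storage provides add and search. Swapping in a different backend would then just mean passing another object with the same methods.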


In reply to Re: Searching locally with Perlfect with Monkey Httpd by crenz
in thread Searching locally with Perlfect with Monkey Httpd by zentara
