My script searches a copy of the original text in which every non-alphabetic character has been converted to a space (except the apostrophe in words like isn't, which is simply dropped), every run of spaces has been reduced to a single space, and all letters have been converted to lowercase. A space is also added to the start and end of each section. The search terms are converted the same way, and searches are then run for the exact phrase, for all keywords, and for any keyword, depending on the number of results at each level and the number of keywords submitted.
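A minimal sketch of that approach, written in Python for illustration; it is not my actual script. The names `normalize` and `search` are made up for this example, only ASCII letters and the ASCII apostrophe are handled, and the fallback rule is simplified to "drop to the next level only when the current level finds nothing", which is an assumption rather than the exact rule described above.

```python
import re

def normalize(text):
    """Lowercase, drop apostrophes, turn other non-letters into spaces,
    collapse runs of spaces, and pad with one space at each end."""
    text = text.lower().replace("'", "")
    text = re.sub(r"[^a-z]+", " ", text)   # each run of non-letters becomes one space
    return " " + text.strip() + " "

def search(sections, query):
    """Return (level, indexes of matching sections).
    Tries exact phrase first, then all keywords, then any keyword."""
    norm_sections = [normalize(s) for s in sections]
    words = normalize(query).split()
    if not words:
        return ("none", [])

    # Exact phrase: the padding spaces keep matches on word boundaries.
    phrase = " " + " ".join(words) + " "
    hits = [i for i, s in enumerate(norm_sections) if phrase in s]
    if hits:
        return ("phrase", hits)

    # Each keyword is padded the same way, so "row" does not match "brown".
    padded = [" " + w + " " for w in words]
    hits = [i for i, s in enumerate(norm_sections) if all(p in s for p in padded)]
    if hits:
        return ("all", hits)

    hits = [i for i, s in enumerate(norm_sections) if any(p in s for p in padded)]
    return ("any", hits) if hits else ("none", [])
```

Because both the text and the query go through the same `normalize`, every test is a plain substring check with no case-insensitive matching involved.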
This method seems to work well unless you want to return the section of text that matched. In that case the copy can't change length by having characters removed, because match positions in the copy would no longer line up with the original. You should certainly keep a copy, though, since exact-match searches are much faster than case-insensitive searches, and all the text to be searched should be in one file rather than many.
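One way to keep the copy the same length, sketched in Python under my own assumptions: every non-letter (apostrophes included) is replaced by a space rather than removed, and runs of spaces are left alone, so an offset in the copy is also an offset in the original. The function names are hypothetical, and the phrase match uses a regular expression to allow for the uncollapsed spaces. That is a substitution for this example, not the plain substring search described above, and it gives up some of that speed.

```python
import re

def same_length_copy(text):
    """Lowercase ASCII letters; replace every other character with one space."""
    return "".join(c.lower() if c.isascii() and c.isalpha() else " " for c in text)

def find_phrase(original, query):
    """Return the slice of the original text matched by the query phrase, or None."""
    copy = same_length_copy(original)
    words = same_length_copy(query).split()
    if not words:
        return None
    # Words contain only a-z, so no escaping is needed.
    # " +" tolerates the runs of spaces left where punctuation used to be.
    pattern = r"(?<![a-z])" + " +".join(words) + r"(?![a-z])"
    m = re.search(pattern, copy)
    # The copy and the original have equal length, so the span applies to both.
    return original[m.start():m.end()] if m else None
```

For example, `find_phrase("The Quick,  brown fox", "quick brown")` returns the original text `"Quick,  brown"`, with its capitalisation and punctuation intact.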