Nice++
Again, much appreciated. Originally, I hadn't quite thought it through. I was just thinking I could grab excerpts around matches and highlight them. Then, seeing the results, I decided it needed to merge overlapping excerpts into one. That regex is damn close though. If that had been the first thing you posted, I would have been done right there :)

With the sample data we've been looking at, it's quite easy to get the whole text. The real page results are rather long, so it's unlikely that you'd get the whole page, and many stop words aren't indexed. Still, I'll probably add a limit that truncates the results if the excerpt is too big, or something along those lines.

I found it an interesting (and consuming) exercise myself.


-Lee

perl digital dash (in progress)

In reply to Re^8: Regex: Matching around a word(s) by shotgunefx
in thread Regex: Matching around a word(s) by shotgunefx
