OK, I hope I have understood you correctly: you want to specify a sentence and construct that sentence from within a given HTML page (or whatever). Like:

$message = 'I\'d like to kick Bill Gates\' ass';
$page    = 'http://www.microsoft.com';
highlight_letters($message, $page);

Then you surround the relevant letters with tags that make them blink red when the user moves the mouse. Or whatever.
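For the tagging step, here is a minimal sketch. It assumes you already know which letters to highlight, as a list of [offset, length] pairs, and that the page is plain text with no markup of its own. The name wrap_spans and the "hl" class are made up for illustration; the actual blinking would be up to your CSS or script.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each [offset, length] span of $text in a tag.
# Assumes the spans do not overlap and that $text contains no markup.
sub wrap_spans {
    my ($text, @spans) = @_;

    # Work from the rightmost span backwards so that inserting tags
    # never shifts the offsets of spans we have not handled yet.
    for my $span (sort { $b->[0] <=> $a->[0] } @spans) {
        my ($offset, $length) = @$span;
        substr($text, $offset, $length) =
            '<span class="hl">' . substr($text, $offset, $length) . '</span>';
    }
    return $text;
}

print wrap_spans("f o o foo", [0, 1], [6, 3]), "\n";
# prints: <span class="hl">f</span> o o <span class="hl">foo</span>
```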

You've got two choices: use a series of simple regexes and loop through the text looking for one word at a time, or use one huge gnarly regex. I suspect the latter is the way to go, because you need backtracking. Suppose you get to the end of the text having found whole words for all of your sentence except the last one, and there is no text left to find it in... you need to backtrack, match an earlier word letter by letter instead, and hope that wins you enough space to find the last word. E.g.

$message   = "foo foo";
$page_text = "f o o foo";
# whole word matching failed! We need to letter-match the first foo!

Now, the regex engine has this kind of backtracking built in.
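Here is a rough sketch of leaning on the engine for that. For each word it offers two alternatives, the whole word or its letters scattered in order, and lets the engine choose. build_pattern is a name I made up, and this only answers "does it match?"; pulling out the positions to highlight would need captures on top.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a message into one big pattern.
sub build_pattern {
    my ($message) = @_;
    my @parts;
    for my $word (split ' ', $message) {
        # Alternative 1: the whole word, not glued to other word characters.
        my $whole = '(?<!\w)' . quotemeta($word) . '(?!\w)';
        # Alternative 2: the word's letters in order, with anything between.
        my $letters = join '.*?', map { quotemeta($_) } split //, $word;
        push @parts, "(?:$whole|$letters)";
    }
    # Any amount of text may sit between consecutive words.
    my $re = join '.*?', @parts;
    return qr/$re/s;
}

my $re = build_pattern("foo foo");
print "matched\n" if "f o o foo" =~ $re;
```

On the example above, the first foo has to be letter-matched so that the second one can use the whole word; the alternation sorts that out without any looping code of your own.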

To build the regex... phew. I think you'll need a lot of lookahead and lookbehind expressions, and experimental features (see perlre). The logic is gonna be recursive, and looks like this:

An empty sentence "matches" any page. Otherwise, sentence S "matches" page P if
(the first word W of S has a whole-word match in P, where M is the end of that match, or
W's letters have matches, in order, in P, where M is the position of the last matching letter)
and S' (S minus its first word) "matches" P' (the portion of P after M)
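For comparison, here is that recursion spelled out in plain Perl rather than as one regex, so the backtracking is done by hand. It is only a sketch: match_sentence is a hypothetical name, it returns [offset, length] pairs of what to highlight (or undef), and it treats the page as plain text. It always takes the earliest possible match for each piece, on the assumption that a later one would only leave less text for the rest of the sentence.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an array ref of [offset, length] spans,
# or undef if @words cannot be spelled out from $page starting at $from.
sub match_sentence {
    my ($page, $from, @words) = @_;
    return [] unless @words;    # nothing left to find: success
    my ($word, @rest) = @words;

    # Case 1: W occurs as a whole word at or after $from.
    pos($page) = $from;
    if ($page =~ /(?<!\w)\Q$word\E(?!\w)/g) {
        my ($start, $end) = ($-[0], $+[0]);
        my $tail = match_sentence($page, $end, @rest);
        return [ [ $start, $end - $start ], @$tail ] if $tail;
    }

    # Case 2 (the backtrack): use W's letters, one at a time, in order.
    my @spans;
    my $at = $from;
    for my $letter (split //, $word) {
        my $i = index($page, $letter, $at);
        return undef if $i < 0;    # a letter is missing: no match at all
        push @spans, [ $i, 1 ];
        $at = $i + 1;
    }
    my $tail = match_sentence($page, $at, @rest);
    return $tail ? [ @spans, @$tail ] : undef;
}

my $spans = match_sentence("f o o foo", 0, split ' ', "foo foo");
print join(' ', map { "$_->[0]+$_->[1]" } @$spans), "\n" if $spans;
# prints: 0+1 2+1 4+1 6+3
```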

Doing this is hellishly tricky with one big regex, but may save you a lot of extra coding to implement your own backtracking logic. You'll certainly be a regex whizz by the end.

dave hj~


In reply to Re: Highlighting a text inside another one. by dash2
in thread Highlighting a text inside another one. by TomSW
