vit has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
Does anybody know of a package or a piece of code that can highlight text (not language syntax)?
Basically, given a string of ordinary text (letters, spaces, commas, ...), I need to highlight the words that match a given query by wrapping them in HTML bold tags.
Say, for the query "car repair", a phrase would be highlighted as "best <b>car</b> wash and <b>repair</b>!!".

Replies are listed 'Best First'.
Re: Text highlighting (not language syntax)
by ikegami (Patriarch) on Jan 22, 2010 at 22:57 UTC
    Hum, not easy. You need a parser. This one should do the trick if you're starting from text.
    use HTML::Entities qw( encode_entities );

    sub text_to_html {
        my $terms = $_[1];
        my ($terms_pat) = map "(?:$_)", join '|', map quotemeta, @$terms;
        ( my $html = $_[0] ) =~ s{
            ( (?: (?!$terms_pat). )* )
            ( $terms_pat+ )
          |
            ( .+ )
        }{
            defined($1)
                ? encode_entities("$1") . "<b>" . encode_entities("$2") . "</b>"
                : encode_entities("$3")
        }xseg;
        return $html;
    }

    print text_to_html("best car wash & repair!!", [qw( car repair )]);
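The code above bolds any occurrence of a term, including inside longer words ("car" in "scars"), and it is case-sensitive. If whole-word, case-insensitive matching is wanted (an assumption; the question doesn't say), a rough sketch using split with a capturing group might look like the following. The name highlight_words is made up for illustration, and the sketch assumes a non-empty list of terms that begin and end with word characters, since it relies on \b.

```perl
use strict;
use warnings;
use HTML::Entities qw( encode_entities );

# Hypothetical helper (not from the thread): bold whole words only,
# ignoring case. Assumes plain-text input and a non-empty list of
# terms that start and end with word characters.
sub highlight_words {
    my ($text, $terms) = @_;
    my $pat = join '|', map quotemeta, @$terms;

    # With a capturing group, split also returns what it matched, so the
    # list alternates: plain text, match, plain text, match, ...
    my @pieces = split /\b($pat)\b/i, $text;

    my $html = '';
    for my $i (0 .. $#pieces) {
        my $enc = encode_entities($pieces[$i]);
        # Odd indexes hold the captured matches.
        $html .= $i % 2 ? "<b>$enc</b>" : $enc;
    }
    return $html;
}

print highlight_words("Best car wash & Repair, no scars!!", [qw( car repair )]), "\n";
# prints: Best <b>car</b> wash &amp; <b>Repair</b>, no scars!!
```

Every piece is still passed through encode_entities, so characters such as & and < in the input are escaped exactly as in the regex version.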

    Starting from HTML is much trickier.

    Update: Simplified code.
    Update: Fixed s// </b> / problem mentioned in a reply.

      Thanks, I am starting from text. I tested and got
      Undefined subroutine &main::escaped_entities called at test_highlight.pl line 24.
        Should have been encode_entities, and it's available from HTML::Entities. See my updated code.
Re: Text highlighting (not language syntax)
by zentara (Cardinal) on Jan 23, 2010 at 11:44 UTC