Polyglot has asked for the wisdom of the Perl Monks concerning the following question:

Suppose we wish to highlight search terms in the results returned from a user's query. Suppose further that the user has entered a regular expression, all of which must match for a result to be returned, but which also contains capture groups, and we wish to highlight only the captured terms in the results.

For example:
    my $text = 'This is just an arbitrary example of text.';

    #ONLY TWO WORDS ARE CAPTURED--WE WANT TO HIGHLIGHT ONLY THOSE
    my $query = qr~(?:(?:This)|(?:That)).*?(just).*?(arbitrary).*?$~;

    #THIS WOULD HIGHLIGHT THE ENTIRE LINE
    $text =~ s~($query)~<span class="highlight">$1</span>~g;

How could we change that last line so that it highlights only the text that was actually captured, i.e. "just" and "arbitrary" in this example?


NOTE: For compatibility purposes, it needs to work with Perl 5.12.4. This excludes the use of the @{^CAPTURE} variable that was not made available until Perl 5.25.7.

Blessings,

~Polyglot~

Re: Highlighting only captured groups from larger regex
by Corion (Patriarch) on Dec 12, 2022 at 21:55 UTC

    You should be able to use @- and @+ to find the offsets of the text that matched - these have been around since 5.6.0 or so :)

Re: Highlighting only captured groups from larger regex
by tybalt89 (Monsignor) on Dec 12, 2022 at 22:25 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11148806
    use warnings;

    my $text = 'This is just an arbitrary example of text.';

    #ONLY TWO WORDS ARE CAPTURED--WE WANT TO HIGHLIGHT ONLY THOSE
    my $query = qr~(?:(?:This)|(?:That)).*?(just).*?(arbitrary).*?$~;

    #THIS WOULD HIGHLIGHT THE ENTIRE LINE
    #$text =~ s~($query)~<span class="highlight">$1</span>~g;

    $text =~ $query and do {
        for ( reverse 1 .. $#- ) {
            substr $text, $+[$_], 0, '</span>';
            substr $text, $-[$_], 0, '<span class="highlight">';
        }
    };

    print $text, "\n";

    Outputs:

    This is <span class="highlight">just</span> an <span class="highlight">arbitrary</span> example of text.
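
The loop above runs in reverse so that inserting tags near the end of the string does not shift the offsets of groups that come earlier. A minimal sketch of the same idea wrapped in a reusable sub follows. The name `highlight_captures` and its parameters are hypothetical, not from the thread. It also skips groups that did not take part in the match, whose entries in @- are undef, and it assumes the capture groups are not nested.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wraps each capture group of the
# first match of $re in $text with $open and $close.
# Assumption: capture groups are not nested inside one another.
sub highlight_captures {
    my ( $text, $re, $open, $close ) = @_;
    return $text unless $text =~ $re;

    # Copy the offsets right away; @- and @+ change on the next successful match.
    my @start = @-;
    my @end   = @+;

    # Go from the last group to the first so earlier offsets stay valid.
    for my $i ( reverse 1 .. $#start ) {
        next unless defined $start[$i];    # group did not participate
        substr $text, $end[$i],   0, $close;
        substr $text, $start[$i], 0, $open;
    }
    return $text;
}

my $query = qr~(?:(?:This)|(?:That)).*?(just).*?(arbitrary).*?$~;
print highlight_captures(
    'This is just an arbitrary example of text.',
    $query, '<span class="highlight">', '</span>'
), "\n";
```

Only features available well before Perl 5.12.4 are used here. With nested groups the inner insertion would shift the outer group's end offset, so that case would need the spans sorted and adjusted first.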
      Excellent! I couldn't figure out the "substr" part in the online examples I was seeing, and this particular issue seemed to have an added twist at that. This was what I needed to see. I begin to understand, now, what the "substr" is doing.

      Thank you.

      Blessings,

      ~Polyglot~

        > I couldn't figure out the "substr" part in the online examples

        Please note that you can alternatively use the substr function as an lvalue:

        substr(EXPR, OFFSET, LENGTH) = REPLACEMENT

        which should be easier to figure out.

        Cheers Rolf
        (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
        Wikisyntax for the Monastery
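
To make the lvalue suggestion concrete, here is a small sketch that performs the same insertion twice, once with the four-argument form of substr and once with the lvalue form. The variable names and the `<b>` tag are made up for illustration. Note that the lvalue form needs parentheses around the arguments.

```perl
use strict;
use warnings;

my $four_arg = 'This is just text.';
my $lvalue   = 'This is just text.';

# Four-argument form: replace 0 characters at offset 8, i.e. insert there.
substr $four_arg, 8, 0, '<b>';

# Lvalue form: the same insertion written as an assignment.
substr( $lvalue, 8, 0 ) = '<b>';

print "$four_arg\n";    # This is <b>just text.
print "$lvalue\n";      # This is <b>just text.
```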

Re: Highlighting only captured groups from larger regex
by hippo (Archbishop) on Dec 12, 2022 at 21:46 UTC

    One way would be with @{^CAPTURE}.

    use strict;
    use warnings;
    use Test::More tests => 1;

    my $text = 'This is just an arbitrary example of text.';
    my $query = qr~(?:(?:This)|(?:That)).*?(just).*?(arbitrary).*?$~;
    my @want = (qw/just arbitrary/);

    $text =~ $query;
    is_deeply \@{^CAPTURE}, \@want;

    You will need Perl 5.25.7 or newer for this approach but TIMTOWTDI.


    🦛
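
For Perls older than 5.25.7, a list of captured strings can be rebuilt from @- and @+. The sketch below is only a rough stand-in for @{^CAPTURE}, not an exact equivalent. The name `match_captures` is hypothetical, and groups that did not participate in the match come back as undef.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured substrings of the first match of
# $re in $text, or an empty list if there is no match.
sub match_captures {
    my ( $text, $re ) = @_;
    return unless $text =~ $re;

    my @captures;
    # $#+ is the number of capture groups in the last successful match.
    for my $i ( 1 .. $#+ ) {
        push @captures,
            defined $-[$i]
            ? substr( $text, $-[$i], $+[$i] - $-[$i] )
            : undef;
    }
    return @captures;
}

my @found = match_captures(
    'This is just an arbitrary example of text.',
    qr~(?:(?:This)|(?:That)).*?(just).*?(arbitrary).*?$~
);
print join( ', ', @found ), "\n";    # just, arbitrary
```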

Re: Highlighting only captured groups from larger regex
by rsFalse (Chaplain) on Dec 13, 2022 at 11:53 UTC
    A similar solution, but not with a single regex:
    #!/usr/bin/perl
    use strict;
    use warnings;
    # https://perlmonks.org/?node_id=11148806

    my $text = 'This is arbitrary just an arbitrary just example of text just arbitrary.';
    @_ = qw( ju.. arb.{1,4}ary );

    $text =~ m/
        (?:(?:This)|(?:That))
        .*? (??{ $_[ 0 ] })
        .*? (??{ $_[ 1 ] })
        /x
    and
    $text =~ s/
        (??{ "(*FAIL)" x ! @_ })
        .*? \K
        ( (??{ $_[ 0 ] }) )
        (?{ shift @_ })
        /<<<$1>>>/gx;

    print $text, "\n";
    OUTPUT (excluding warnings):
    This is arbitrary <<<just>>> an <<<arbitrary>>> just example of text just arbitrary.
    P.S. The capture group is redundant.
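
The original substitution used /g, while the offset-based approach earlier in the thread handles only the first match. A possible sketch for highlighting the capture groups of every match is to collect all the spans first and then insert the tags from right to left. The name `highlight_all` and the bracket markers are hypothetical, and non-nested groups are again assumed.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the capture groups of every match of $re in
# $text with $open and $close. Assumes groups are not nested.
sub highlight_all {
    my ( $text, $re, $open, $close ) = @_;

    # First pass: record [start, end] for each group that participated.
    my @spans;
    while ( $text =~ /$re/g ) {
        for my $i ( 1 .. $#- ) {
            push @spans, [ $-[$i], $+[$i] ] if defined $-[$i];
        }
    }

    # Second pass: insert from the rightmost span backwards so that the
    # offsets of earlier spans stay valid.
    for my $span ( sort { $b->[0] <=> $a->[0] } @spans ) {
        substr $text, $span->[1], 0, $close;
        substr $text, $span->[0], 0, $open;
    }
    return $text;
}

print highlight_all( 'a1 b2 a3', qr/a(\d)/, '[', ']' ), "\n";    # a[1] b2 a[3]
```

Collecting the spans before editing keeps the string unchanged while the /g match is still walking through it.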