in reply to Highlighting Regex Hits

An alternative is to use split, which has an (IMO) seldom-used feature: with capturing parentheses in the pattern, it returns both what was matched and what was in between:
my $line = 'This line 1 has a hit here and a hit there.';
my $word = 'hit';
my $count = 0;
my $n = 0;
my @stuff = split m/($word)/, $line;
grep { $n++; if ($n % 2) { print $_; } else { print RED, $_, RESET; $count++; } } @stuff;
print "\nFound $count times.\n";
I found that this scales better than running m// or s/// through a while loop on big strings. It is also handy if you need to return a modified string (split + join) instead of printing its parts.
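The split + join variant mentioned above could look roughly like the sketch below. The helper name highlight and its $open/$close arguments are made up for illustration, and \Q...\E is added on the assumption that the search word should be treated as literal text.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every match of $word in $line with $open/$close
# and return the rebuilt string plus the number of hits.
sub highlight {
    my ($line, $word, $open, $close) = @_;

    # The capturing parentheses make split return the separators too, so
    # the list alternates: text, match, text, match, ...
    # \Q...\E (quotemeta) makes $word match as literal text.
    my @parts = split /(\Q$word\E)/, $line;

    my $count = 0;
    for my $i (0 .. $#parts) {
        next unless $i % 2;              # odd indices hold the matches
        $parts[$i] = $open . $parts[$i] . $close;
        $count++;
    }
    return (join('', @parts), $count);
}

my ($marked, $count) = highlight(
    'This line 1 has a hit here and a hit there.', 'hit', '[', ']');
print "$marked\nFound $count times.\n";
```

Because the pieces are modified in place and joined at the end, the caller gets the whole string back and can decide what to do with it.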

Re^2: Highlighting Regex Hits
by ww (Archbishop) on May 15, 2010 at 15:05 UTC

    Couple quibbles:

    1. re "not often used": this is actually fairly common; it's been cited in at least two nodes in the past couple of days
    2. and re print RED,... my 5.10.1 under *n*x pukes on this (sees "RED" as a filehandle, illegally followed by a comma). Since ikegami has already referred OP to the docs on ANSI, please take this merely as an explanation of why I've used square-brackets rather than colorizing (we won't mention "lazy" here).
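The "filehandle" complaint is what Perl typically reports when RED and RESET have not been imported. A minimal sketch, assuming the constants are meant to come from Term::ANSIColor's :constants export tag (the helper name paint is made up for illustration):

```perl
use strict;
use warnings;
use Term::ANSIColor qw(:constants);   # exports RED, RESET, and friends

# Hypothetical helper: return $line with each literal $word wrapped in
# the red/reset escape sequences.
sub paint {
    my ($line, $word) = @_;
    my ($n, $out) = (0, '');
    for my $piece (split m/(\Q$word\E)/, $line) {
        # Pieces alternate: plain text, match, plain text, ...
        $out .= ($n++ % 2) ? join('', RED, $piece, RESET) : $piece;
    }
    return $out;
}

print paint('This line 1 has a hit here and a hit there.', 'hit'), "\n";
```

With the import in place RED is a known subroutine, so print RED, $_, RESET is parsed as a list of values rather than as a filehandle followed by a comma.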

    But a more substantive issue (perhaps) lurks in your split, where your version will match "hit," "Hitachi" (if case is ignored), and many others, including the vulgar word below (at Note 1):

    #!/usr/bin/perl
    use strict;
    use warnings;
    # 840126

    my @line = ('Not here: line 1',
                'This line 2 has a hit here and a hit there.',
                'hit me, hit me, bust me in line 3!',
                "Don't throw a shitfit over that hit in line 4.",   # *Note 1
                'Line 5: my search-word does not exist here.');

    my $word = 'hit';
    my $total_count = 0;

    for my $line(@line) {
        my $count = 0;
        my $n = 0;

        my @stuff = split m/(\b$word\b)/, $line;
        # grep { $n++; if ($n % 2) { print $_; } else { print RED, $_, RESET; $count++; } }
        grep { $n++; if ($n % 2) { print $_; } else { print "\t[ $_ ]"; $count++; } } @stuff;
        print "\nFound $count times in the preceding line.\n";
        $total_count += $count;
    }
    print "Total count: $total_count\n";

    =head execution:

    ww@GIG:~/pl_test$ perl 840126.pl
    Not here: line 1
    Found 0 times in the preceding line.
    This line 2 has a [ hit ] here and a [ hit ] there.
    Found 2 times in the preceding line.
    [ hit ] me, [ hit ] me, bust me in line 3!
    Found 2 times in the preceding line.
    Don't throw a shitfit over that [ hit ] in line 4.
    Found 1 times in the preceding line.
    Line 5: my search-word does not exist here.
    Found 0 times in the preceding line.
    Total count: 5
    ww@GIG:~/pl_test$

    =cut

    IOW, I may have overwritten this, but using the word boundary metacharacter, \b, to restrict your matches (as in the split on my line 19) is often a good idea.
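To see the effect of \b in isolation, here is a small sketch that counts hits on the line 4 sample with and without the boundaries. The helper name count_hits is made up for illustration, and it assumes the pattern passed in contains no capture groups of its own.

```perl
use strict;
use warnings;

# Hypothetical helper: count the matches that a capturing split reports.
# Assumes $re has no capture groups of its own.
sub count_hits {
    my ($re, $line) = @_;
    my @parts = split /($re)/, $line;
    return int(@parts / 2);    # matches sit at the odd indices
}

my $line = "Don't throw a shitfit over that hit in line 4.";
my $word = 'hit';

printf "plain:   %d\n", count_hits(qr/\Q$word\E/,     $line);
printf "bounded: %d\n", count_hits(qr/\b\Q$word\E\b/, $line);
```

The unbounded pattern also counts the "hit" inside the longer word, while the bounded one reports only the standalone word. Note that \b behaves this way only when the search word itself begins and ends with word characters.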