
First, an interesting point about the regex: within a character class, \b matches a backspace rather than a word boundary. So [\W\b] will match either a non-word character or a backspace character, and since backspace is a non-word character anyway, the \b adds nothing.
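A quick check of that point, as a minimal sketch (the variable names are made up for illustration):

```perl
use strict;
use warnings;

my $backspace = "\x08";

# Inside a character class, \b means the backspace character (ASCII 8).
my $class_matches_backspace = $backspace =~ /[\b]/ ? 1 : 0;

# Outside a class, \b is a zero-width word boundary.
my $boundary_matches = "foo bar" =~ /foo\b/ ? 1 : 0;

# Backspace is already a non-word character, so \W alone covers it.
my $nonword_matches_backspace = $backspace =~ /\W/ ? 1 : 0;

print "$class_matches_backspace $boundary_matches $nonword_matches_backspace\n";
```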

I would actually use lookbehind and lookahead to make the replacement simpler:

s,(?<!\w)($worda)(\W+)($wordb)(?!\w),<B>$1</B>$2<B>$3</B>,i

Next, I'm trying to figure out what makes s|||g; s|||g; necessary. Because your regex only allows non-word characters between word A and word B, and <B> and </B> each contain a word character, once bold tags are put around a word, that word should never be matched by your regex again. Ah... unless your material may already contain some bold tags before you do any of the substitutions. Then you could end up with doubled tags to remove.
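Here is a small sketch of that reasoning. The words and sample text are hypothetical, and I am assuming the bold tags are <B> and </B>. The second pass finds nothing, while material that is already tagged picks up doubled tags:

```perl
use strict;
use warnings;

# Hypothetical words and text, chosen only for illustration.
my ($worda, $wordb) = ('foo', 'bar');
my $pattern = qr/(?<!\w)($worda)(\W+)($wordb)(?!\w)/i;

my $text = 'a Foo, bar b';

# The first pass tags both words; the second pass no longer matches,
# because "</B>, <B>" between the words contains the word character "B".
my $first_pass  = ($text =~ s{$pattern}{<B>$1</B>$2<B>$3</B>}) ? 1 : 0;
my $second_pass = ($text =~ s{$pattern}{<B>$1</B>$2<B>$3</B>}) ? 1 : 0;

# Material that already carries bold tags ends up with doubled tags.
my $pretagged = '<B>foo bar</B>';
$pretagged =~ s{$pattern}{<B>$1</B>$2<B>$3</B>};

print "$text\n$first_pass $second_pass\n$pretagged\n";
```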

Finally, here's how I would try to do this more efficiently. I would combine @material into a single string, perform the substitutions, and then split back to @material.

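# Join the pieces with "\0", a character assumed never to occur in @material.
my $material = join "\0", @material;

foreach my $phrase (@key_phrases) {
    my ($worda, $wordb) = split / /, $phrase;

    # [^\w\0]+ keeps a match from spanning two pieces of @material.
    $material =~ s{(?<!\w)($worda)([^\w\0]+)($wordb)(?!\w)}
                  {<B>$1</B>$2<B>$3</B>}i;
}

# A limit of -1 keeps trailing empty pieces that split would otherwise drop.
@material = split /\0/, $material, -1;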

As you can see, I'm using "\0" as a temporary divider between pieces of @material; I've updated the regex to make sure matches don't overlap two pieces.
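To see why the "\0" exclusion matters, here is a rough comparison on made-up data where word A ends one piece and word B begins the next. With plain \W+ the divider counts as a non-word character and the match straddles two pieces. With [^\w\0]+ it cannot:

```perl
use strict;
use warnings;

# Hypothetical data: "foo" ends one piece and "bar" begins the next.
my @material = ('ends with foo', 'bar starts here', 'foo bar inside');
my ($worda, $wordb) = ('foo', 'bar');

# Plain \W+ accepts the "\0" divider, so the match crosses two pieces.
my $loose = join "\0", @material;
$loose =~ s{(?<!\w)($worda)(\W+)($wordb)(?!\w)}{<B>$1</B>$2<B>$3</B>}i;
my @loose = split /\0/, $loose, -1;

# Excluding "\0" from the gap confines each match to a single piece.
my $tight = join "\0", @material;
$tight =~ s{(?<!\w)($worda)([^\w\0]+)($wordb)(?!\w)}{<B>$1</B>$2<B>$3</B>}i;
my @tight = split /\0/, $tight, -1;

print "$_\n" for @loose, @tight;
```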

I considered also building a single regex to match all the key phrases, but since each phrase appears only once I don't know if that would be more efficient.
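For what it's worth, a combined regex might look something like the sketch below. This is untested against the real data and I make no claim that it is faster. The sample data is made up, \Q...\E quoting is my addition, and the branch reset (?|...) needs Perl 5.10 or later. Note that /g tags every occurrence, whereas the loop tags only the first occurrence of each phrase; if each phrase really appears only once, the result should be the same.

```perl
use strict;
use warnings;

# Hypothetical data for illustration only.
my @material    = ('the red fox ran', 'a lazy dog sat');
my @key_phrases = ('red fox', 'lazy dog');

# One alternative per phrase; \Q...\E quotes any regex metacharacters.
my $alternatives = join '|', map {
    my ($first, $second) = split / /, $_;
    "(\Q$first\E)([^\\w\\0]+)(\Q$second\E)";
} @key_phrases;

# (?|...) is a branch reset: every alternative fills $1, $2 and $3.
my $combined = qr/(?<!\w)(?|$alternatives)(?!\w)/i;

my $joined = join "\0", @material;
$joined =~ s{$combined}{<B>$1</B>$2<B>$3</B>}g;
@material = split /\0/, $joined, -1;

print "$_\n" for @material;
```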

In reply to Re: foreach (@array) s/x/y/ efficiency by chipmunk
in thread foreach (@array) s/x/y/ efficiency by gryphon
