in reply to Re^3: phrase match
in thread phrase match
That is useful sometimes, but here it's not needed, because a lookahead is enough.
Run this:
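    use strict;
    use warnings;

    my $sentence = 'kinase inhibitor SET6 activates p16(INK4A) in cell-wall.';
    my @phrases  = ('kinase i', 'inhibitor', 'tor SET6', 'SET6', 'p16(INK4A)', 'cell');

    # Escape regex metacharacters (such as the parentheses) and build one alternation.
    my $phrases_re = join '|', map { quotemeta } @phrases;

    # Match a phrase only when it is delimited by spaces or the string boundaries.
    # The lookahead checks the trailing space without consuming it.
    $sentence =~ s/(^| )($phrases_re)(?= |$)/$1#$2#/g;
    print $sentence, "\n";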
You get the output
kinase #inhibitor# #SET6# activates #p16(INK4A)# in cell-wall.
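To see why the lookahead matters, here is a small sketch that compares it with a version that consumes the trailing space. The sub names `mark_consuming` and `mark_lookahead` are made up for this illustration, and the phrase list is shortened.

```perl
use strict;
use warnings;

my @phrases    = ('inhibitor', 'SET6');
my $phrases_re = join '|', map { quotemeta } @phrases;

# Consumes the trailing space, so that space is no longer available
# as the leading separator of the phrase that follows.
sub mark_consuming {
    my ($s) = @_;
    $s =~ s/(^| )($phrases_re)( |$)/$1#$2#$3/g;
    return $s;
}

# Only peeks at the trailing space, leaving it for the next match.
sub mark_lookahead {
    my ($s) = @_;
    $s =~ s/(^| )($phrases_re)(?= |$)/$1#$2#/g;
    return $s;
}

print mark_consuming('kinase inhibitor SET6 activates'), "\n";
# kinase #inhibitor# SET6 activates
print mark_lookahead('kinase inhibitor SET6 activates'), "\n";
# kinase #inhibitor# #SET6# activates
```

In a single pass, the consuming version skips every second phrase in a run of adjacent phrases.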
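Update: Just as a curiosity, there are ways to do this kind of thing without lookaheads or lookbehinds. Without a lookahead, each match consumes its trailing space, so a phrase that immediately follows another one has no leading space left to match. You can work around that either by running the substitution twice or by doubling the spaces first. Replace the substitution statement above with either

    $sentence =~ s/(^| )($phrases_re)( |$)/$1#$2#$3/g for 0, 1;

or

    use 5.010;
    # given is experimental on newer Perls; for ($sentence) { ... } aliases $_ the same way.
    given ($sentence) {
        s/ /  /g;                                 # double every space
        s/(^| )($phrases_re)( |$)/$1#$2#$3/g;     # each match takes one space per side
        s/  / /g;                                 # collapse the pairs again
    }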
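The space-doubling idea can be traced on a shorter sentence. This is only a sketch: the sub name `mark_doubled` is hypothetical, and it binds each substitution to a lexical variable instead of using `given`.

```perl
use strict;
use warnings;

my $phrases_re = join '|', map { quotemeta } ('inhibitor', 'SET6');

sub mark_doubled {
    my ($s) = @_;
    $s =~ s/ /  /g;                               # "kinase  inhibitor  SET6  activates"
    $s =~ s/(^| )($phrases_re)( |$)/$1#$2#$3/g;   # "kinase  #inhibitor#  #SET6#  activates"
    $s =~ s/  / /g;                               # back to single spaces
    return $s;
}

print mark_doubled('kinase inhibitor SET6 activates'), "\n";
# kinase #inhibitor# #SET6# activates
```

Because every separator is doubled, adjacent phrases no longer compete for the same space: one copy ends the first match and the other starts the next.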
Update: One more alternative is below.
    my %phrase;
    $phrase{$_}++ for @phrases;

    # The capturing group keeps the runs of spaces, so join "" restores them.
    my @sentence = split /( +)/, $sentence;
    for (@sentence) {
        $phrase{$_} and $_ = "#" . $_ . "#";
    }
    $sentence = join "", @sentence;
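One caveat with the split version: it compares whole space-delimited tokens against the hash, so a multi-word phrase can never match. A minimal sketch of that behaviour, with the hypothetical sub name `mark_tokens`:

```perl
use strict;
use warnings;

# Wrap every space-delimited token that is exactly one of the phrases.
sub mark_tokens {
    my ($sentence, @phrases) = @_;
    my %phrase = map { $_ => 1 } @phrases;
    # Capturing the separators keeps the original spacing intact.
    my @tokens = split /( +)/, $sentence;
    for (@tokens) {
        $_ = "#" . $_ . "#" if $phrase{$_};
    }
    return join "", @tokens;
}

print mark_tokens('a kinase inhibitor here', 'inhibitor', 'kinase inhibitor'), "\n";
# a kinase #inhibitor# here
```

Only `inhibitor` is marked here. The two-word phrase `kinase inhibitor` is never a single token, so it is ignored.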
Update: Oh, let's not forget this one either.
$sentence =~ s/(?<![^ ])($phrases_re)(?= |$)/#$1#/g;
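The double negative in `(?<![^ ])` reads as "not preceded by a non-space character", which holds both at the start of the string and right after a space. A short sketch, using the hypothetical sub name `mark_lookaround`:

```perl
use strict;
use warnings;

# Mark $re only where it is delimited by spaces or the string boundaries,
# without consuming either delimiter.
sub mark_lookaround {
    my ($s, $re) = @_;
    $s =~ s/(?<![^ ])($re)(?= |$)/#$1#/g;
    return $s;
}

print mark_lookaround('SET6 and xSET6 and SET6', 'SET6'), "\n";
# #SET6# and xSET6 and #SET6#
```

Since neither delimiter is consumed, nothing besides the phrase itself needs to be captured and put back.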
Re^5: phrase match
by JadeNB (Chaplain) on Dec 13, 2009 at 18:29 UTC