in reply to How to tokenize string by custom dictionary?
The regex engine is already trie-optimized: an alternation of literal strings is compiled into a trie.
Now that I'm back at a desktop computer, let's try it...
Here is the session preformatted, so that the Unicode characters display correctly:
DB<105> %names =(
'纯ちゃん' => 2,
'周杰倫' => 57,
'Alex Fong' => 100,
)
DB<106> $input=q{"Esther Kwan, 纯ちゃん | Alex Fong (Hong Kong) / Joe Smith ; Fong 周杰倫 Ferenc Kállai"}
DB<107> $regex = join '|', keys %names
DB<108> @matches = ( $input =~ /($regex)/g )
DB<110> print join ",", @matches
纯ちゃん,Alex Fong,周杰倫
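As a standalone script, the session above might look like the following sketch. Two things here are my additions and not part of the debugger session: the keys go through `quotemeta`, so a name containing regex metacharacters is matched literally, and they are sorted longest first, so a longer name is tried before a shorter name that happens to be its prefix. It also assumes the script file itself is saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;                              # this source file contains UTF-8 literals
binmode STDOUT, ':encoding(UTF-8)';    # print characters, not raw bytes

# The dictionary from the session: name => id
my %names = (
    '纯ちゃん'  => 2,
    '周杰倫'    => 57,
    'Alex Fong' => 100,
);

my $input = q{"Esther Kwan, 纯ちゃん | Alex Fong (Hong Kong) / Joe Smith ; Fong 周杰倫 Ferenc Kállai"};

# Assumption (not in the original session): escape each key and put
# longer keys first, so alternation order cannot hide a longer name.
my $regex = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) or $a cmp $b }
    keys %names;

# In list context, /g returns every captured match in input order.
my @matches = $input =~ /($regex)/g;

print join(',', @matches), "\n";
print join(',', map { "$_=$names{$_}" } @matches), "\n";
```

The first line printed is the same list as in the session. The second line looks each match up in `%names`, which is usually what you want from a dictionary tokenizer.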
Reading your task description again, I doubt that your teacher will accept this approach as homework! =)
Cheers Rolf
(addicted to the Perl Programming Language)