larimar123 has asked for the wisdom of the Perl Monks concerning the following question:
Code that works, but not with my special characters:
Trial 1 RegEx using full character name:
    #!/usr/bin/perl
    use utf8;
    use charnames ':full';
    while ($line=<>){
        @array = split(/ /, $line);
        foreach $x (@array){
            if ($x=~ /\N{MODIFIER LETTER GLOTTAL STOP}/){ #full character name for a glottal stop
                print "$x\n";
            }
        }
    }

B. This works:

    #!/usr/bin/perl
    use utf8;
    use charnames ':full';
    while ($line=<>){
        @array = split(/ /, $line);
        foreach $x (@array){
            if ($x=~ /\N{LATIN SMALL LETTER K}/){ #full character name for a 'k'
                print "$x\n";
            }
        }
    }
    #!/usr/bin/perl
    use utf8;
    use charnames ':full';
    while ($line=<>){
        @array = split(/ /, $line);
        foreach $x (@array){
            if ($x=~ /\x{02c0}/){ #code for glottal stop
                print "$x\n";
            }
        }
    }

B: Trial that actually works ('k' instead of glottal stop):

    #!/usr/bin/perl
    use utf8;
    use charnames ':full';
    while ($line=<>){
        @array = split(/ /, $line);
        foreach $x (@array){
            if ($x=~ /\x{006b}/){ #code for lower case 'k'
                print "$x\n";
            }
        }
    }
    #!/usr/bin/perl
    use utf8;
    use charnames ':full';
    while ($line=<>){
        @array = split(/ /, $line);
        foreach $x (@array){
            if ($x=~ /\N{MIDDLE DOT}/){
                print "$x\n";
            }
        }
    }

Correct Output:

    perl oneidaregex.pl oneidafull.txt
    Né·
    niwakkaló·tʌ.
    niyawʌ́·u.
    Né·
    kwí·
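One likely explanation for the difference, assuming oneidafull.txt is UTF-8 encoded, is that `<>` hands the script raw bytes rather than characters: `use utf8` only affects the source file, not the input. The sketch below illustrates this with a hard-coded byte string. The helper `words_with_glottal_stop` and the sample words are made up for the illustration and are not taken from the original file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use charnames ':full';    # \N{...} names (needed on older perls)
use Encode qw(decode);

# Hypothetical helper: return the space-separated words containing U+02C0.
sub words_with_glottal_stop {
    my ($text) = @_;
    return grep { /\N{MODIFIER LETTER GLOTTAL STOP}/ } split / /, $text;
}

# In a UTF-8 file, U+02C0 is stored as the two bytes 0xCB 0x80.
# This is what a line looks like when read without an input layer.
my $raw = "ka\xCB\x80 tak";

# On raw bytes the pattern finds nothing: the string holds 0xCB and 0x80,
# not the single character U+02C0.
my @from_raw = words_with_glottal_stop($raw);

# After decoding, the two bytes become one character and the pattern matches.
my $decoded      = decode('UTF-8', $raw);
my @from_decoded = words_with_glottal_stop($decoded);

binmode(STDOUT, ':encoding(UTF-8)');    # encode characters again on output
print scalar(@from_raw),     " match(es) on raw bytes\n";
print scalar(@from_decoded), " match(es) after decoding\n";
print "$_\n" for @from_decoded;
```

In the scripts above, the same decoding could be requested for the files read by `<>` with a line such as `use open qw(:std :encoding(UTF-8));` near the top, again assuming UTF-8 input. This would also explain why the 'k' and MIDDLE DOT trials print something without it: 'k' is plain ASCII, and U+00B7 is stored as the bytes 0xC2 0xB7, whose second byte happens to equal the code point the pattern looks for.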
Replies are listed 'Best First'.
Re: Regular Expressions on Unicode
by moritz (Cardinal) on Dec 14, 2009 at 09:48 UTC
by larimar123 (Initiate) on Dec 14, 2009 at 10:47 UTC
by moritz (Cardinal) on Dec 14, 2009 at 19:19 UTC
by Anonymous Monk on Dec 14, 2009 at 15:09 UTC
by Anonymous Monk on Dec 14, 2009 at 15:20 UTC
by ikegami (Patriarch) on Dec 14, 2009 at 18:25 UTC

Re: Regular Expressions on Unicode
by ikegami (Patriarch) on Dec 14, 2009 at 18:42 UTC

Re: Regular Expressions on Unicode
by graff (Chancellor) on Dec 14, 2009 at 22:08 UTC