wk has asked for the wisdom of the Perl Monks concerning the following question:
I have a little script that should strip every non-alphabetic character from text-file input and then pass the remaining letters, line by line, to another function. I tried [[:alpha:]] and \w/\W, but both handle diacritics (i.e. umlaut characters) correctly in one context and not in another. So I made a simple example script to show my point:
#!/usr/bin/perl
use strict;
use utf8;
binmode STDIN, ":utf8";
binmode STDOUT, ":utf8";

my $str = "See on üks täppidega lause!";
# sample string, want to get rid of spaces and exclamation mark,
# all other are alphas, and i really need them

# first printout ( how [[:alpha:]] works)
foreach my $mrk ( split(//, $str) ) {
    if ($mrk =~ /^[[:alpha:]]$/ ) {
        print "$mrk";
    }
}
print "\n\n";
# end of first printout

# second printout ( how [[:alpha:]] doesn't works)
$str =~ s/[[:^alpha:]]//ig;
print "$str\n";
# end of second printout

exit(0);
__END__
First output:
Seeonükstäppidegalause
Second should be also same, but is:
Seeonkstppidegalause
Why doesn't the substitution know that characters with diacritics are also alphabetic? Or is there something wrong in my code? I can see a workaround (writing my own replace, for example), but I'd like to get it working the standard way.
Perl is v5.8.8, in Kubuntu 8.04.
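For comparison, here is a minimal sketch of one possible workaround, assuming the Unicode property \P{L} ("not a letter") is acceptable in place of the negated POSIX class. The helper name strip_non_letters is hypothetical and only for illustration. It does not explain why [[:^alpha:]] behaves differently in the substitution; it only avoids that construct.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                      # string literals in this file are UTF-8
binmode STDOUT, ":utf8";

# strip_non_letters is a hypothetical helper, not part of the script above.
# It returns a copy of its argument with every non-letter character removed.
sub strip_non_letters {
    my ($text) = @_;
    # \P{L} matches any character that is NOT a Unicode letter,
    # so spaces and punctuation go while letters such as ü and ä stay.
    (my $letters = $text) =~ s/\P{L}//g;
    return $letters;
}

my $str = "See on üks täppidega lause!";
print strip_non_letters($str), "\n";
```

The substitution works on a copy, so the original string stays available if the caller still needs it.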
TIA,