PerlMonks  

transliterating.. char-conversation

by raptor (Sexton)
on Aug 08, 2001 at 00:18 UTC ( [id://102893]=perlquestion )

raptor has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I had to convert several characters from the German alphabet to a different representation, and until now I have used something like this:
sub blah {
    $_ = shift;
    s/ä/ae/g; s/Ä/AE/g;
    s/ö/oe/g; s/Ö/OE/g;
    s/ü/ue/g; s/Ü/UE/g;
    s/ß/ss/g;
    $_
}

But now I need a more complicated conversion, so I started rewriting it (see the code below). After thinking about it a little, I decided to ask first: maybe there is an easier way. This language-dependent stuff has always been a black box to me. I am reading perllocale and POSIX to clear things up a little; maybe you can help me.
This is written from scratch and I have not even tested it. On NT it does not seem to work (going to a Linux console :") )
$tstr = 'abc äÄöÖüÜß test';
%subsTable = (
    'ä' => 'ae', 'Ä' => 'AE',
    'ö' => 'oe', 'Ö' => 'OE',
    'ü' => 'ue', 'Ü' => 'UE',
    'ß' => 'ss'
);
$notInSQL = qr{'|`|´|"|,|%|$};
$inSQL = qr{`|´|"|,|$};

sub blah {
    my $str = shift;
    my $regex = join "|", map quotemeta, keys %subsHash;
    print $regex,"\n";
    $regex = qr/$regex/;
    $str =~ s/($regex)/$subsHash{$1}/g;
    # if ($str =~ /^\s*(SELECT|INSERT|UPDATE|DELETE)/i ) { $str =~ s/$inSQL/ /g }
    # else { $str =~ s/$notInSQL/ /g }; # maybe have to use tr/// here
    return $str
};

print "orig : $tstr\n";
print blah($tstr)."\n";
These are not all the characters to be converted; I'm expecting more to come...
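The commented-out cleanup step in the code above wonders whether tr/// would fit. A minimal sketch of that idea follows, assuming the goal is only to turn each listed character into a space. The helper name `strip_unwanted` is hypothetical, and the character list is copied from `$notInSQL`. Unlike a regex, tr/// takes its character list literally, so `$` needs no escaping there.

```perl
use strict;
use warnings;
use utf8;    # the source below contains the literal character ´

# Hypothetical helper: replace each unwanted character with a space.
# The character list is the one from $notInSQL in the question.
sub strip_unwanted {
    my ($str) = @_;

    # Work on a copy so the caller's string is left alone. The single
    # space in the replacement list is reused for every listed character.
    (my $clean = $str) =~ tr/$'`´",%/ /;
    return $clean;
}

binmode STDOUT, ':encoding(UTF-8)';
print strip_unwanted(q{it's 100%, "ok" $5}), "\n";
```

This only illustrates the tr/// mechanics; whether replacing these characters is enough for the SQL case in the question is not something the sketch settles.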

Replies are listed 'Best First'.
(Ovid) Re: transliterating.. char-conversation
by Ovid (Cardinal) on Aug 08, 2001 at 00:32 UTC

    Without looking at your code too closely, I see some areas for improvement. As a good rule of thumb, don't alternate on single characters. By adding strict (and switching the alternation to character classes for a performance boost), I got your script to work:

    use strict;

    my $tstr = 'abc äÄöÖüÜß test';
    my %subsTable = (
        'ä' => 'ae', 'Ä' => 'AE',
        'ö' => 'oe', 'Ö' => 'OE',
        'ü' => 'ue', 'Ü' => 'UE',
        'ß' => 'ss'
    );
    my $notInSQL = qr{['`´",%\$]};
    my $inSQL = qr{[`´",\$]};

    sub blah {
        my $str = shift;
        my $regex = join "", map { quotemeta $_ } keys %subsTable;
        $regex = "[$regex]";
        print "Regex: $regex\n";
        $str =~ s/($regex)/$subsTable{$1}/g;
        return $str
    };

    print "orig : $tstr\n";
    print blah($tstr)."\n";

    When I added strict, I immediately discovered that you had referred to %subsTable as %subsHash in your subroutine. By switching the variable name, it worked fine.
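To see how strict catches this kind of typo, here is a small sketch that compiles a snippet at run time and inspects the failure. The snippet and its variable names are illustrative only; it repeats the mismatch between `%subsTable` and `%subsHash`. The exact wording of the error message varies between Perl versions, so only the start of it is shown in the comment.

```perl
use strict;
use warnings;

# Compile a small snippet at run time so the failure can be inspected
# instead of aborting this script. The hash is declared as %subsTable
# but read as %subsHash, as in the question.
my $snippet = <<'END_CODE';
use strict;
my %subsTable = ( a => 'b' );
my $value = $subsHash{a};
1;
END_CODE

my $ok    = eval $snippet;   # undef when compilation fails
my $error = $@;              # begins: Global symbol "%subsHash" requires ...

print $ok ? "compiled\n" : "strict rejected the snippet: $error";
```

Without `use strict` the same snippet compiles silently, and `%subsHash` is simply treated as an empty package variable, which is why the original substitution had nothing to match.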

    Cheers,
    Ovid


      Thanks ...
      I'm doing this in a write-and-run script, and later it will all be moved into a module, where I always use 'strict' of course :"). But I see it is needed even in write-and-run scripts.
      Thanks for the character classes, I forgot about them :")
Re: transliterating.. char-conversation
by raptor (Sexton) on Aug 08, 2001 at 00:27 UTC
    Oops, my mistake:
    it has to be %subsHash instead of %subsTable... :")
    But never mind; better solutions are always welcome,
    as is some discussion about locales and friends...
