in reply to ASCII to UTF-8

Use these two functions:
sub latin1_to_utf8 {
    return pack('U*', unpack('C*', $_[0]));
}

sub utf8_to_latin1 {
    return pack('C*', unpack('U*', $_[0]));
}

It would be better to use the utf8:: or Encode:: modules, but the advantage of these two functions is that they work on a standard Perl 5.6 installation, with no need to add new modules. If you are using Perl 5.8, use utf8::. Take a look at the POD, at 'perlunicode' and the 'utf8' manpage (I'm not putting any links here because this changes a lot between Perl versions; see the docs for your release).
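For Perl 5.8 and later, where Encode ships with the core distribution, the same two conversions could be sketched roughly as follows. The function names latin1_to_utf8_encode and utf8_to_latin1_encode are made up for this illustration and are not part of the original post. The sketch assumes the input and output are byte strings rather than Perl character strings.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: Latin-1 (ISO-8859-1) bytes in, UTF-8 bytes out.
sub latin1_to_utf8_encode {
    my ($latin1_bytes) = @_;
    my $chars = decode('ISO-8859-1', $latin1_bytes);  # bytes -> Perl characters
    return encode('UTF-8', $chars);                   # characters -> UTF-8 bytes
}

# Hypothetical helper: UTF-8 bytes in, Latin-1 bytes out.
sub utf8_to_latin1_encode {
    my ($utf8_bytes) = @_;
    my $chars = decode('UTF-8', $utf8_bytes);         # bytes -> Perl characters
    return encode('ISO-8859-1', $chars);              # characters -> Latin-1 bytes
}

# "caf" followed by e-acute (0xE9 in Latin-1)
my $utf8 = latin1_to_utf8_encode("caf\xE9");
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
# prints: 63 61 66 C3 A9
```

Unlike the pack/unpack pair, this will not run on a stock Perl 5.6, so it is only an option if you do not need to support that version.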

Don't use this:
tr///UC;
This doesn't work anymore!

* I made the changes IlyaM suggested!

"The creativity is the expression of the liberty".

Re: Re: ASCII to UTF-8 (with sub ascii_to_utf8 & utf8_to_ascii )
by IlyaM (Parson) on Jul 26, 2002 at 23:27 UTC
    Please s/ascii/latin1/. Converting ASCII to UTF-8 makes no sense: it is a no-op, since ASCII text is already valid UTF-8.

    --
    Ilya Martynov (http://martynov.org/)

Re: Re: ASCII (latin1) to UTF-8 (with sub latin1_to_utf8 & utf8_to_latin1)
by shenme (Priest) on Jul 24, 2003 at 23:41 UTC
    I needed something cheap that would work on both 5.6.x and 5.8.x (I'm trying to keep the code the same while moving back and forth between systems). I tried your utf8_to_latin1() and ran into problems on 5.8.x. The version below seems to work on both. Thanks for the code.

    sub utf8_to_latin1 {
        # return( pack("U0C*", unpack( "U*",$_[0]) ) );
        return( pack( "C*", unpack("U0U*",$_[0]) ) );
    }


    Updated (see sig): So I fed the original (commented-out) line into the big program and it failed, although it had worked in the test program on both 5.6 and 5.8. After wandering all over, I found the 5.8 perluniintro, where they explicitly say

    $native_string = pack("C*", unpack("U*", $Unicode_string));
    just like the original poster. But that didn't behave the same on both Perl versions. I played around some more and hit upon the other variant above, which now works in both the test and 'real' programs on both Perl versions. (sigh)

    --
    I'm a pessimist about probabilities; I'm an optimist about possibilities. - - Lewis Mumford