Re^2: convert a string(which contains the contents of a file) into UTF-8 encoding

Re^3: convert a string(which contains the contents of a file) into UTF-8 encoding by ikegami (Patriarch) on Sep 29, 2009 at 23:42 UTC
No modules needed. Just pass the name of the input file and the name of the output file as arguments; the script stores them in `$qfn_in` and `$qfn_out`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

@ARGV == 2
   or die("usage: latin_to_utf8 infile outfile\n");

my ($qfn_in, $qfn_out) = @ARGV;

# Decode the input as iso-8859-1 while reading.
open(my $fh_in, '<:encoding(iso-8859-1)', $qfn_in)
   or die("Can't open \"$qfn_in\": $!\n");

# Encode the output as UTF-8 while writing.
open(my $fh_out, '>:encoding(UTF-8)', $qfn_out)
   or die("Can't create \"$qfn_out\": $!\n");

print $fh_out $_ while <$fh_in>;
```
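If the file contents are already sitting in a string, as the thread title suggests, the same conversion can be sketched with the core `Encode` module (bundled with Perl since 5.8). The helper name `latin1_to_utf8` is made up for this illustration, and the sketch assumes the string holds raw iso-8859-1 bytes that have not been decoded yet.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical helper: takes a string of raw iso-8859-1 bytes and
# returns the same text as UTF-8 encoded bytes.
sub latin1_to_utf8 {
   my ($bytes) = @_;
   my $chars = decode('iso-8859-1', $bytes);   # bytes -> Perl characters
   return encode('UTF-8', $chars);             # characters -> UTF-8 bytes
}

# e-acute (0xE9) and the copyright sign (0xA9) as single bytes.
my $latin1 = "caf\xE9 \xA9 2009";
my $utf8   = latin1_to_utf8($latin1);

printf("%d bytes in, %d bytes out\n", length($latin1), length($utf8));
```

Each non-ASCII iso-8859-1 character takes two bytes in UTF-8, so the output string is longer than the input.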
Re^4: convert a string(which contains the contents of a file) into UTF-8 encoding by perlkamal (Initiate) on Oct 18, 2009 at 23:55 UTC
Hi, Actually the above code works fine in Perl 5.6, but in Perl 5.8 it fails to convert the copyright and trademark symbols to UTF-8. Please advise. Regards, kamalakar
Re^5: convert a string(which contains the contents of a file) into UTF-8 encoding by ikegami (Patriarch) on Oct 20, 2009 at 01:23 UTC
While iso-latin-1 includes the Copyright symbol (©, U+00A9), it doesn't include the Trademark symbol (™, U+2122). Seeing as it's impossible to represent the latter in iso-latin-1, it's impossible to convert it from iso-latin-1 to UTF-8. Maybe you are using Microsoft's derivative of iso-latin-1, cp1252? Update: I initially stated the Copyright symbol wasn't in iso-latin-1 either. Fixed.
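To illustrate the difference, here is a small sketch that decodes the byte 0x99 (the Trademark symbol in cp1252) under both encodings and shows the resulting UTF-8 bytes. The helper name `utf8_hex_via` is invented for this example, and whether the original poster's file really is cp1252 is only a guess; if it is, changing the input layer in the script above to `'<:encoding(cp1252)'` should be enough.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical helper: decode $bytes using $enc, re-encode as UTF-8,
# and return the UTF-8 bytes as space-separated hex.
sub utf8_hex_via {
   my ($enc, $bytes) = @_;
   my $chars = decode($enc, $bytes);
   my $utf8  = encode('UTF-8', $chars);
   return join(' ', map { sprintf('%02X', ord($_)) } split(//, $utf8));
}

my $byte = "\x99";   # Trademark in cp1252; a C1 control character in iso-8859-1

for my $enc ('iso-8859-1', 'cp1252') {
   printf("%-10s -> %s\n", $enc, utf8_hex_via($enc, $byte));
}
```

Decoded as iso-8859-1, the byte becomes U+0099 (UTF-8 `C2 99`), an invisible control character; decoded as cp1252 it becomes U+2122 (UTF-8 `E2 84 A2`), the Trademark symbol.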