Re^3: convert a string(which contains the contents of a file) into UTF-8 encoding

No modules needed. Pass the names of the input and output files on the command line; the script reads them into $qfn_in and $qfn_out.

#!/usr/bin/perl

use strict;
use warnings;

@ARGV == 2
    or die("usage: latin_to_utf8 infile outfile\n");

my ($qfn_in, $qfn_out) = @ARGV;

open(my $fh_in, '<:encoding(iso-8859-1)', $qfn_in)
    or die("Can't open \"$qfn_in\": $!\n");

open(my $fh_out, '>:encoding(UTF-8)', $qfn_out)
    or die("Can't create \"$qfn_out\": $!\n");

print $fh_out $_ while <$fh_in>;

Re^4: convert a string(which contains the contents of a file) into UTF-8 encoding by perlkamal (Initiate) on Oct 18, 2009 at 23:55 UTC
Hi, actually the above code works fine in Perl 5.6, but in Perl 5.8 it is not able to convert the copyright and trademark symbols into UTF-8. Please advise. Regards, kamalakar
Re^5: convert a string(which contains the contents of a file) into UTF-8 encoding by ikegami (Patriarch) on Oct 20, 2009 at 01:23 UTC
While iso-latin-1 includes the Copyright symbol (©, U+00A9), it doesn't include the Trademark symbol (™, U+2122). Since the Trademark symbol can't be represented in iso-latin-1, it can't be converted from iso-latin-1 to UTF-8. Maybe you are using Microsoft's derivative of iso-latin-1, cp1252? Update: I initially stated the Copyright symbol wasn't in iso-latin-1 either. Fixed.
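
To see the difference between the two encodings, here is a minimal sketch. It assumes the input file really is cp1252, which hasn't been confirmed in this thread, and the two sample bytes are made up for illustration. It uses the core Encode module (bundled with Perl since 5.8) to decode the same bytes both ways:

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Encode qw( decode encode );

# Hypothetical sample input: two raw bytes, 0xA9 and 0x99.
# 0xA9 is the Copyright symbol in both encodings.
# 0x99 is the Trademark symbol in cp1252, but a C1 control in iso-8859-1.
my $bytes = "\xA9\x99";

my $latin1 = decode('iso-8859-1', $bytes);
my $cp1252 = decode('cp1252',     $bytes);

# List the code points each decoding produced.
my @latin1_cps = map { sprintf('U+%04X', ord($_)) } split(//, $latin1);
my @cp1252_cps = map { sprintf('U+%04X', ord($_)) } split(//, $cp1252);

print "iso-8859-1: @latin1_cps\n";   # iso-8859-1: U+00A9 U+0099
print "cp1252:     @cp1252_cps\n";   # cp1252:     U+00A9 U+2122

# Encode the cp1252 decoding as UTF-8 and show the resulting bytes.
my $utf8     = encode('UTF-8', $cp1252);
my $utf8_hex = join(' ', map { sprintf('%02X', ord($_)) } split(//, $utf8));

print "UTF-8:      $utf8_hex\n";     # UTF-8:      C2 A9 E2 84 A2
```

If that matches what is in your file, changing the input layer in the script above from '<:encoding(iso-8859-1)' to '<:encoding(cp1252)' should be enough.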