shawnhcorey has asked for the wisdom of the Perl Monks concerning the following question:
What is the best way to get Perl to use UTF-8 for everything, except when I explicitly state otherwise? I was using use encoding qw( UTF-8 ); but it is deprecated as of Perl 5.18. I am using this as a stop-gap:
use open qw( :encoding(utf8) );
binmode STDIN, qw{ :encoding(UTF-8) };
binmode STDOUT, qw{ :encoding(UTF-8) };
binmode STDERR, qw{ :encoding(UTF-8) };
Surely, there must be a more elegant way.
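One commonly cited shorter form of the stop-gap above is the :std subpragma of the open pragma, which applies the chosen layer to STDIN, STDOUT and STDERR as well as to later open() calls in the same lexical scope. What follows is a minimal sketch, not a definitive answer. The use utf8 line is an assumption that the source file itself is saved as UTF-8, which the question does not state, and opening $0 is only there to show an explicit override.

```perl
use strict;
use warnings;
use utf8;                              # assumption: this source file is saved as UTF-8
use open qw( :std :encoding(UTF-8) );  # default layer for open() plus STDIN, STDOUT, STDERR

# An explicit layer on a single open() overrides the default,
# which covers the "except when I explicitly state otherwise" case.
open my $raw, '<:raw', $0 or die "Cannot open $0: $!";
close $raw;

print "caf\x{e9} \x{263A}\n";          # encoded as UTF-8 on the way out
```

Note that the pragma is lexically scoped and only affects I/O layers. It does not decode command-line arguments or environment variables, so "everything" still needs a little extra handling.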
And a related question: In the HTML::TreeBuilder documentation, it says, "When you pass a filename to "parse_file", HTML::Parser opens it in binary mode, which means it's interpreted as Latin-1 (ISO-8859-1). If the file is in another encoding, like UTF-8 or UTF-16, this will not do the right thing."
What would be a good replacement for HTML::TreeBuilder, keeping in mind that not all HTML pages are XML compliant?
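Short of replacing the module, the binary-mode behaviour quoted above can be sidestepped by opening the file yourself with an encoding layer and passing the filehandle, rather than the filename, to parse_file. The sketch below assumes the page really is UTF-8 and that HTML::TreeBuilder is installed. The input path is taken from the command line and is purely illustrative.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical input: the path of an HTML file known to be UTF-8.
my $file = shift @ARGV or die "usage: $0 FILE.html\n";

# Decode while reading, so the parser receives characters, not raw bytes.
open my $fh, '<:encoding(UTF-8)', $file or die "Cannot open $file: $!";

my $tree = HTML::TreeBuilder->new;
$tree->parse_file($fh);                # a filehandle is accepted in place of a filename
close $fh;

binmode STDOUT, ':encoding(UTF-8)';
print $tree->as_text, "\n";

$tree->delete;                         # release the tree explicitly
```

If the encoding is not known in advance, it would have to be detected before choosing the layer. That step is left out here.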
Re: UTF-8 for Everything
by Jim (Curate) on Jul 07, 2013 at 05:15 UTC
by locked_user sundialsvc4 (Abbot) on Jul 07, 2013 at 14:54 UTC

Re: UTF-8 for Everything
by Khen1950fx (Canon) on Jul 06, 2013 at 13:37 UTC
by shawnhcorey (Friar) on Jul 06, 2013 at 13:44 UTC
by tobyink (Canon) on Jul 06, 2013 at 15:00 UTC
by Anonymous Monk on Jul 06, 2013 at 23:08 UTC

Re: UTF-8 for Everything
by duelafn (Parson) on Jul 06, 2013 at 20:16 UTC
by shawnhcorey (Friar) on Jul 07, 2013 at 01:02 UTC

Re: UTF-8 for Everything
by vsespb (Chaplain) on Jul 06, 2013 at 21:40 UTC

Re: UTF-8 for Everything
by perl-diddler (Chaplain) on Jul 07, 2013 at 02:28 UTC

Re: UTF-8 for Everything
by locked_user sundialsvc4 (Abbot) on Jul 06, 2013 at 18:38 UTC