eff_i_g has asked for the wisdom of the Perl Monks concerning the following question:
```perl
use warnings;
use strict;
use Encode;
use XML::Twig;
use HTML::Table;

undef $/;
my $html = <DATA>;

my $twig = XML::Twig->new(
    input_filter => sub {
        my $txt = shift;
        return decode('Windows-1252', $txt);
    },
    output_text_filter       => 'safe_hex',
    pretty_print             => 'indented',
    twig_print_outside_roots => 1,
    twig_roots               => {
        'h1' => sub {
            ### If 1:
            ###   - "Wide character in print..."
            ###   - TM is *not* converted to &#x2122; by output_text_filter
            ### If 0:
            ###   - No character complaints
            ###   - TM *is* converted
            if (1) {
                my $new_elt = XML::Twig::Elt->new('div');
                my $table   = HTML::Table->new([[1,2]]);
                $new_elt->set_inner_html($table->getTable());
                $new_elt->paste(before => $_);
            }
            $_->flush();
        }
    },
);

$twig->parse_html($html);

__DATA__
<html>
<body>
<h1>Such and Such™</h1>
</body>
</html>
```

The result with 1:

```
<?xml version="1.0" encoding="iso-8859-1"?><html><head></head><body>
<div>
  <table>
    <tbody>
      <tr>
        <td>1</td>
        <td>2</td>
      </tr>
    </tbody>
  </table>
</div>
Wide character in print at /usr/local2/lib/perl_aps/XML/Twig.pm line 7662, <DATA> chunk 1.
<h1>Such and Suchâ¢</h1>
</body>
</html>
```

The result with 0:

```
<?xml version="1.0" encoding="iso-8859-1"?><html><head></head><body>
<h1>Such and Such&#x2122;</h1>
</body>
</html>
```

Q: Why don't I get (and how can I get) the following?

```
<?xml version="1.0" encoding="iso-8859-1"?><html><head></head><body>
<div>
  <table>
    <tbody>
      <tr>
        <td>1</td>
        <td>2</td>
      </tr>
    </tbody>
  </table>
</div>
<h1>Such and Such&#x2122;</h1>
</body>
</html>
```

Thanks, this has been puzzling me all afternoon!
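For context, the "Wide character in print" warning quoted in the question can be reproduced without XML::Twig. The following is a minimal sketch using only core Perl; it illustrates the general mechanism (a decoded character above 0xFF printed to a filehandle that has no encoding layer) and is not necessarily the cause or the fix identified in the replies. The variable names and the hand-written substitution are illustrative assumptions, not part of XML::Twig's API.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Byte 0x99 is the trademark sign in Windows-1252.
my $bytes = "Such and Such\x99";
my $text  = decode('Windows-1252', $bytes);    # now holds U+2122

# Collect warnings instead of letting them reach STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# An in-memory filehandle with no encoding layer: printing a character
# above 0xFF triggers the warning and writes the character's UTF-8 bytes.
my $raw = '';
open my $fh, '>', \$raw or die "Cannot open in-memory handle: $!";
print {$fh} $text;
close $fh;

# Hypothetical ASCII-only escape, similar in spirit to a hex
# character-reference filter: every non-ASCII character becomes &#xHHHH;.
(my $escaped = $text) =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord $1)/ge;

print scalar(@warnings), " warning(s) captured\n";
print "$escaped\n";    # Such and Such&#x2122;
```

If the escaped string is what gets printed, no wide character ever reaches the filehandle, which is why the warning and the unconverted trademark sign tend to show up together.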
Replies are listed 'Best First'.
Re: XML::Twig, HTML::Table, and wide characters
by ikegami (Patriarch) on Jan 15, 2009 at 22:37 UTC
Re: XML::Twig, HTML::Table, and wide characters
by mirod (Canon) on Jan 16, 2009 at 14:32 UTC
by eff_i_g (Curate) on Jan 16, 2009 at 15:58 UTC
Re: XML::Twig, HTML::Table, and wide characters
by eff_i_g (Curate) on Jan 16, 2009 at 18:27 UTC