in reply to Parsing UTF-8 HTML w/ HTML::Parser
You're double encoding it as UTF-8. Here's a working snippet to play with:
    use warnings;
    use strict;
    use HTML::TreeBuilder;
    use WWW::Mechanize;
    use Encode;

    my $mech = WWW::Mechanize->new( agent => "iEatYourFaceBot/666" );
    $mech->get("http://www.businesswire.com/portal/site/qsr/permalink/?ndmViewId=news_view&newsId=20100622005402");

    # Build the tree from the fetched content and find the <title> element.
    my $html  = HTML::TreeBuilder->new_from_content($mech->content);
    my $title = $html->look_down( sub { $_[0]->tag() eq 'title' } );

    # Encode exactly once, on output.
    print encode_utf8($title->as_text), $/;

    __DATA__
    yields:
    Chicagoland and Northwest Indiana McDonald’s® Offer a Free Taste of McCafé at the Taste of Chicago
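If you want to see what double encoding does in isolation, here's a minimal sketch. It doesn't touch the network. The sample string is made up for illustration (it just borrows the "é" from the title above), and the variable names are mine, not from your code.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical sample text: "é" is a single character, U+00E9.
my $chars = "McCaf\x{e9}";

my $once  = encode_utf8($chars);  # correct: "é" becomes the two bytes C3 A9
my $twice = encode_utf8($once);   # double encoding: each of those bytes is encoded again

printf "characters:    %d\n",       length $chars;
printf "encoded once:  %d bytes\n", length $once;
printf "encoded twice: %d bytes\n", length $twice;

# Decoding the doubly encoded bytes once only gets you back to the
# singly encoded bytes, not to the original characters.
print "one decode undoes one encode\n" if decode_utf8($twice) eq $once;
```

The doubly encoded version is what typically shows up as "Ã©" instead of "é" when viewed as UTF-8. The usual fix is to decode once on input, work with characters, and encode once on output.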