in reply to Re: HTML::TokeParser, get_text scrambling rsquo and lsquo
in thread HTML::TokeParser, get_text scrambling rsquo and lsquo
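    use HTML::TokeParser;
    use strict;

    # Slurp the whole DATA section into a single string.
    local $/;
    my $lines = <DATA>;

    my $tok_par = HTML::TokeParser->new(\$lines);

    # The first token is the <title> start tag; its first element is the token type.
    my $tok_inf = $tok_par->get_token;
    my $tok_typ = shift @{$tok_inf};
    print "Type: $tok_typ \n";

    # get_text() returns the text up to the next tag.
    my $title = $tok_par->get_text() || "<NO TITLE FOUND>";
    print "Title: $title \n";

    __END__
    <title>egrave: &egrave; : eacute: &eacute; : rsquo: &rsquo; : lsquo: &lsquo;</title>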
I've now tested this at home, and with my web host. At home it works as it should:
Title: egrave: è : eacute: é : rsquo: ’ : lsquo: ‘
At the web host it produces the results previously described:
Title: egrave: è : eacute: é : rsquo: ’ : lsquo: ‘
In case it makes a difference, at home I have:
This is perl, v5.8.8 built for i586-linux-thread-multi
and the web host has:
This is perl, v5.8.5 built for i386-linux-thread-multi
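The perl version alone may not account for the difference, so it could be worth comparing the installed module versions on both machines as well. The following is only a minimal sketch, on the assumption (not confirmed anywhere in this thread) that the two hosts may carry different releases of HTML::Parser, HTML::Entities or HTML::TokeParser. The helper name `module_version` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# or a placeholder string if the module cannot be loaded.
sub module_version {
    my ($module) = @_;
    # Load the module at run time; a missing module is reported, not fatal.
    return 'not installed' unless eval "require $module; 1";
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

# Print the perl version, then one line per module of interest.
printf "perl %vd\n", $^V;
for my $module (qw(HTML::Parser HTML::Entities HTML::TokeParser)) {
    printf "%-18s %s\n", $module, module_version($module);
}
```

Running the same script at home and at the web host gives two short listings that can be compared line by line.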
Do you know whether this difference in behaviour comes from the change between 5.8.5 and 5.8.8? Thank you for any further advice!