in reply to Re: How do I regex for characters like ¾, ¼ ?
in thread How do I regex for characters like ¾, ¼ ?

Thank you for the reply. The file I am processing is an HTML file, and I am using HTML::Parser to process it. I am also having trouble understanding your helpful reply. I am using HTML::Parser like so ...
my $p = HTML::Parser->new(
    api_version     => 3,
    start_h         => [\&start, "tagname, attr"],
    end_h           => [\&end,   "tagname"],
    text_h          => [\&text,  "dtext"],
    marked_sections => 1,
);

# Parse directly from file
$p->parse_file($inputFile);
... so I have a sub text() that looks like this ...
sub text {
    my($origtext, $is_cdata) = @_;

    if ( $origtext =~ /^\s*$/ ) {
        return;
    }

    $origtext = "UNDEF" if !defined $origtext;
    $is_cdata = "UNDEF" if !defined $is_cdata;

    $origtext =~ s/ \& / \&amp; /g;
    $origtext =~ s/½/\&frac12;/g;

    print $origtext;
}
... so when should I do the "local $/;" call?

Re^3: How do I regex for characters like ¾, ¼ ?
by Anonymous Monk on Sep 10, 2007 at 04:19 UTC
    print HTML::Entities::encode( $origtext );
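
    For anyone reading later, here is a minimal sketch of what that one-liner does, using the documented `encode_entities` function from HTML::Entities. The sample string is made up for illustration, and the sketch assumes the text holds decoded characters (as a "dtext" handler normally receives) rather than raw UTF-8 bytes.

    ```perl
    use strict;
    use warnings;
    use HTML::Entities qw(encode_entities);

    # Hypothetical sample text; \x{BD} is the character "½" (U+00BD),
    # written as an escape so the source file's encoding does not matter.
    my $origtext = "\x{BD} Old & \x{BD} New Ale";

    # With no second argument, encode_entities() replaces control characters,
    # characters above 0x7F, and the characters < & > ' " with entities.
    my $encoded = encode_entities($origtext);

    print "$encoded\n";    # &frac12; Old &amp; &frac12; New Ale
    ```

    Inside the text handler this would stand in for the hand-written s/// substitutions, since it covers ¼, ¾ and the other named entities in one call.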
      Thank you masked Anonymous Monk! That does the trick.
Re^3: How do I regex for characters like ¾, ¼ ?
by GrandFather (Saint) on Sep 10, 2007 at 04:10 UTC

    You don't do the "local $/;" bit. That was just to provide a stand-alone chunk of demo code using a temporary file. A sample closer to your actual issue looks like:

    use strict;
    use warnings;
    use HTML::Parser;

    open OUT, '>', 'delme1.txt';
    print OUT <<STR;
    <html><head></head>
    <body>
    <p>The use of the Porter and Ale is more prevalent in England. In the United States ½ Old and ½ New Ale is usually used when this drink is called for, unless otherwise specified.</p>
    </body>
    </html>
    STR
    close OUT;

    my $p = HTML::Parser->new(
        api_version => 3,
        text_h      => [\&text, "dtext"],
    );

    $p->parse_file ('delme1.txt');

    sub text {
        my($origtext) = @_;
        $origtext =~ s/½/\&frac12;/g;
        print $origtext;
    }

    Prints:

    The use of the Porter and Ale is more prevalent in England. In the United States &frac12; Old and &frac12; New Ale is usually used when this drink is called for, unless otherwise specified.

    but still doesn't show the problem you are experiencing. Perhaps you can modify the sample until it does show the error?


    DWIM is Perl's answer to Gödel