Chances are that the text file is not in cp1250 as you think, or that an IO layer changed the encoding while the file was being read.

When you open that text file with a hex editor, what are the bytes (or the byte) corresponding to the £?

(If you have a Linux system available, hexdump -C is very helpful.)
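If no hex editor or hexdump is at hand (for example on win32), a few lines of Perl can show the raw bytes instead. This is only a minimal sketch: the helper name hex_bytes is made up for illustration, and the file name is whatever you pass on the command line. The ':raw' layer is used so that no IO layer can alter the bytes on the way in.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bytes of a file as
# space-separated two-digit hex values.
sub hex_bytes {
    my ($path) = @_;
    # ':raw' disables CRLF translation and any encoding layer
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                      # slurp the whole file
    my $bytes = <$fh> // '';
    close $fh;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Usage: perl hexbytes.pl yourfile.txt
print hex_bytes($ARGV[0]), "\n" if @ARGV;
```

For a file that contains nothing but the character in question, the line printed is the byte sequence to compare against the code page tables.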

Update: HTML::Entities does handle decoded strings with high codepoints correctly:

$ perl -MHTML::Entities=encode_entities -wle 'print encode_entities(chr hex "20AC")'
€

Second update: wfsp /msg'ed me that the hexdump showed A3. So let's try to simulate this:

$ perl -we 'print chr(hex "A3")' | perl -MEncode -MHTML::Entities=encode_entities -wle 'my $x = <>; print encode_entities(decode("cp1250", $x))'
&#x141;

So, no additional characters, just a &#x141;, which is the numeric entity for Unicode codepoint U+0141, capital L with stroke. That is what cp1250 assigns to byte A3, i.e. the output is correct.
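To see how much the result depends on the encoding name passed to decode, the same byte can be decoded under more than one code page. The sketch below is only an illustration: the helper name codepoint_of_a3 is made up, and cp1252 is included purely as a comparison, on the assumption that it is a plausible candidate for a win32 text file. It is not something the hexdump alone proves.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode the single byte 0xA3 under the given
# encoding and return the resulting codepoint as U+XXXX.
sub codepoint_of_a3 {
    my ($encoding) = @_;
    my $char = decode($encoding, "\xA3");
    return sprintf 'U+%04X', ord $char;
}

# cp1250 maps A3 to U+0141 (L with stroke),
# cp1252 maps A3 to U+00A3 (pound sign).
printf "%-7s %s\n", $_, codepoint_of_a3($_) for qw(cp1250 cp1252);
```

If the file really holds a £ stored as A3, decoding it as cp1250 will always produce the L with stroke, however correctly the rest of the pipeline behaves.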

So either the additional characters are actually in the file, in which case the output you got is correct, or there's an additional IO layer somewhere that you haven't told us about (probably because you don't know about it).


In reply to Re: win32 txt (with a £) -> decode -> encode_entities -> L with stroke by moritz
in thread win32 txt (with a £) -> decode -> encode_entities -> L with stroke by wfsp
