in reply to [SOLVED] -Help encode_entities doesn't seem to work
In addition to what haukex wrote: there are some items in your code which you might want to inspect.
The string literal "àéèçûîùô" can be stored in your source file as either UTF-8 or iso-latin-1. If you save the source file containing that literal as UTF-8, then you must also use utf8; if you save it as iso-latin-1, you must not use utf8.
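To illustrate the difference, here is a minimal sketch. It uses escapes instead of a literal "é" so that it behaves the same no matter how you save the file; the variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# What Perl sees for a UTF-8 saved "é" when "use utf8;" is missing:
# two separate bytes, 0xC3 and 0xA9.
my $bytes = "\xC3\xA9";

# What Perl sees for the same literal with "use utf8;" in effect:
# one character, U+00E9. decode() does here what the pragma does for literals.
my $chars = decode('UTF-8', $bytes);

printf "without use utf8: length %d\n", length $bytes;
printf "with use utf8:    length %d, code point U+%04X\n",
    length $chars, ord $chars;
```

The first string has length 2, the second has length 1, although both look like "é" in an editor that displays UTF-8.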
Applied to your problem: In your simple example, both the list of characters given to encode_entities and the string to encode are in the same file, so they have the same encoding. In your longer program, the list of characters to encode is in your source code, but the string to encode comes from your text file. Therefore there is a chance of a mismatch, and as a first guess which explains your symptoms I'd say that your source file is saved with UTF-8 encoding but without use utf8;.
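A small sketch of such a mismatch, assuming HTML::Entities is installed. The strings are written with escapes and are made up for the demonstration; they are not taken from your program.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Text as it arrives after proper decoding: "café" with one character U+00E9.
my $text = "caf\x{E9}";

# The "unsafe characters" list as Perl sees it when the source is saved
# as UTF-8 but "use utf8;" is missing: the two characters U+00C3 and U+00A9.
my $list_without_pragma = "\xC3\xA9";

# The same list with "use utf8;" in effect: the single character U+00E9.
my $list_with_pragma = "\x{E9}";

my $mismatch = encode_entities($text, $list_without_pragma);
my $match    = encode_entities($text, $list_with_pragma);

print "mismatch: ", ($mismatch eq $text ? "unchanged" : "encoded"), "\n";
print "match:    $match\n";
```

In the mismatch case the character class never contains U+00E9, so encode_entities silently leaves the string alone, which looks exactly like "doesn't seem to work".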
You should also be aware that Excel files in those "old" formats are usually not UTF-8 encoded. You don't need to take care of this because Spreadsheet::ParseExcel does it for you. But you do need to take care of the format you are writing: By not specifying an encoding for both your text and XML files, you get Perl's default, iso-latin-1 encoding. That doesn't hurt for the text file, since for the characters in question you should not see warnings like "Wide character in print". On the other hand, XML files are UTF-8 encoded, by default or, in your case, by explicit declaration. To get that right, you should open the XML file like this:
open (XML, ">:encoding(UTF8)", $xml) || die("Could not open file! $xml: '$!'");

BTW: In contrast to haukex, I don't think that Devel::Peek or the UTF-8 flag are very helpful in hunting down these problems unless you are programming at the XS level.
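For the open call recommended above, here is a self-contained sketch that writes one accented character through an encoding layer and then checks the bytes on disk. It uses a lexical filehandle and the strict "UTF-8" spelling, which are my choices; the temporary file name and the XML content are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical output path; in your program this would be your $xml.
my (undef, $xml) = tempfile(SUFFIX => '.xml');

# Write characters; the layer turns them into UTF-8 bytes.
open(my $out, '>:encoding(UTF-8)', $xml)
    or die "Could not open file! $xml: '$!'";
print {$out} qq{<?xml version="1.0" encoding="UTF-8"?>\n<w>caf\x{E9}</w>\n};
close($out) or die "Could not close $xml: '$!'";

# Read the file back as raw bytes to see what was actually written.
open(my $in, '<:raw', $xml) or die "Could not open $xml: '$!'";
my $raw = do { local $/; <$in> };
close($in);

print "UTF-8 bytes on disk\n" if $raw =~ /caf\xC3\xA9/;
unlink $xml;
```

Without the encoding layer the same print would write the single byte 0xE9, which contradicts the encoding="UTF-8" declaration in the file.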