in reply to replacement for deprecated encoding pragma
Sorry, encoding of a string is a thing that has to be made explicit, and refPrint returns byte strings with no encoding information instead of Unicode strings containing wide characters that Mojolicious seems to be expecting. Are you allowed to change refPrint to return wide strings? Write a wrapper like sub refPrintW { decode utf8 => refPrint(@_) }?
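For what it's worth, a minimal sketch of such a wrapper. The refPrint below is only a hypothetical stand-in that returns UTF-8 bytes (the real one is your legacy function), and the name refPrintW is just the suggestion from above:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for the legacy refPrint: it returns UTF-8 *bytes*
# (here: six Cyrillic letters, two bytes each).
sub refPrint {
    return "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";
}

# Wrapper that hands back a Perl character string instead of bytes.
sub refPrintW {
    return decode('UTF-8', refPrint(@_));
}

my $bytes = refPrint();
my $chars = refPrintW();

# The same data, counted once as bytes and once as characters.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

The template would then be fed the result of refPrintW, while the old refPrint stays untouched for any caller that still wants bytes.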
Re^2: replacement for deprecated encoding pragma
by BillKSmith (Monsignor) on Jul 19, 2021 at 20:35 UTC
|
I suspect that you have it backwards. The function refPrint returns a string of HTML encoded as UTF-8. The template expects a normal Perl character string. (The function decode decodes the UTF-8 into the required Perl character string.) I do not understand why, but it seems that STDOUT must print the UTF-8 characters to the terminal without any further encoding.
If you really can salvage your legacy software this easily, why not do it? Be sure to document the need for it and how you found it.
|
I am confused about this encode/decode business. I thought that decode converts regular characters to utf8, because I thought that, for some reason, &refPrint doesn't pass the string in utf8 (although I thought it did). But if I understood you correctly, &refPrint returns utf8, decode converts utf8 to a perl character string, and mojolicious expects a perl character string and not utf8. Is that why the terminal shows it correctly, but the mojolicious result shows up scrambled if not decoded?
In other words, decode converts specified character code to perl character string?
|
In other words, decode converts specified character code to perl character string?
Exactly. decode converts a string of bytes into a string of Perl characters. Perl characters have no encoding (well, of course they have some representation deep inside, but that's nothing to care about). The terminal, as everything else outside the Perl world, communicates over byte strings, and terminals nowadays expect UTF-8 encoded byte strings.
Interfaces within Perl should, unless explicitly documented otherwise, exchange Perl characters.
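A small sketch of that boundary rule, assuming the bytes come in as UTF-8 (the variable names are made up for the illustration): decode on the way in, work with characters, encode on the way out.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from outside (file, socket, legacy function):
# the UTF-8 encoding of U+00E9 (e with acute accent), followed by "!".
my $octets = "\xC3\xA9!";

# decode: bytes -> Perl characters.
my $text = decode('UTF-8', $octets);

# Inside the program we work with characters.
my $shout = uc $text;    # U+00C9 followed by "!"

# encode: Perl characters -> bytes, right before they leave the program.
my $out = encode('UTF-8', $shout);

printf "in: %d bytes, text: %d chars, out: %d bytes\n",
    length($octets), length($text), length($out);
```

The byte count and the character count differ as soon as anything outside ASCII is involved, which is usually the first hint that a decode or encode is missing somewhere.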
|
Although clearly not true, it is convenient to think of Perl strings as "unencoded". By default, perl treats your source code as LATIN-1: when your source is loaded, the bytes of literal strings are 'decoded' one byte per character. As long as your editor produces LATIN-1, you can use any character supported by LATIN-1 with no special Perl code. Likewise for I/O: without any layers, perl treats all input as LATIN-1 and decodes it, and output is encoded into LATIN-1 (a character that does not fit into LATIN-1 draws a "Wide character" warning). Today, we are often forced to deal with files encoded in UTF-8. In new code, we use the pragma use utf8 to tell perl that the source code is in UTF-8 (literal strings should be decoded from that rather than the default). We can specify I/O 'layers' on our I/O 'handles' to tell them to do the necessary encode/decode as part of I/O. We rarely need anything else. Older versions of perl were quite different. Note that users of pure ASCII do not care about these issues because ASCII characters are encoded the same way in both schemes.
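A sketch of both pieces, use utf8 for the source and layers for I/O. It assumes the script file itself is saved as UTF-8, and it uses an in-memory file as a stand-in for a real UTF-8 encoded input file:

```perl
use strict;
use warnings;
use utf8;    # this source file is saved as UTF-8

# Output layer: characters are encoded as UTF-8 on the way to the terminal.
binmode(STDOUT, ':encoding(UTF-8)');

my $word = 'Привет';    # a character string, thanks to "use utf8"
print length($word), " characters: $word\n";

# Input layer: bytes are decoded while reading. The scalar below holds the
# same word as raw UTF-8 bytes and plays the role of a file on disk.
my $bytes = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82\n";
open(my $in, '<:encoding(UTF-8)', \$bytes) or die "open failed: $!";
my $line = <$in>;
close $in;
chomp $line;

print $line eq $word ? "same characters\n" : "different\n";
```

With the layers in place there is no explicit call to encode or decode anywhere in the program.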
|
Re^2: replacement for deprecated encoding pragma
by igoryonya (Pilgrim) on Jul 20, 2021 at 03:25 UTC
|
So, what you are saying is that there is no replacement for an automatic "use encoding..." pragma whatsoever?
Perl developers deprecated it and replaced it with nothing?
|
It has not been replaced by nothing. What has been taken away, without replacement, is the feature to write use encoding 'ISO-8859-5'; and then use Cyrillic characters in that encoding in character literals of your source code. But you didn't do that, you declared UTF-8, and there is a replacement for that:
There is use utf8; which is the equivalent of use encoding 'utf8';
Note that neither of those (to be precise: since Perl 5.8.2) affects how your program reads and writes text: they declare that your source code is encoded as UTF-8. So they mostly affect string literals in your code, but not templates or anything else your program reads from or prints to.
If you want to set a default encoding, have a look at open. If you write use open ':encoding(UTF-8)'; then every filehandle opened by a call to open within the lexical scope of the open pragma gets a UTF-8 encoding layer, unless that call specifies its own layers.
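A sketch of the open pragma at work. The temporary file and the added :std (which extends the same layer to STDIN, STDOUT and STDERR) are my choices for the example, not something your code necessarily needs:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Default layers for every open() in this lexical scope; ":std" applies
# the same layer to the standard handles as well.
use open qw(:std :encoding(UTF-8));

# Six Cyrillic characters, written with escapes so the source stays ASCII.
my $text = "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442}";

# Scratch file, only used to get a path to write to.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# No explicit layer in either open: the pragma supplies it.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} $text;
close $out;

open(my $in, '<', $path) or die "Cannot read $path: $!";
my $back = <$in>;
close $in;

printf "%d characters read back, %d bytes on disk\n", length($back), -s $path;
```

The pragma is lexically scoped, so code in other files or modules keeps whatever defaults it had before.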