PerlMonks  

Re^3: replacement for deprecated encoding pragma

by igoryonya (Pilgrim)
on Jul 20, 2021 at 03:20 UTC


in reply to Re^2: replacement for deprecated encoding pragma
in thread replacement for deprecated encoding pragma

I am confused about this encode/decode business. I thought that decode converts regular characters to UTF-8, because I assumed that, for some reason, &refPrint wasn't passing the string in UTF-8 (although I had thought it did). But if I understand you correctly, &refPrint returns UTF-8, decode converts that UTF-8 into a Perl character string, and Mojolicious expects a Perl character string, not UTF-8. Is that why the terminal shows the text correctly, but the Mojolicious result is scrambled unless the string is decoded?

In other words, decode converts specified character code to perl character string?

Re^4: replacement for deprecated encoding pragma
by haj (Vicar) on Jul 20, 2021 at 06:32 UTC
    In other words, decode converts specified character code to perl character string?

    Exactly. decode converts a string of bytes into a string of Perl characters. Perl characters have no encoding (well, of course they have some representation deep inside, but that's nothing to care about). The terminal, as everything else outside the Perl world, communicates over byte strings, and terminals nowadays expect UTF-8 encoded byte strings.

    Interfaces within Perl should, unless explicitly documented otherwise, exchange Perl characters.
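
To make the bytes-versus-characters distinction concrete, here is a minimal sketch using the core Encode module. The sample string and the variable names are made up for illustration; they do not come from the code discussed in this thread.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from outside Perl (file, socket, terminal).
# "\xC3\xA9" is the UTF-8 encoding of U+00E9 (e with acute accent).
my $bytes = "caf\xC3\xA9";

# decode: byte string -> Perl character string
my $chars = decode('UTF-8', $bytes);

# The same text is 5 bytes but only 4 characters.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);

# encode: Perl character string -> byte string, done at the output boundary
my $out = encode('UTF-8', $chars);
print "round trip ok\n" if $out eq $bytes;
```

This should print "bytes: 5, characters: 4" followed by "round trip ok". The usual pattern is to decode once on input, work with character strings inside the program, and encode once on output.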

Re^4: replacement for deprecated encoding pragma
by BillKSmith (Monsignor) on Jul 20, 2021 at 15:18 UTC
    Although clearly not true, it is convenient to think of perl strings as "unencoded". By default, perl assumes that your source code is in LATIN-1. When your source is loaded, literal strings are 'decoded'. As long as your editor produces LATIN-1, you can use any character supported by LATIN-1 with no special Perl code. Likewise for I/O. Perl assumes that all input is LATIN-1 and decodes it. All output is encoded into LATIN-1. Today, we are often forced to deal with files encoded in UTF-8. In new code, we use the pragma use utf8 to tell perl that the source code is in UTF-8 (literal strings should be decoded from that rather than the default). We can specify I/O 'layers' on our I/O 'handles' to tell them to do the necessary encode/decode as part of I/O. We rarely need anything else. Older versions of perl were quite different. Note that users of pure ASCII do not care about these issues because ASCII characters are encoded the same way in both schemes.
    Bill
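
The use utf8 pragma and I/O layers mentioned above can be sketched roughly as follows. This assumes the script itself is saved as UTF-8. An in-memory filehandle stands in for a real file so the example is self-contained, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use utf8;   # the source file is UTF-8, so string literals are decoded

my $text = "bête";   # 4 characters once "use utf8" is in effect

# Write through an encoding layer; the layer encodes on output.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open: $!";
print {$out} $text;
close($out);

# Read back through the same layer; the layer decodes on input.
open(my $in, '<:encoding(UTF-8)', \$buffer) or die "open: $!";
my $read = <$in>;
close($in);

printf "characters: %d, encoded bytes: %d\n", length($text), length($buffer);
print "same text\n" if $read eq $text;
```

If the file really is saved as UTF-8, this should report 4 characters and 5 encoded bytes. With the layers in place, no explicit encode or decode calls are needed in the program itself.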

      By default, perl assumes that your source code is in LATIN-1

      By default, perl assumes that your source code is in ASCII. (Though string literals are 8-bit clean.)

      $ printf 'use utf8; sub bête { } say "ok";' | perl -M5.010
      ok
      $ printf 'sub bête { } say "ok";' | iconv -t iso-8859-1 | perl -M5.010
      Illegal declaration of subroutine main::b at - line 1.

      Seeking work! You can reach me at ikegami@adaelis.com
