I am not 100% convinced that it's really Strawberry's job. Whenever you are working with text input and output, whether through a standard file interface or through a CLI/console, you are responsible for knowing what your file encoding or console encoding/locale is. Modern Linux helps you out, with LANG or LC_ALL usually defaulting to a useful UTF-8-based locale, but it's still up to the person in a Linux environment to know what that setting is on their system. Similarly, someone on a Windows system using Strawberry Perl should know (or be willing to research how to find out) which codepage settings are relevant. Trying to use a character that's not available in the current encoding will always cause problems, Strawberry or not.
However, maybe it would be nice if the debugger on Strawberry (or any Windows perl build) would at least tell you the Win32::GetConsoleCP and Win32::GetConsoleOutputCP values, much like it warns you that it cannot figure out the console. I don't think Strawberry customizes perl5db.pl, so I believe that would have to be changed in the main perl source code.
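For illustration, here is a minimal sketch of the kind of report I have in mind. It is not a patch to perl5db.pl; the helper name format_console_cp_report is made up for this example, and the only real API calls are Win32::GetConsoleCP and Win32::GetConsoleOutputCP from the Win32 module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl5db.pl): build a one-line report
# from the input and output code page numbers.
sub format_console_cp_report {
    my ($input_cp, $output_cp) = @_;
    return sprintf('Console code pages: input=%d, output=%d',
                   $input_cp, $output_cp);
}

if ($^O eq 'MSWin32') {
    # The Win32 module only exists on Windows perls, so load it at run time.
    require Win32;
    my $report = format_console_cp_report(
        Win32::GetConsoleCP(),
        Win32::GetConsoleOutputCP(),
    );
    print "$report\n";
}
else {
    print "Not a native Windows perl; check LANG / LC_ALL instead.\n";
}
```

Running this from the same console you launch the debugger from shows which code pages that console is actually using.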
In order to make sure I'm always in UTF-8 mode, I set up the Autorun registry entry for cmd.exe years ago. This conversation prompted me to research, and I have confirmed that I can use the $PROFILE file for powershell to do something similar:
- cmd.exe sets both the ConsoleCP and ConsoleOutputCP with the chcp command. This can be made automatic using the registry value HKCU\Software\Microsoft\Command Processor\Autorun = @chcp 65001>NUL for the current user, or the same value under HKLM for all users.
- In PowerShell, using chcp only changes the input encoding, not the output encoding, but there are variables you can set to change the encoding for each. To have this happen automatically for every PowerShell console, create or edit the file that the variable $PROFILE resolves to, and add [Console]::InputEncoding = [Console]::OutputEncoding = New-Object System.Text.UTF8Encoding as a line in that profile.
With those in place, when I launch the debugger from either shell, the codepages are set to 65001 (UTF-8) for both input and output.
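If you want to confirm that the setup took effect, something like the following could be run from each shell. This is only a sketch: console_is_utf8 and the CP_UTF8 constant are names I made up for the example, and the check simply compares both Win32 code page values against 65001.

```perl
use strict;
use warnings;

use constant CP_UTF8 => 65001;    # Windows code page number for UTF-8

# Hypothetical helper: true only when both code pages are UTF-8.
sub console_is_utf8 {
    my ($input_cp, $output_cp) = @_;
    return ($input_cp == CP_UTF8() && $output_cp == CP_UTF8()) ? 1 : 0;
}

if ($^O eq 'MSWin32') {
    # Load Win32 at run time so the script still compiles elsewhere.
    require Win32;
    my $in  = Win32::GetConsoleCP();
    my $out = Win32::GetConsoleOutputCP();
    if (console_is_utf8($in, $out)) {
        print "Console is UTF-8 for both input and output.\n";
    }
    else {
        warn "Console code pages are input=$in, output=$out; "
           . "expected 65001 for both.\n";
    }
}
```

If the warning appears in one shell but not the other, that points at which of the two startup settings did not apply.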