I am not 100% convinced that it's really Strawberry's job. Whenever you are working with text input and output, whether through a standard file interface or through a CLI/console, you are responsible for knowing what your file encoding or console encoding/locale is. Modern Linux helps you out with LANG or LC_ALL usually defaulting to a useful UTF-8-based locale, but it's still up to someone in a Linux environment to know what that setting is on their system; similarly, someone on a Windows system using Strawberry Perl should know (or be willing to research how to know) what codepage settings are relevant. Trying to use a character that's not available in the current encoding will always cause problems, Strawberry or not.
However, maybe it would be nice if the debugger on Strawberry (or any Windows perl build) would at least tell you the Win32::GetConsoleCP and Win32::GetConsoleOutputCP values (much like it warns you that it cannot figure out the console) -- I don't think Strawberry customizes perl5db.pl, so I believe that would have to be changed in the main perl source code.
In order to make sure I'm always in UTF-8 mode, I set up the Autorun registry entry for cmd.exe years ago. This conversation prompted me to research, and I have confirmed that I can use the $PROFILE file for PowerShell to do something similar:
- cmd.exe sets both the ConsoleCP and ConsoleOutputCP with the chcp command. This can be made automatic using the registry value HKCU\Software\Microsoft\Command Processor\Autorun = @chcp 65001>NUL for the current user, or the same value under HKLM for all users.
- In PowerShell, using chcp only changes the input encoding, not the output encoding, but there are variables you can set to change the encoding for each. To have this happen automatically for every PowerShell console, create or edit the file that the variable $PROFILE resolves to, and add [Console]::InputEncoding = [Console]::OutputEncoding = New-Object System.Text.UTF8Encoding as a line in that profile.
With those in place, when I launch the debugger from either command line, the codepages are set to 65001 (UTF-8) for both input and output.
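If you want to check the values yourself, something like the following sketch should do it. It uses Win32::GetConsoleCP and Win32::GetConsoleOutputCP from the Win32 module; the sub name console_codepage_report is just something I made up for illustration, and it is not part of perl5db.pl or Strawberry. On a non-Windows system it only prints a reminder to look at the locale variables.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is mine, not from perl5db.pl):
# returns a one-line summary of the console input/output codepages,
# or a short note when not running on Windows.
sub console_codepage_report {
    return "Not on Windows; check LANG / LC_ALL instead"
        unless $^O eq 'MSWin32';

    require Win32;    # loaded only on Windows, where it is available

    my $in  = Win32::GetConsoleCP();          # console input codepage
    my $out = Win32::GetConsoleOutputCP();    # console output codepage

    # 65001 is the Windows codepage number for UTF-8
    my $utf8 = ($in == 65001 && $out == 65001) ? "yes" : "no";

    return sprintf "ConsoleCP=%d ConsoleOutputCP=%d (both UTF-8: %s)",
        $in, $out, $utf8;
}

print console_codepage_report(), "\n";
```

Running it from a cmd.exe or PowerShell window before and after applying the settings above should show whether the Autorun entry or the $PROFILE line took effect.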