Well, these may not be in core, but for one thing database operations have always been tricky, and (to my knowledge) no "flag" at the top of one's code ever solved that, e.g. with DBI or DBD::mysql. The coder might think of database input/output as part of the program's overall I/O for encoding purposes, but it isn't treated that way and must be dealt with separately. The handoff between Perl and the DB has to ensure that both sides agree on the encoding. For the programmer, keeping track of whether a particular item had already been encoded or decoded was always a burden, since it was quite possible to do either one twice, and Perl would happily allow it (with dastardly results).

Then there are other modules such as CGI. CGI was in core, but it was never UTF-8 by default; it had to be given special instructions to enable and/or convert to UTF-8 for such things as HTML form input/output. There seem to be many hidden gotchas in coding for Unicode, which is why the coder must stay alert for them throughout the process. "Wide characters" tend to show up when least expected, and can really make a confusing mess of things.
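
To make the "encoding twice" hazard concrete, here is a minimal sketch using only the core Encode module. The variable names are my own, and the hard-coded byte string merely stands in for whatever a database driver or form handler might hand back; it is an illustration of the pitfall, not a recipe for any particular driver.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption for illustration: raw input arrives as UTF-8 bytes.
# These two bytes are the UTF-8 encoding of "é" (U+00E9).
my $bytes = "\xC3\xA9";

# Decode once at the input boundary: bytes become one character.
my $chars = decode('UTF-8', $bytes);

# Encode once at the output boundary: back to the same two bytes.
my $once = encode('UTF-8', $chars);

# The mistake: encoding bytes that were already encoded.
# Perl accepts this silently and treats each byte as a character.
my $twice = encode('UTF-8', $once);

printf "decoded:       %d character(s)\n", length $chars;
printf "encoded once:  %d byte(s)\n",      length $once;
printf "encoded twice: %d byte(s)\n",      length $twice;
```

The double-encoded result is four bytes (C3 83 C2 A9), which typically shows up on screen as "Ã©" instead of "é". Nothing warns you about it, which is why I say the bookkeeping falls on the programmer.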

Blessings,

~Polyglot~

Re^9: Converting Unicode
by LanX (Saint) on Dec 04, 2023 at 21:36 UTC
    Expecting the programming language to magically default HTML or relational databases to UTF-8 is quite a stretch.

    It's like expecting human programmers to default to the octal system automatically in order to avoid future rounding errors with floats.

    It's just outside the realm of the programming language.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      No, I don't think it's a stretch. It will happen in due time. Soon everyone will be working with UTF8 by default, or some near equivalent of it (utf8mb4?). I think it might already be the standard if it weren't resisted by those slow to adopt it.

      Blessings,

      ~Polyglot~

        > with UTF8 by default, or some near equivalent of it (utf8mb4?)

        And you are contradicting yourself by proposing two defaults at the same time.

        So five years from now, someone like you will complain that the wrong one was chosen.

        Neither HTML nor table/db/schema encodings are part of the language, and magically changing their settings/headers would require more than just new modules and procedures.

        (Not sure if CGI.pm is even actively developed anymore.)

        It would also mean magically overriding the configs of the Web/DB servers.

        > I don't think it's a stretch

        Well, who else thinks like you?

        Which language does it the way you propose?

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        see Wikisyntax for the Monastery