That makes sense, thanks!

$reply->{'response'} = decode('UTF-8', $data{'userChat'}); seems to have done the trick on the test script...

So that's one problem solved. It seems I'm now getting encoding problems from AI::Chat, but only when I call the chat method, not when I call the prompt method. But that doesn't make a lot of sense as prompt uses chat...

I'll have to try and simplify the code and see if I can reproduce it!

Update:

This is the sort of thing I'm getting back from AI::Chat:

{response: 'Ã\x83Â\x96zÃ\x83¼r dilerim, belki de sorumu yanlÃ\x84±Ã\x85Â\x9F s…rirken en keyif aldÃ\x84±Ã\x84Â\x9FÃ\x84±nÃ\x84±z Ã\x85Â\x9Fey ne?\n'}

Another Update:

{ correction: 'Turkce alfabe oldukca turaf.\n\nThe correct sentence should be: "Türkçe alfabesi oldukça tuhaf."\n\nExplanation:\n1. The word "Türkçe" is not capitalized, it should be as it's a proper noun.\n2. The word "alfabe" is also missing its possessive suffix, it should be "alfabesi" to show that it belongs to Turkish language.\n3. The word "turaf" is not a word in Turkish. The correct word meaning "strange" or "weird" is "tuhaf".', response: 'Evet, Türk alfabesi Latin alfabesine dayanır ve 29 harften oluşur. Her harfin belirli bir sesi temsil ettiğini biliyor muydun?' }

The correction comes from the prompt method and its characters display correctly, whereas the response comes from chat and is unreadable...
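
For anyone trying to reproduce this: the garbled response has the look of UTF-8 that was encoded more than once somewhere along the way. Below is a minimal sketch of that effect, assuming the extra layers are plain UTF-8 encodes. It says nothing about where in AI::Chat (or in the calling code) this actually happens. The helper peel_utf8_layers is a hypothetical name used for illustration, not part of Encode or AI::Chat.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "Türkçe" as a Perl character string (code points, not bytes).
my $text = "T\x{FC}rk\x{E7}e";

# Correct: encode once, when the data leaves the program.
my $once = encode('UTF-8', $text);

# Double encoding: the UTF-8 bytes are mistaken for characters
# and encoded a second time.
my $twice = encode('UTF-8', $once);

# Hypothetical repair helper: peel off one UTF-8 layer per pass.
sub peel_utf8_layers {
    my ($bytes, $layers) = @_;
    my $str = $bytes;
    $str = decode('UTF-8', $str) for 1 .. $layers;
    return $str;
}

printf "once:  %d bytes\n", length $once;
printf "twice: %d bytes\n", length $twice;
print  "repaired ok\n" if peel_utf8_layers($twice, 2) eq $text;
```

Counting how many decode passes it takes to get readable text back can hint at how many stray encode steps there are between the API response and the output.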

Re^5: Yet another Encoding issue...
by Danny (Chaplain) on Jun 01, 2024 at 22:02 UTC
    By the way, depending on which charset you specify in your HTML, you may get problems. For example, this little CGI script:
    #!/bin/bash
    echo "Content-Type: text/html"
    echo ""
    perl -we 'use Encode; $c = encode("UTF-8", "é"); $dc = decode("UTF-8", $c); print "\$c = $c \$dc = $dc\n"'
    displays as: $c = é $dc = é

    But if you fix the encoding like:

    #!/bin/bash
    echo "Content-Type: text/html; charset=UTF-8"
    echo ""
    perl -we 'use Encode; $c = encode("UTF-8", "é"); $dc = decode("UTF-8", $c); print "\$c = $c \$dc = $dc\n"'
    it displays as: $c = é $dc = é
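
    To see what the browser is actually being asked to interpret, it can help to dump the bytes rather than look at the rendered page. This is a rough sketch, assuming the one-liner's source is UTF-8 and that there is no "use utf8", so the literal é reaches Perl as the two bytes C3 A9. The hex_dump helper is a made-up name for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Without "use utf8", a literal é in UTF-8 source is the two bytes C3 A9.
my $literal = "\xC3\xA9";

# encode() treats each of those bytes as a Latin-1 character.
my $c  = encode('UTF-8', $literal);
# decode() undoes that one layer, giving back the two original values.
my $dc = decode('UTF-8', $c);

# Hypothetical helper: show a string as space-separated hex code points.
sub hex_dump {
    my ($str) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $str;
}

print '$c  = ', hex_dump($c),  "\n";
print '$dc = ', hex_dump($dc), "\n";
```

    The same bytes go out in both versions of the CGI script. Only the declared charset changes, and with it how the browser renders them.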

      I'm using UTF-8

      Content-type: text/html; charset=UTF-8
      <html>
      <meta charset="UTF-8">