Hello, monks! I'm seeking your wisdom!

I'm getting strings from various SQL servers via DBI, and they are supposed to be UTF-8 Russian strings. Strings from PgSQL are fine. Strings from Sybase look fine in the console, but when printed to the browser via CGI they turn into ??????.

The core of the program is this:

my $dbh = DBI->connect($db{$dsn}{dsn}, $db{$dsn}{user}, $db{$dsn}{password}, $db{$dsn}{opts});
if (!defined($dbh)) {
    print "Error creating dbh: " . $DBI::errstr . "\n";
    exit;
}
my $sth;
$sth = $dbh->prepare($query);
if (!$sth) {
    print "Error: " . $dbh->errstr . "\n";
    exit;
}
if (!$sth->execute) {
    print "Error: " . $sth->errstr . "\n";
    exit;
}
print "Content-Type: text/html; charset=utf-8\n\n";
my $ref = $sth->fetchrow_arrayref;
my $str = $$ref[0];
print $db{$dsn}{driver} . " " . $str . " > " . join(" ", map { sprintf("0x%X", $_) } unpack("C*", $str)) . "\n";

In the console, the strings and bytes are identical for both drivers:

# ./sql_test
Content-Type: text/html; charset=utf-8

Sybase школы#Касса > 0xD1 0x88 0xD0 0xBA 0xD0 0xBE 0xD0 0xBB 0xD1 0x8B 0x23 0xD0 0x9A 0xD0 0xB0 0xD1 0x81 0xD1 0x81 0xD0 0xB0
# ./sql_test
Content-Type: text/html; charset=utf-8

Pg школы#Касса > 0xD1 0x88 0xD0 0xBA 0xD0 0xBE 0xD0 0xBB 0xD1 0x8B 0x23 0xD0 0x9A 0xD0 0xB0 0xD1 0x81 0xD1 0x81 0xD0 0xB0

But in the browser:

Sybase ?????#????? > 0x3F 0x3F 0x3F 0x3F 0x3F 0x23 0x3F 0x3F 0x3F 0x3F 0x3F
Pg школы#Касса > 0xD1 0x88 0xD0 0xBA 0xD0 0xBE 0xD0 0xBB 0xD1 0x8B 0x23 0xD0 0x9A 0xD0 0xB0 0xD1 0x81 0xD1 0x81 0xD0 0xB0

So it's not just a browser encoding glitch: the contents of $str itself are different (every Cyrillic character has become a single 0x3F byte)!

WHY?
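
Since the same script produces different bytes depending on how it is launched, one thing worth comparing between the two runs is the process environment. The following is only a minimal diagnostic sketch. It assumes (nothing above confirms this) that the Sybase client side may take its character set from locale variables, and `locale_report` is a hypothetical helper name, not part of DBI or any driver.

```perl
use strict;
use warnings;

# Hypothetical helper: format the locale-related variables from a hash
# of environment settings, marking the ones that are not set.
sub locale_report {
    my %env = @_;
    return join "", map {
        sprintf "%s=%s\n", $_, defined $env{$_} ? $env{$_} : "(unset)"
    } qw(LANG LC_ALL LC_CTYPE);
}

# Emit a CGI header so the same script can be run in both places.
print "Content-Type: text/plain; charset=utf-8\n\n";
print locale_report(%ENV);
```

Running this once from the shell and once through the web server shows whether the two environments differ. It does not by itself prove that the locale is the cause.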

UPD: The solution is here: 1197669.

In reply to [SOLVED] same utf8 string is different in console and in browser (Sybase) by alexander_lunev
