in reply to Re^3: DBIx returns question marks
in thread DBIx/DBI returns question marks

OK, so I checked the locales and they are the same. I've set the script to write into a text file. On the dev machine I got correct Chinese characters, but on staging I got question marks again.

The interesting part is that Spanish and German special characters work on both machines.

Re^5: DBIx returns question marks
by Corion (Patriarch) on Aug 07, 2011 at 14:39 UTC

    Have you checked the hexdump? There is a reason why I am pointing you so hard towards the hexdump. You need to eliminate your ssh session, your terminal program and your terminal settings from the equation. It might be that your terminal displays question marks for one machine because the connection or the environment is set up differently. So, please, do not look at "the question marks" but do look at the output bytes you get.
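
    If no hexdump tool is at hand on one of the machines, the bytes can also be dumped from Perl itself. The following is a minimal sketch, not code from the original script: the `hexdump` helper and the sample string are assumptions made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex pairs,
# so that no terminal setting can change what you see.
sub hexdump {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

my $chars = "\x{4e2d}\x{6587}";        # two Chinese characters, as a character string
my $bytes = encode('UTF-8', $chars);   # the octets a correct UTF-8 writer would emit

print hexdump($bytes), "\n";           # e4 b8 ad e6 96 87
print hexdump("??"), "\n";             # 3f 3f
```

    If the file written on staging really contains `3f` bytes, the question marks are in the data and not an artifact of the terminal.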

    If the output bytes differ, then the difference arises earlier along the path the data takes through your systems. Work your way back from the output towards the input:

    1. Check locales and terminal encoding setup. Do they match between machines? Are they properly set up to be UTF-8? You've done that.
    2. Check the data format in your script as it writes to its output. Is it properly encoded from Unicode to the target encoding? Compare the hexdumps.
    3. Check the data format in your script as it reads the data from the database. Is it utf8 everywhere? Compare the hexdumps.
    4. Check the data format in your database in all tables. Are the tables/columns declared to be utf8/Unicode everywhere? Compare the hexdumps.
    5. Check the data format when it is written from your script to the database. Do they write utf8/Unicode everywhere? Compare the hexdumps.
    6. Check the data format from where your script gets its data. Is it properly decoded from the source to utf8 everywhere? Compare the hexdumps.

    You can work your way through this list in either direction, but you have to inspect every data transition where one system hands off data to the next. Displaying raw bytes on the console is such a transition too, so always inspect hexdumps instead of what the console shows.
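
    As a sketch of what comparing hexdumps at a single hand-off can reveal, the snippet below encodes one German and one Chinese character to two target encodings. The sample characters, the choice of ISO-8859-1 as the "wrong" encoding and the `hex_bytes` helper are assumptions for illustration only. This is not a diagnosis of the staging machine, just one way a transition can turn some characters into question marks while leaving others intact.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: show bytes as hex pairs.
sub hex_bytes { join ' ', unpack '(H2)*', $_[0] }

my %sample = (
    german  => "\x{fc}",      # U+00FC, u with diaeresis
    chinese => "\x{4e2d}",    # U+4E2D
);

for my $name (sort keys %sample) {
    # Same character string, two different output encodings.
    my $utf8   = encode('UTF-8',      $sample{$name});
    my $latin1 = encode('ISO-8859-1', $sample{$name});
    printf "%-8s UTF-8: %-9s ISO-8859-1: %s\n",
        $name, hex_bytes($utf8), hex_bytes($latin1);
}
```

    With Encode's default settings, a character the target encoding cannot represent is replaced by a substitution character, which for ISO-8859-1 is `?` (byte `3f`). The German character survives in both encodings, the Chinese one only in UTF-8.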

Re^5: DBIx returns question marks
by Anonymous Monk on Aug 07, 2011 at 14:44 UTC
    In addition to Corion's note: instead of plain Dumper, try DDS (Data::Dump::Streamer), or try
    use Data::Dumper;
    sub DD { Data::Dumper->new([@_])->Useqq(1)->Dump; }
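
    A short usage sketch of that helper, assuming a made-up sample string rather than anything fetched from the database: with Useqq enabled, the dump contains only escaped ASCII, so a decoded character string and its UTF-8 encoded octets can be told apart regardless of the terminal.

```perl
use strict;
use warnings;
use Data::Dumper;
use Encode qw(encode);

# The helper from the post: dump with Useqq so non-ASCII data is escaped.
sub DD { Data::Dumper->new([@_])->Useqq(1)->Dump; }

my $chars = "\x{4e2d}\x{6587}";        # decoded character string (sample data)
my $bytes = encode('UTF-8', $chars);   # the same text as UTF-8 octets

print DD($chars);   # wide characters appear as \x{...} escapes
print DD($bytes);   # octets appear as escaped byte values
```

    Running this on the value returned by the database on both machines shows whether the script receives characters, encoded bytes, or literal question marks.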