in reply to Re: DBIx returns question marks
in thread DBIx/DBI returns question marks

I've just tried connecting the staging machine to the dev DB and it didn't work. "?????" all over the place.

Re^3: DBIx returns question marks
by Corion (Patriarch) on Aug 07, 2011 at 11:26 UTC

    You never explicitly encode your output when you print it, so my guess is that you either have different locales on the machines, or use different terminal settings between the two machines. You can check whether the output of your program is good by hexdumping the output instead of printing raw bytes:

    perl -w myprogram | od -x

    If the output is the same between the two environments, then the problem is in your terminal settings, installed fonts or something like that. If the output is different, the problem happens somewhere earlier, maybe when writing to the database, reading from the database or outputting the data from your program.

      OK, so I checked the locales and they are the same. I've set the script to write into a text file. On the dev machine I got correct Chinese characters, but on staging I got question marks again.

      The interesting part is that Spanish and German special characters work on both machines.

        Have you checked the hexdump? There is a reason why I am pointing you so hard towards the hexdump. You need to eliminate your ssh session, your terminal program and your terminal settings from the equation. It might be that your terminal displays question marks for one machine because the connection or the environment is set up differently. So, please, do not look at "the question marks" but do look at the output bytes you get.
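If you would rather look at the bytes from inside Perl than pipe through od, something along these lines might do. It is only a minimal sketch: hex_bytes is a made-up helper name, and it assumes the text is meant to go out as UTF-8. A real question mark is the byte 3f, while a Chinese character in UTF-8 is a three-byte sequence, so the two are easy to tell apart in the dump.

```perl
use strict;
use warnings;
use Encode qw(encode);

# hex_bytes is a hypothetical helper, not part of your program:
# it encodes a character string as UTF-8 and returns the bytes as hex pairs.
sub hex_bytes {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    return join ' ', unpack '(H2)*', $bytes;
}

# Two Chinese characters (U+4E2D, U+6587) versus two literal question marks.
print hex_bytes("\x{4e2d}\x{6587}"), "\n";   # e4 b8 ad e6 96 87
print hex_bytes('??'), "\n";                 # 3f 3f
```

If the file written on staging really contains 3f bytes where dev has e4/e6 sequences, the characters were lost before the output stage and the terminal is not to blame.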

        If the output bytes are different, then the difference lies earlier on the path the data takes through your systems. Work your way back towards the input:

        1. Check locales and terminal encoding setup. Do they match between machines? Are they properly set up to be UTF-8? You've done that.
        2. Check the data format in your script as it writes to its output. Is it properly encoded from Unicode to the target encoding? Compare the hexdumps.
        3. Check the data format in your script as it reads the data from the database. Is it utf8 everywhere? Compare the hexdumps.
        4. Check the data format in your database in all tables. Are the tables/columns declared to be utf8/Unicode everywhere? Compare the hexdumps.
        5. Check the data format when it is written from your script to the database. Do they write utf8/Unicode everywhere? Compare the hexdumps.
        6. Check the data format from where your script gets its data. Is it properly decoded from the source to utf8 everywhere? Compare the hexdumps.

        You can work your way through this list in either direction, but you have to inspect every data transition where one system hands off data to the next. Displaying raw bytes to the console is such a transition, so always inspect hexdumps instead of raw bytes.
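To show what inspecting one hand-off can look like, here is a small sketch using the core Encode module. The sample data and the hex_of helper are invented for illustration, and the ISO-8859-1 branch is only there as an example of a lossy conversion; I am not claiming that this is what happens on your staging machine.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper for this sketch: bytes as space-separated hex pairs.
sub hex_of { return join ' ', unpack '(H2)*', $_[0] }

# Pretend these UTF-8 bytes arrived from an earlier stage:
# "n with tilde" (U+00F1) followed by a Chinese character (U+4E2D).
my $incoming = "\xc3\xb1\xe4\xb8\xad";

# Hand-off in: bytes -> characters.
my $chars = decode('UTF-8', $incoming);

# Hand-off out: characters -> bytes, once per target encoding.
my $as_utf8   = encode('UTF-8', $chars);
my $as_latin1 = encode('ISO-8859-1', $chars);

print hex_of($as_utf8), "\n";     # c3 b1 e4 b8 ad
print hex_of($as_latin1), "\n";   # f1 3f
```

With its default settings, encode substitutes "?" (3f) for any character the target encoding cannot represent, so a single-byte Western encoding keeps the Spanish character and silently replaces the Chinese one. Comparing the hex at each step shows at which hand-off the bytes change.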

        In addition to Corion's note: instead of Dumper, try DDS, or try this small wrapper (it needs Data::Dumper loaded):
        sub DD { Data::Dumper->new([@_])->Useqq(1)->Dump; }
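A possible way to use that wrapper, as a rough sketch with made-up sample data: dump the same text once as a character string and once as UTF-8 bytes. With Useqq set, Data::Dumper escapes everything outside printable ASCII, so the dump does not depend on how your terminal renders it. The exact escape style can vary between Data::Dumper versions, so no expected output is shown here.

```perl
use strict;
use warnings;
use Data::Dumper;
use Encode qw(encode);

# Same wrapper as above, repeated so the sketch runs on its own.
sub DD { Data::Dumper->new([@_])->Useqq(1)->Dump; }

my $chars = "\x{4e2d}\x{6587}";        # character string (two code points)
my $bytes = encode('UTF-8', $chars);   # byte string (six bytes)

print DD($chars);   # escaped, printable ASCII only
print DD($bytes);   # escaped differently, so the two are easy to tell apart
```

Dumping a value right after it comes back from the database tells you whether you hold decoded characters or raw bytes at that point.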