Thanks for that Graff.

I gave that a run and it worked fine, but I think that's because the UTF-8 data is written directly into the script. It was the end of a long day when I wrote that last night, so I wasn't really explaining it too well.

If I actually hard code the phrase into the script:

$blah->execute($country, 'Kärnten');

the execute works fine. I only get the problem with a variable containing UTF8.

Now, my understanding is that, with MySQL expecting UTF8 (which it's happy with in every other script), I should be able to just make sure $province is UTF8:

if (!utf8::is_utf8($province)) { utf8::decode($province); }

I've been trying a few more things this morning. I'm printing output to a browser (so I know exactly which character set everything is in). When the browser is in UTF-8 mode, $province displays correctly, but utf8::is_utf8($province) is false. WTF? I am 100% sure $province is UTF8. It was taken from the UTF8 database as UTF8. It displays in a UTF8 browser as UTF8. But utf8::is_utf8() says it's not.

If, however, I fudge the variable so that it contains a character which does not exist in 8859-1, everything starts working normally again.

'Kärntenš' with the funny 's' character on the end reports as UTF8 from is_utf8, displays as utf8 and goes into the database without being truncated, with no other changes to the code.
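
One possible reading of this: utf8::is_utf8() reports how Perl is storing the string internally, not whether the text "is UTF-8". The following is a minimal sketch under that assumption, using only the core utf8:: functions; the variable names are made up for illustration and are not taken from the original script.

```perl
use strict;
use warnings;

# The same seven characters in Perl's two internal representations.
my $flagged = "K\xE4rnten";
utf8::upgrade($flagged);        # stored internally as UTF-8, flag on

my $unflagged = "K\xE4rnten";
utf8::downgrade($unflagged);    # stored one byte per character, flag off

# Perl compares characters, not internal bytes, so these are equal.
my $same_text = $flagged eq $unflagged ? 1 : 0;

# U+0161 (the 's' with a caron) does not fit in one byte, so this string
# can only be stored with the flag on. The second argument (FAIL_OK) makes
# downgrade() return false instead of dying.
my $with_caron      = "K\xE4rnten\x{161}";
my $could_downgrade = utf8::downgrade($with_caron, 1) ? 1 : 0;

my $flag_flagged    = utf8::is_utf8($flagged)    ? 1 : 0;
my $flag_unflagged  = utf8::is_utf8($unflagged)  ? 1 : 0;
my $flag_with_caron = utf8::is_utf8($with_caron) ? 1 : 0;

printf "same text:       %d\n", $same_text;
printf "flags:           %d, %d, %d\n",
    $flag_flagged, $flag_unflagged, $flag_with_caron;
printf "could downgrade: %d\n", $could_downgrade;
```

A string whose characters all exist in 8859-1 can live in either representation, so the flag alone says little about it. Once a character above U+00FF is present, only the flagged form is possible.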

Can anybody enlighten me as to what this means?

This is driving me insane and I'm going round and round in circles. I thought I'd understood all this Perl UTF8 stuff because everything seemed to be working fine, and it still does if I put the variables straight into the SQL passed to prepare(), but just not when they're sent to DBI::execute() as arguments. ARGH!

I'm going to take a break for the sake of my sanity :-)

Re^3: UTF8 execute() parameters (DBI)
by MattLG (Beadle) on Apr 21, 2009 at 10:11 UTC

    Woohoo! I think I've found it! I'm still testing a few things but so far it seems to be working.

    I noticed that the variable containing the UTF chars was decoded using utf8::decode() without first checking it with utf8::is_utf8(). It turns out that it was already decode()ed, so I was in fact double decode()ing. This seems to just remove the utf8 attribute and somehow put it in a very weird sometimes UTF8/sometimes not UTF8 state.

    Anyway, I've corrected this now and it seems to be working.

    What *should* happen to a double decode()ed string? I would have thought that nothing would happen, but clearly that's not the case. I guess this is just a usage error.
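
A small sketch that appears to reproduce the symptom using only the core utf8:: functions. The helper name decode_twice and the hard-coded byte strings are hypothetical, chosen for illustration. The behaviour it shows is what the perl implementations I'm aware of do on a second decode(), not something the documentation promises in detail, so treat it as an observation rather than a guarantee.

```perl
use strict;
use warnings;

# Hypothetical helper: run utf8::decode() twice on a copy of the given
# octets and record the return value and the flag after each call.
sub decode_twice {
    my ($octets) = @_;
    my $s = $octets;

    my $first_ok   = utf8::decode($s)  ? 1 : 0;
    my $first_flag = utf8::is_utf8($s) ? 1 : 0;

    # On an already decoded string, decode() first tries to downgrade it
    # back to bytes, then checks whether those bytes are valid UTF-8.
    my $second_ok   = utf8::decode($s)  ? 1 : 0;
    my $second_flag = utf8::is_utf8($s) ? 1 : 0;

    return {
        first_ok    => $first_ok,
        first_flag  => $first_flag,
        second_ok   => $second_ok,
        second_flag => $second_flag,
        length      => length $s,
    };
}

# UTF-8 octets for the two test strings from this thread.
my $plain = decode_twice("K\xC3\xA4rnten");           # only 8859-1 characters
my $caron = decode_twice("K\xC3\xA4rnten\xC5\xA1");   # ends in U+0161

for my $r ($plain, $caron) {
    printf "1st ok=%d flag=%d | 2nd ok=%d flag=%d | length=%d\n",
        @{$r}{qw(first_ok first_flag second_ok second_flag length)};
}
```

In the first case the second decode() fails but leaves the string downgraded, so the flag is gone while the characters are unchanged. In the second case the downgrade is impossible, so the string is left exactly as it was. That would match the "sometimes UTF8/sometimes not" state. Checking the return value of utf8::decode(), or decoding in exactly one place, avoids it.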

    Any ideas why a double decode()ed string works fine where prepare()ed, but not when sent as an execute() argument?

    Cheers guys. I really appreciate your masterful help!

    MattLG