ait has asked for the wisdom of the Perl Monks concerning the following question:
Pardon my ignorance on some of these internals, but I am going crazy transferring data between 3 different databases in different charsets, through different layers (ssh, file transfers, direct SQL, etc.).
Context: I am trying to figure out why PHP::Serialization reports 30 as the string length of a 26-character string. When I serialize to the PHP array I get this:
s:30:"Triple “S” Industrial Corp"
So I am trying to figure out whether the bug is in PHP::Serialization or somewhere else in this crazy 3-system interface. The PHP on the target server is 7.2.10, so I am assuming it supports these UTF characters without issue. But what seems strange to me is that Perl and PHP would both internally report 30 as the character length. So before I dive into that module's code to understand what it's doing, I want to first understand how Perl stores this string internally.
So given this string: Triple “S” Industrial Corp (note funky quotes), this is the Dump:
SV = PV(0x5584829062e0) at 0x558482ad2ee0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x558482b75b30 "Triple \342\200\234S\342\200\235 Industrial Corp"\0
  CUR = 30
  LEN = 56
  COW_REFCNT = 0
What are the characters \342\200\234 (the left funky quote)?
How would I manually decode them if I wanted to? (i.e. is this a UTF-8 sequence, and how do I know what it means?)
Is this why CUR reports 30 "perl characters" instead of 26 actual characters?
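In case it helps anyone reproduce this, here is a minimal sketch of how I would test my own guess. It assumes (and this is only my assumption, not something I have confirmed) that the octal escapes in the Dump are UTF-8 encoded bytes. The variable names are just for illustration. It compares the length before and after decoding with the core Encode module, then works out the code point of the first three-byte sequence by hand.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The same bytes that Dump shows in the PV slot, written as octal escapes.
my $bytes = "Triple \342\200\234S\342\200\235 Industrial Corp";

# Assumption: the bytes are UTF-8. FB_CROAK makes decode() die if they are not;
# LEAVE_SRC keeps $bytes unmodified.
my $chars = decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);

printf "length as bytes:      %d\n", length $bytes;   # 30
printf "length as characters: %d\n", length $chars;   # 26

# Manual decode of \342\200\234 (0xE2 0x80 0x9C), treated as a 3-byte
# UTF-8 sequence of the form 1110xxxx 10xxxxxx 10xxxxxx.
my @b  = (0342, 0200, 0234);
my $cp = (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
printf "first sequence:       U+%04X\n", $cp;         # U+201C
```

If that assumption holds, the undecoded string is 30 bytes long and the decoded one is 26 characters long, which would match the CUR value above.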