When you say you get 8 bytes where there should be 4, it would be helpful to know what the expected 4-byte sequence is supposed to be, and what the actual 8-byte sequence really is. Can you provide that?

Also, is it the case that you are sending only 4 bytes, and getting 8 bytes as a result? To put it another way, when you send N bytes, are you getting N+4, or N*2 (for any/every value of N)?

The only thing that comes to mind in the absence of any hard evidence is that, just maybe, you are using some other output method (such as print) in addition to syswrite when sending data to that socket, and things are getting messed up by the mix of buffered and unbuffered output.
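To illustrate the buffered/unbuffered mismatch, here is a minimal sketch in Python rather than Perl, using a pipe in place of your socket. The function name mixed_write_order is made up for this example; the buffered file object plays the role of print, and os.write plays the role of syswrite. It is only an analogy for the effect, not a reproduction of your code.

```python
import os


def mixed_write_order() -> bytes:
    """Write to one descriptor through a buffered and an unbuffered path."""
    read_fd, write_fd = os.pipe()

    # Buffered writer on the write end (analogous to Perl's print).
    buffered = os.fdopen(write_fd, "wb")

    buffered.write(b"AAAA")      # sits in the user-space buffer for now
    os.write(write_fd, b"BBBB")  # goes straight to the descriptor (like syswrite)
    buffered.flush()             # only now does "AAAA" reach the pipe

    data = os.read(read_fd, 8)

    buffered.close()             # also closes write_fd
    os.close(read_fd)
    return data


if __name__ == "__main__":
    # The bytes arrive in the order they were flushed, not the order written.
    print(mixed_write_order())
```

The "BBBB" chunk arrives first even though it was written second. That explains reordered or interleaved data, though not by itself a doubling of the byte count.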

Update: Having just looked at the man page for Frontier::Client (thanks, ikegami), I'm very curious about how that module behaves under Perl 5.6, and it would be relevant to know how you are calling its "new" method: are you specifying the "encoding" property, and if so, how? There may actually be something going on with this module's use of expat in your environment...

I gather you aren't using Encode (which was not available prior to 5.8.0, IIRC), but it's still possible that whatever is "enforcing" the utf8 encoding is erring by treating each byte of a 2-byte utf8 character as if it were a single-byte ISO-8859-1 character, and "upgrading" it to utf8 a second time; the result of this process is to turn, e.g., "\xC3\x81" (utf8 for Á) into "\xC3\x83\xC2\x81" -- this is a common mistake that people make with 5.8.x. In that case, the output would still be valid utf8, but it would read like gibberish and would probably include a lot of non-displayable control characters from the U+0080 to U+009F range.

(did another update to make some improvements in the preceding paragraph)
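The double-encoding effect described above can be shown at the byte level. This is a minimal sketch in Python rather than Perl, and the helper names (double_encode, hexdump) are invented for the illustration; it is not a claim about what Frontier::Client or expat actually does in your setup.

```python
def double_encode(utf8_bytes: bytes) -> bytes:
    """Misread UTF-8 bytes as ISO-8859-1, then encode to UTF-8 again."""
    return utf8_bytes.decode("latin-1").encode("utf-8")


def hexdump(data: bytes) -> str:
    """Render bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# Two accented characters (A-acute, E-acute), 2 bytes each in UTF-8.
original = "\u00C1\u00C9".encode("utf-8")
mangled = double_encode(original)

print(len(original), "bytes:", hexdump(original))
print(len(mangled), "bytes:", hexdump(mangled))

# The mangled bytes are still valid UTF-8, and the damage is reversible.
repaired = mangled.decode("utf-8").encode("latin-1")
print("repaired matches original:", repaired == original)
```

Here 4 bytes of utf8 become 8, which would match a "send N, get N*2" pattern when the payload consists entirely of 2-byte characters. Plain ASCII bytes pass through unchanged, so a mixed payload would grow by less than a factor of two.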


In reply to Re: Writing utf-8 to a socket with syswrite in perl 5.6 by graff
in thread Writing utf-8 to a socket with syswrite in perl 5.6 by Anonymous Monk
