Well, since you asked...

I'm using it to debug a problem. I suspect using utf8::downgrade is the solution, but I want to verify that.

Specifically, I am using LWP to make https connections and LWP is calling Crypt::SSLeay. I'm trying to track down a problem which I think is being caused by the header having the UTF8 flag set (even though it contains only ASCII characters).

The version of LWP I'm using is either very ancient or has been hacked -- it writes the header separately from the body. Newer versions of LWP concatenate the header and body before writing out the request. In my case this would immediately cause corruption when the body contained non-ASCII byte values.

However, I'm only seeing corruption some of the time. My suspicion is that the corruption is occurring somewhere in the bowels of Crypt::SSLeay -- something like:

1. LWP syswrites out the header (with the UTF8 flag set).
2. Crypt::SSLeay writes out the encrypted header.
3. LWP syswrites out the body.
4. Crypt::SSLeay writes out the encrypted body, but perhaps an internal buffer that it uses has the UTF8 flag set, which corrupts the body.

The corruption is consistent with bytes with the high-bit set getting converted to their UTF8 encoding.
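The pattern can be reproduced in isolation. This is only a minimal sketch: the header and the two-byte body are made up for illustration, and it shows the general Perl behaviour, not what Crypt::SSLeay actually does internally. Once a string carrying the UTF8 flag is concatenated with a byte string, each high-bit byte takes two bytes in the internal buffer, which is what code reading that buffer without checking the flag would see.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length only; byte semantics are NOT enabled

# Hypothetical header and body, for illustration only.
my $header = "Content-Length: 2\r\n\r\n";
my $body   = "\xE9\xFF";         # two bytes, both with the high bit set

utf8::upgrade($header);          # set the UTF8 flag; content is still pure ASCII

my $joined = $header . $body;    # concatenation upgrades the body as well

# length() counts characters; bytes::length() counts bytes in the internal buffer.
printf "characters:     %d\n", length $joined;
printf "internal bytes: %d\n", bytes::length($joined);
```

The two counts differ by two, one extra byte for each high-bit byte in the body. Neither call needs utf8::is_utf8, so this should also work on 5.8.0.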

I've noticed that the most recent version of LWP performs a downgrade of the header to ensure its internal representation is bytes. Guess there must have been a good reason for that change...
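For what it's worth, here is a rough sketch of what such a downgrade looks like. It is not LWP's actual code; the header and body are hypothetical. The second argument to utf8::downgrade makes it return false instead of dying if the string cannot be represented as bytes.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length only; byte semantics are NOT enabled

# Hypothetical header that picked up the UTF8 flag somewhere upstream.
my $header = "Host: example.com\r\n\r\n";
utf8::upgrade($header);

# With a true second argument, downgrade returns false rather than dying
# when the string holds characters above 0xFF.
utf8::downgrade($header, 1)
    or die "header contains wide characters and cannot be sent as bytes";

my $body    = "\xE9\xFF";         # hypothetical binary body
my $request = $header . $body;    # both operands are byte strings now

# Character count and internal byte count agree, so nothing was expanded.
printf "characters:     %d\n", length $request;
printf "internal bytes: %d\n", bytes::length($request);
```

With the header downgraded first, the body's high-bit bytes stay as single bytes after concatenation.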


In reply to Re^4: good way to implement utf8::is_utf8 for perl 5.8.0 by perl5ever
in thread good way to implement utf8::is_utf8 for perl 5.8.0 by perl5ever
