Does anyone have experience working out which character encoding is needed when saving pages returned by LWP::Curl?

I.e., let's say, for simplicity's sake, that I want to be able to specify a URL, have it fetched into memory, and then save it to a file.

I'm running into a bit of a dilemma: treating the contents as UTF-8 works fine for the pages I'm fetching (which happen to be XHTML, with its standard UTF-8 encoding), but it definitely doesn't work when I save binary files.

When I tried treating everything as binary instead, that didn't work either: I ended up with weird diamond-shaped marks where quotes should be (a 'feature', I assume, of UTF-8 being misinterpreted as a Western encoding).
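For reference, here is a minimal sketch of what I mean by "saving as binary". It assumes the fetched content is still a string of undecoded octets, and it writes those octets through a `:raw` filehandle so that no encoding layer touches them. The helper name `save_raw` is made up for illustration; it is not part of LWP::Curl.

```perl
use strict;
use warnings;

# Hypothetical helper: write $bytes to $path exactly as received.
# Assumes $bytes holds undecoded octets (no characters above 0xFF).
sub save_raw {
    my ($path, $bytes) = @_;

    # ':raw' disables any encoding or CRLF translation layer on output.
    open my $fh, '>:raw', $path or die "Cannot open $path: $!";
    print {$fh} $bytes          or die "Cannot write $path: $!";
    close $fh                   or die "Cannot close $path: $!";

    return length $bytes;       # number of octets written
}
```

If the octets arrive on disk unchanged, a UTF-8 page should stay valid UTF-8 and an image should stay a valid image, so any remaining odd characters would presumably come from whatever reads the file back with the wrong encoding.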

Unfortunately, I can't tell the type from the file name, as some URLs are simply "site/get?item=xxx", where xxx could return either text or an image.
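One idea I'm considering is to decide from the response's Content-Type header instead of the name. The sketch below assumes I can get that header value as a plain string (how to obtain it from LWP::Curl is not shown here), and `text_charset` is a hypothetical helper name. Defaulting to UTF-8 when no charset is given is also just my assumption.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: given a Content-Type header value, return the
# charset to decode with, or nothing if the body should be kept as binary.
sub text_charset {
    my ($content_type) = @_;
    return unless defined $content_type;

    # Only treat obviously textual media types as text.
    return unless $content_type
        =~ m{^\s*(?:text/|application/(?:xhtml\+xml|xml|json)\b)}i;

    # Pull out an explicit charset parameter, if there is one.
    my ($charset) = $content_type =~ /;\s*charset\s*=\s*"?([^\s";]+)/i;
    $charset = 'UTF-8' unless defined $charset;   # assumption: default

    # Unknown charset names fall back to "binary" rather than guessing.
    return find_encoding($charset) ? $charset : ();
}
```

With something like this, a body would only be passed through `Encode::decode` when `text_charset` returns a name, and everything else would be saved byte for byte.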

I'm not at my wits' end on this yet, but while trying to trim down some verbose output I hit another problem, which I've already posted a question about here. So while waiting for ideas on that, I thought I'd pick people's brains a bit rather than just bash my head against the documentation and various functions until I have either a breakthrough or a headache and THEN end up here (i.e., with some insight, I might save myself some time!) :-)

Thanks...


In reply to LWP::Curl and character encoding by perl-diddler
