It sounds to me like some of your strings may not have been decoded properly when you loaded them into Perl. Note that you can still provide an SSCCE: at the very least, inspect your strings (and post them here) using Data::Dumper with $Data::Dumper::Useqq=1; or with Data::Dump. Even better, use hexdump or od to show your input files, and Devel::Peek for the strings; I gave an example here. As for posting on PerlMonks, you can post Unicode as long as you put it in <pre> instead of <code> tags (you'll have to escape <, >, and & manually though).
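To illustrate the kind of inspection I mean, here is a minimal sketch. The sample string is made up (an "a", a U+200B zero-width space, and a "b"), so substitute whatever you actually read from your files. It dumps the same data once as undecoded bytes and once as decoded characters.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Data::Dumper;
use Devel::Peek;

$Data::Dumper::Useqq = 1;   # show non-printable and non-ASCII data as escapes
$Data::Dumper::Terse = 1;   # omit the "$VAR1 = " prefix

# Hypothetical input: "a", U+200B ZERO WIDTH SPACE, "b", as raw UTF-8 bytes,
# i.e. what you get when reading a file without an encoding layer.
my $bytes = "a\xE2\x80\x8Bb";

# The same data decoded into Perl characters.
my $chars = decode('UTF-8', $bytes);

print Dumper($bytes);       # three separate escaped bytes between "a" and "b"
print Dumper($chars);       # a single escaped code point between "a" and "b"
printf "bytes: %d, chars: %d\n", length $bytes, length $chars;

# Devel::Peek writes Perl's internal view of the string to STDERR,
# including whether the UTF8 flag is set.
Dump($chars);
```

If the dump of your own string looks like the first form rather than the second, the string was never decoded, and a regex looking for \x{200B} will not find anything in it.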

I do not have a "use utf8;" in place in this script, because if I add it, then it screws up nearly all the UTF-8 characters

That's strange, since the utf8 pragma only affects how your source code is interpreted. If you have any non-ASCII characters in your source, then I'd strongly recommend making sure the file is properly encoded as UTF-8 and then adding use utf8;. To look at the source file and verify its encoding, you might also be interested in my script enctool.
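Here is a small sketch of what the pragma changes, assuming the script itself is saved as UTF-8. The same literal is compiled once without and once with the pragma (it is lexically scoped, so a do block is enough to switch it).

```perl
use strict;
use warnings;

# Without the pragma, Perl sees the two raw UTF-8 bytes of the literal.
my $as_bytes = do { no utf8; "é" };

# With the pragma, Perl decodes the literal into one character, U+00E9.
my $as_chars = do { use utf8; "é" };

printf "no utf8:  length %d\n", length $as_bytes;   # 2
printf "use utf8: length %d\n", length $as_chars;   # 1

# Decoded characters need an encoding layer on the way out.
binmode(STDOUT, ':encoding(UTF-8)');
print "$as_chars\n";
```

This is only a guess on my part, but one common reason that adding use utf8; appears to "screw up" output is that the literals are now decoded characters while STDOUT still has no encoding layer, so the binmode line above (or the open pragma) is the piece that is missing.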

And as kcott said, this may also depend on the Perl version you're using; for example, there's the 'unicode_strings' feature.
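As a rough illustration of why that feature matters, this sketch (the variable names are mine) upper-cases a string holding the single character 0xE9, which is not UTF8-flagged internally. It needs Perl 5.12 or later, where the feature was introduced.

```perl
use strict;
use warnings;

# "é" as one native 8-bit character; internally not UTF8-flagged.
my $e = "\xE9";

# Without the feature, uc() applies only ASCII rules to such a string.
my $plain   = do { no feature 'unicode_strings';  uc($e) };

# With the feature, Unicode rules apply regardless of the internal flag.
my $unicode = do { use feature 'unicode_strings'; uc($e) };

printf "without: %02X, with: %02X\n", ord($plain), ord($unicode);
# prints "without: E9, with: C9"
```

Since use v5.12; and later version declarations enable this feature implicitly, the same code can behave differently depending on which version line a script declares.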


In reply to Re: Safely removing Unicode zero-width spaces and other non-printing characters by haukex
in thread Safely removing Unicode zero-width spaces and other non-printing characters by mldvx4
