Beware that a binary comparison of strings is not reliable for Unicode text. Even length() can give surprising results. Whether that matters depends on what type of comparison you want.

If you want to check whether the strings are exactly the same, bit by bit, your code works. If you need to check whether the characters are functionally equivalent, that's a different matter.

As discussed before, many Unicode characters can be encoded in more than one way; see Re^2: incorrect length of strings with diphthongs.
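To make the "more than one way" point concrete, here is a minimal sketch that assumes normalizing both sides to NFC is the kind of equivalence you want. Unicode::Normalize ships with Perl; same_text() is just a hypothetical helper name for this example.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Two encodings of the same visible character:
# precomposed U+00E4 vs. "a" followed by U+0308 COMBINING DIAERESIS
my $precomposed = "\x{E4}";
my $decomposed  = "a\x{308}";

# Hypothetical helper: compare after normalizing both sides to NFC
sub same_text {
    my ($left, $right) = @_;
    return NFC($left) eq NFC($right);
}

my $raw_equal  = $precomposed eq $decomposed ? 1 : 0;          # plain code point comparison
my $norm_equal = same_text($precomposed, $decomposed) ? 1 : 0; # comparison after NFC

printf "raw eq: %d, normalized eq: %d\n", $raw_equal, $norm_equal;
printf "length: %d vs %d\n", length($precomposed), length($decomposed);
```

The plain eq sees two different strings of different lengths, while the normalized comparison treats them as the same text. Whether NFC (or NFD, or one of the compatibility forms) is the right choice depends on your data.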

Further, even if you have one character that's exactly the same Unicode character in both strings, the two might still be displayed completely differently. Aside from the umlaut stuff, you have other modifiers as well, like skin tone for emoji and right-to-left marks. And there are also some scripts, like Arabic, where the display of a character changes depending on the characters around it.
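One way to see the gap between code points and what ends up on screen is to count grapheme clusters with \X. This is only a sketch: grapheme_count() is a hypothetical helper, and how \X groups emoji modifiers depends on the Unicode version your perl was built with, so no result is claimed for the emoji line.

```perl
use strict;
use warnings;

# Hypothetical helper: count user-perceived characters (grapheme clusters)
sub grapheme_count {
    my ($text) = @_;
    my $count = () = $text =~ /\X/g;   # \X matches one extended grapheme cluster
    return $count;
}

my $umlaut = "a\x{308}";               # "a" + COMBINING DIAERESIS
my $thumbs = "\x{1F44D}\x{1F3FD}";     # THUMBS UP SIGN + skin tone modifier

printf "umlaut: %d code points, %d grapheme cluster(s)\n",
    length($umlaut), grapheme_count($umlaut);
printf "emoji:  %d code points, %d grapheme cluster(s)\n",
    length($thumbs), grapheme_count($thumbs);
```

For the umlaut, length() reports 2 while \X finds a single cluster. Contextual shaping, as in Arabic, happens in the renderer and cannot be detected this way at all.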

And what's extra nice about modifier characters is that they may or may not modify your debug prints:

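binmode(STDOUT, ':encoding(UTF-8)'); # avoids the "Wide character" warning
my $modifier = chr(0x200F); # U+200F RIGHT-TO-LEFT MARK
print "The character $modifier is only in string 1\n";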
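If you want the debug print to show such characters instead of silently acting on them, one option is to escape everything outside printable ASCII before printing. A minimal sketch; visible() is a hypothetical helper, not something from a module:

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character outside printable ASCII
# (0x20 to 0x7E) with a \x{HEX} escape so it shows up in debug output
sub visible {
    my ($text) = @_;
    (my $escaped = $text) =~ s/([^\x20-\x7E])/sprintf('\\x{%04X}', ord $1)/ge;
    return $escaped;
}

my $modifier = chr(0x200F);
print visible("The character $modifier is only in string 1"), "\n";
# prints: The character \x{200F} is only in string 1
```

Note that this also escapes newlines and tabs, which is usually what you want when hunting for invisible differences between two strings.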

In reply to Re^2: Compare two strings of same length, character-by-character by cavac
in thread Compare two strings of same length, character-by-character by Anonymous Monk