Thank you! You were right, one of the things I forgot to mention was that I was doing a substitution later to remove any punctuation characters, and it was:

s/[^\w\s]//g


I didn't realize that accented letters didn't count in the \w match, I figured they were still alphanumeric. Well that's kind of annoying. I was using that to normalize the string and remove anything like commas and semicolons. Now I have to make a list of all the characters I want to remove, instead of being able to just specify the ones I want to keep. Oh well, at least we've found the problem. Still though, is there some way to convert accented letters to just remove the accent and keep the letter? It would seem a better solution than listing out everything that is not a letter, number, space, or letter with an accent.

In reply to Re^4: keeping diacritical marks in a string by Foxpond Hollow
in thread keeping diacritical marks in a string by Foxpond Hollow

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.