G'day derion,

My first guess is that if /behälter/i is causing problems, but /beh\xE4lter/i is not, then using the utf8 pragma might be all you need.
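Here's a minimal sketch of what I mean. It assumes your source file is saved as UTF-8; the variable names and sample word are made up for illustration, as I haven't seen your actual code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # tell perl that literals in this source file are UTF-8 encoded

# Hypothetical sample data: the escape denotes U+00E4 whatever the file encoding.
my $word = "Beh\x{E4}lter";

# With "use utf8" in effect, the literal ä in the pattern is one character (U+00E4),
# so both patterns describe the same string.
my $literal_matches = $word =~ /behälter/i      ? 1 : 0;
my $escaped_matches = $word =~ /beh\x{E4}lter/i ? 1 : 0;

binmode STDOUT, ':encoding(UTF-8)';
print "literal: $literal_matches, escaped: $escaped_matches\n";
```

Without the pragma, perl would see the ä in the pattern as two separate bytes, which is one plausible reason the literal form fails while the escaped form works.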

Having said that, you've only provided code fragments. Parts that you've omitted may be important, e.g. how you call the open function. Please provide an SSCCE that we can run: you should keep this as short as possible while still showing the problem; also, please provide a short input file (probably only needs to be a few lines long).

The error you show, "Incorrect string value: '\xE4", contains an unexpected apostrophe: that could be part of the actual error message, a typo, a symptom of an SQL problem, or something else. Please paste verbatim program output within <code>...</code> tags, rather than typing it by hand.

I generally prefer "\x{NN}" to "\xNN", as it removes any possible ambiguity (especially if NN is followed by other digits). I don't see a problem with that here, but it could be elsewhere: a little defensive programming never hurts.
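To illustrate the ambiguity (this is just a demonstration with a code point I picked, not something taken from your code): the unbraced form consumes at most two hex digits, so any further digits become ordinary characters.

```perl
use strict;
use warnings;

# "\xNN" takes at most two hex digits: this is "\x26" (an ampersand) followed by "3A".
my $unbraced = "\x263A";

# "\x{...}" takes everything inside the braces: this is the single character U+263A.
my $braced = "\x{263A}";

printf "unbraced: length %d, first code point U+%04X\n",
    length($unbraced), ord($unbraced);
printf "braced:   length %d, first code point U+%04X\n",
    length($braced), ord($braced);
```

The first string has three characters and starts with U+0026; the second is a single character.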

And just a heads-up: "U+00E4 LATIN SMALL LETTER A WITH DIAERESIS" (ä) canonically decomposes into U+0061 (a) followed by U+0308 (the combining diaeresis). Again, I don't see that as an issue here, but it may be worth knowing about. See PDF "Unicode Code Chart: 0080 - 00ff".
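If decomposed data ever does turn up, the core Unicode::Normalize module can show the difference and bring both forms to a common one before comparing. A small sketch, with variable names of my own choosing:

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $composed   = "\x{E4}";          # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
my $decomposed = NFD($composed);    # canonical decomposition of the same text

# List the code points that make up the decomposed form.
my @code_points = map { sprintf('U+%04X', ord($_)) } split //, $decomposed;
print "NFD: @code_points\n";

# The two strings differ character by character, but agree once both are normalized.
my $equal_raw = $composed eq $decomposed           ? 1 : 0;
my $equal_nfc = NFC($composed) eq NFC($decomposed) ? 1 : 0;
print "raw eq: $equal_raw, after NFC: $equal_nfc\n";
```

A plain eq or regex match treats the two forms as different strings, which is why normalizing input at the boundary can be worthwhile.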

— Ken


In reply to Re: Perl encoding problem by kcott
in thread Perl encoding problem by derion
