in reply to ISO 8859-1 characters and \w \b etc.

Here's the answer, and it's a bit confusing. A Perl string has a magic bit attached to it, the UTF8 flag. If it is off, your string is assumed to be in latin1 (ISO 8859-1). That's fairly clear. What's not clear is that the regex character classes change their behavior depending on that flag: when the string has the UTF8 flag on, ö is considered a letter (so \w matches it and \b treats it as part of a word), but when the flag is off and the string is plain latin1, the very same ö is not a letter (unless you are using locale).

The solution is to turn your latin1 strings into UTF8-flagged character strings by decoding them with Encode, for example decode('iso-8859-1', $bytes).
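To make that concrete, here is a minimal sketch of the difference, assuming your input really is ISO 8859-1. The helper is_word_char is a made-up name for illustration, and the script deliberately does not enable locale or the unicode_strings feature, either of which would change the result.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if the string is exactly one \w character.
sub is_word_char {
    my ($str) = @_;
    return $str =~ /\A\w\z/ ? 1 : 0;
}

# "\xF6" is ö in ISO 8859-1. Written like this it is a byte string
# with the UTF8 flag off.
my $bytes = "\xF6";

# decode() interprets the bytes as latin1 and returns a character string.
my $chars = decode('iso-8859-1', $bytes);

printf "raw bytes: UTF8 flag %s, \\w %s\n",
    (utf8::is_utf8($bytes) ? 'on' : 'off'),
    (is_word_char($bytes) ? 'matches' : 'does not match');

printf "decoded:   UTF8 flag %s, \\w %s\n",
    (utf8::is_utf8($chars) ? 'on' : 'off'),
    (is_word_char($chars) ? 'matches' : 'does not match');
```

If you go this route, decode once at the point where the data enters your program, so every later regex sees character strings.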


Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).