in reply to Re^3: Something strange in the world or Regexes
in thread Something strange in the world or Regexes

way outside the opener, but related:

How will(/are) shell LANG/LC_* variables be handled? I'm 90% convinced that it is sanest to ignore those.

Esp. this heresy:

There's that painful, POSIXishly sick but "officially correct" problem of many utf8 locales suddenly having rather strange collating sequences, which makes a mess of the most trivial shell patterns, e.g. [A-Z]* in bash under de_DE.utf8 or en_US.utf8. In those locales the range A-Z is taken in collation order, where lowercase and uppercase letters interleave, so it covers most lowercase letters as well. Seeing an [A-Z]* glob suddenly match thisshouldnotmatchbutdoesgeethanxposix nearly made me return to bed hoping for the nightmare to stop. It required finding the antidote of LC_COLLATE=C to recover.

Now, while I place some trust that regexes won't fall victim to that collation malsequencing insanity, what about Perl's glob patterns?
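A quick way to check the glob question on your own box, as a rough sketch only: the file names and the helper upper_range_matches are made up for illustration, and I make no claim about what it prints on any given perl or platform. Run it once as-is and once with LC_COLLATE=C in the environment and compare.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob);
use File::Basename qw(basename);

# Hypothetical helper: basenames in $dir that Perl's glob matches for [A-Z]*.
sub upper_range_matches {
    my ($dir) = @_;
    my @names = map { basename($_) } bsd_glob("$dir/[A-Z]*");
    return sort @names;
}

# Scratch directory with one capitalised and one all-lowercase name.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(Upper lowercase)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $dir/$name: $!";
    close $fh;
}

# Show which collation settings were in the environment for this run.
printf "LC_COLLATE=%s LANG=%s\n",
    $ENV{LC_COLLATE} // '(unset)', $ENV{LANG} // '(unset)';
print "matched: $_\n" for upper_range_matches($dir);
```

If "lowercase" ever shows up in the output, Perl's glob is following the locale's collation for ranges on that system; if only "Upper" shows up in both runs, it is not.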

Re^5: Something horrible in the world of Regexes - attack of the posix zombies
by ikegami (Patriarch) on Sep 30, 2009 at 18:14 UTC

    How will(/are) shell LANG/LC_* variables be handled? I'm 90% convinced that it is sanest to ignore those.

    Those are definitely ignored if you don't have use locale; in effect. That's about all I know.
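To illustrate the point about use locale, here is a small sketch (the word list is made up). The first sort runs without the pragma and so compares by character code regardless of LANG/LC_*; the second runs inside a block with use locale in effect, and its order depends on whatever LC_COLLATE resolves to on your system, so no output is claimed for it.

```perl
use strict;
use warnings;

my @words = qw(banana Apple cherry apple Banana);

# No "use locale" here: plain codepoint order, uppercase ASCII sorts first.
my @plain = sort @words;

# "use locale" is lexically scoped, so it only affects this block.
my @by_locale = do {
    use locale;
    sort @words;
};

print "without use locale: @plain\n";
print "with use locale:    @by_locale\n";
```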

    use open ':std', ':locale'; is useful if you want to use the locale's encoding (and nothing else) for STDIN, STDOUT and STDERR. See the documentation of the open pragma.
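A minimal sketch of that pragma in use, assuming the script itself is saved as UTF-8; the sample word and the describe helper are made up for illustration. The string is handled as characters inside the program, and the locale's encoding is applied only when it goes out through the standard handles.

```perl
use strict;
use warnings;
use utf8;                      # string literals in this file are UTF-8
use open ':std', ':locale';    # STDIN/STDOUT/STDERR use the locale's encoding

# Hypothetical helper: report the character count alongside the word.
sub describe {
    my ($word) = @_;
    return length($word) . " characters: $word";
}

my $word = "Größe";
print describe($word), "\n";
```

In a locale whose encoding cannot represent a character (plain C/POSIX, say), expect a warning and an escaped character rather than the real one.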