in reply to inconsistency in whitespace handling

Erm, strictly speaking, ASCII is only characters 0x00-0x7F. 0xA0 is (I believe) a Latin-1 character, the no-break space, and it wouldn't be matched by \s unless it was in a Unicode string.


Re^2: inconsistency in whitespace handling
by bart (Canon) on May 12, 2005 at 14:25 UTC
    The meaning of chr(160) shouldn't change between Latin-1 and UTF-8, despite the different representation as bytes. It is the same character, Latin-1 being a subset of Unicode.

      Right, but my point was that 0xa0 isn't considered a whitespace character in a plain vanilla byte scalar without the utf8 magic enabled. Underneath, Perl is calling isspace(3), which only considers space, form-feed, newline, carriage return, horizontal tab, and vertical tab to be whitespace.
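
      For anyone who wants to check this on their own perl, here is a minimal sketch. The helper name is_space is made up for illustration. It assumes no `use locale`, no `use feature 'unicode_strings'`, and no /u or /a regex modifiers, since any of those may change what \s matches, and the result can also vary by Perl version.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the string matches \s, 0 otherwise.
sub is_space {
    my ($str) = @_;
    return $str =~ /\s/ ? 1 : 0;
}

# Code point 160 held as a plain byte string (no UTF-8 flag).
my $bytes = chr(0xA0);

# The same character, upgraded to Perl's internal UTF-8 representation.
my $chars = chr(0xA0);
utf8::upgrade($chars);

# Both scalars still hold the same single character.
print "same character: ", ($bytes eq $chars ? "yes" : "no"), "\n";

# Report how \s treats each representation.
printf "byte string matches \\s: %s\n", is_space($bytes) ? "yes" : "no";
printf "upgraded matches \\s:    %s\n", is_space($chars) ? "yes" : "no";
```

      Under those assumptions I would expect the two scalars to compare equal while only the upgraded one matches \s, which is the inconsistency being discussed here: same character, different answer depending on the internal representation.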