in reply to nbsp in space character class
(\s also matches it, but only if perl has the string marked as utf8). But \p{Zs} expects an actual NO-BREAK SPACE character, not the HTML entity for one. If you want to include entities that represent space characters, you'd probably be best off using HTML::Entities to decode them first.

$ perl -le'use charnames (); use warnings; no warnings "utf8"; chr($_) =~ /[\p{Zs}]/ && printf "%.4x: %s\n", $_, charnames::viacode($_) for 0..65500'
0020: SPACE
00a0: NO-BREAK SPACE
1680: OGHAM SPACE MARK
180e: MONGOLIAN VOWEL SEPARATOR
2000: EN QUAD
2001: EM QUAD
2002: EN SPACE
2003: EM SPACE
2004: THREE-PER-EM SPACE
2005: FOUR-PER-EM SPACE
2006: SIX-PER-EM SPACE
2007: FIGURE SPACE
2008: PUNCTUATION SPACE
2009: THIN SPACE
200a: HAIR SPACE
202f: NARROW NO-BREAK SPACE
205f: MEDIUM MATHEMATICAL SPACE
3000: IDEOGRAPHIC SPACE
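For what it's worth, here is a minimal sketch of the decode-first approach, assuming HTML::Entities is installed from CPAN. The input string and variable names are made up for illustration; only decode_entities comes from the module.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Hypothetical input: it contains the HTML entity, not the character itself.
my $html = 'foo&nbsp;bar';

# Before decoding, the string holds the six characters "&nbsp;",
# so \p{Zs} has nothing to match.
my $matched_raw = $html =~ /\p{Zs}/ ? 1 : 0;

# Called in non-void context, decode_entities returns a decoded copy
# in which &nbsp; has become U+00A0 NO-BREAK SPACE.
my $text = decode_entities($html);
my $matched_decoded = $text =~ /\p{Zs}/ ? 1 : 0;

printf "raw: %d, decoded: %d\n", $matched_raw, $matched_decoded;
```

This should print "raw: 0, decoded: 1". Decoding up front means one \p{Zs} test covers &nbsp;, &#160;, and any other entity that names a space separator, instead of listing each entity in the pattern.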
Re^2: nbsp in space character class
by ikegami (Patriarch) on Jul 14, 2008 at 05:57 UTC