    perl -e '$_ = join "", map { chr() } 0..255; s/\W//g; print' | od -txC -a

(That assumes a unix environment -- it uses bash-style quoting, and the unix "od" command to view in detail what character codes are printed as output.) The output is:

    0000000 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46
             0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
    0000020 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56
             G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V
    0000040 57 58 59 5a 5f 61 62 63 64 65 66 67 68 69 6a 6b
             W  X  Y  Z  _  a  b  c  d  e  f  g  h  i  j  k
    0000060 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a
             l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
    0000077

As advertised in the perlre man page, the only things that do not match "\W" are digits, letters and underscore.
If you're willing to tread into utf8 wide-character territory, there are lots of options -- things that don't match \W but that probably would not be considered "alphanumeric" by most people. Compared to the one-liner above, I was actually a little surprised by the effect of this relatively small difference:

    perl -CS -e '$_ = join "", map { chr() } 0..255; s/\W//g; print $_,$/'
    # (produces the same output as the first one-liner above)

    perl -CS -e '$_ = join "", map { chr() } 0..256; s/\W//g; print $_,$/'
    # (produces very different output)

(That assumes a utf8-capable terminal window. Redirect to files and use other display methods if necessary to really see what's happening.) The vastly different behavior based on a seemingly minor difference in the upper bound of the range relates to this quote from a message that Larry Wall posted a while back on the perl-unicode mail list:
    Perl's always been about providing reasonable defaults, and will continue to do so. But changing what's reasonable is tricky, and sometimes you have to go through a period in which nothing can be considered reasonable.

Anyway, I'll second Joost's comment: it's not clear why you are asking this question -- maybe there's a better way to do what needs to be done, but you haven't told us what you're really trying to do.
(updated to fix spelling errors)
In reply to Re: Text delimiter
by graff
in thread Text delimiter
by Gavin