in reply to Re: Parse ISO 8601 date/times
in thread Parse ISO 8601 date/times
$d = chr(2413); print $d =~ $_, "\n" for qr/\d/, qr/[0-9]/;
Re^3: Parse ISO 8601 date/times (never \d)
by tye (Sage) on Nov 07, 2012 at 14:46 UTC
Yeah, I just don't ever use \d except for one-liners any more. \d now means something that I just never want: numerals of any kind, from any writing system. This despite Perl only knowing how to treat one of the two dozenish types of numerals as numeric. I think drastically changing the definition of \d when Unicode came along was a mistake (a separate way of saying "any numeral" should have been used). Luckily, the somewhat longer [0-9] has some visual advantages. So the worst problem is all of the old scripts that are now broken in ways that will often not matter (but that I can see even causing security problems in rare cases).
- tye
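To make the difference concrete, here is a minimal sketch that compares the two character classes on the same Devanagari digit as the snippet at the top of the thread. The variable names are only illustrative, and the /a modifier is not something the thread discusses; it is included on the assumption of Perl 5.14 or later, where it restricts \d to ASCII.

```perl
use strict;
use warnings;

# chr(0x096D) is DEVANAGARI DIGIT SEVEN, the same code point as chr(2413)
# in the snippet at the top of the thread.
my $digit = chr(0x096D);

my $unicode_match = ($digit =~ /\d/)    ? 1 : 0;   # \d follows Unicode rules here
my $ascii_match   = ($digit =~ /[0-9]/) ? 1 : 0;   # ASCII digits only
my $a_flag_match  = ($digit =~ /\d/a)   ? 1 : 0;   # /a restricts \d to ASCII (5.14+)

# Numeric conversion does not understand the character, so it yields 0.
my $as_number = do { no warnings 'numeric'; $digit + 0 };

printf "\\d: %d  [0-9]: %d  \\d with /a: %d  as number: %d\n",
    $unicode_match, $ascii_match, $a_flag_match, $as_number;
# prints: \d: 1  [0-9]: 0  \d with /a: 0  as number: 0
```

A pattern that validates input with \d can therefore accept a string that later arithmetic treats as zero, which is the kind of mismatch the post warns about.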
by tobyink (Canon) on Nov 07, 2012 at 17:46 UTC
The following pragma will "fix" \d. However, re::engine::Plugin does not currently support s/// or split //, just matching. (And it doesn't support named captures either.) Still, it may be helpful for some.
Update: Meh... come to think of it, a re::engine is overkill. Constant overloading does the trick much easier...
Another CPAN candidate I think.
Update II: Looks like PerlMonks might be breaking my UTF8 again. The "5" character which appears in $str should not be a normal ASCII 5, but a fullwidth 5 (U+FF15), which is a character used to include an Arabic numeral 5 within CJK text.
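The code from this post is not preserved in this copy of the thread. As a rough illustration of the constant-overloading idea only, and not tobyink's actual code, the sketch below uses overload::constant with a 'qr' handler to rewrite \d in literal patterns. The package name ASCIIDigits is hypothetical, and replacing \d with \p{PosixDigit} (rather than [0-9]) is an assumption made here so the rewrite also stays valid inside bracketed character classes.

```perl
use strict;
use warnings;

package ASCIIDigits;    # hypothetical name, not an existing CPAN module
use overload ();

sub import {
    # The 'qr' handler receives each constant piece of a regex at compile time.
    overload::constant('qr' => sub {
        my ($piece) = @_;
        # Rewrite \d, but leave an escaped backslash followed by "d" alone.
        $piece =~ s/(?<!\\)((?:\\\\)*)\\d/$1\\p{PosixDigit}/g;
        return $piece;
    });
}

package main;

my $fullwidth_five = "\x{FF15}";    # FULLWIDTH DIGIT FIVE

# Compiled before the handler is installed: ordinary Unicode \d.
my $before = ($fullwidth_five =~ /^\d$/) ? 1 : 0;

my ($inside_fullwidth, $inside_ascii);
{
    # Stands in for "use ASCIIDigits;" because the package lives in this file.
    BEGIN { ASCIIDigits->import }
    $inside_fullwidth = ($fullwidth_five =~ /^\d$/) ? 1 : 0;
    $inside_ascii     = ('5' =~ /^\d$/)             ? 1 : 0;
}

print "before: $before, inside (fullwidth): $inside_fullwidth, ",
      "inside (ASCII): $inside_ascii\n";
```

The handler is lexically scoped and only sees pattern text that is literal at compile time, so a \d that arrives through an interpolated variable at run time is presumably left untouched.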
perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
by tye (Sage) on Nov 07, 2012 at 18:08 UTC
Wow, that's a lot of machinery to avoid running :s/\\d/[0-9]/gc in your editor. I think I'll always choose to avoid the recurring cost. And I'm mostly not talking about CPU cost (but I suspect that is non-trivial), but the cost of things like mentally having to track new rules about how changing m// to s/// or split() requires extra attention, having to track that \d means different things in different places, having to search for pragmas each time I see \d inside m// if I care which meaning it has, the risk of having to debug the chain of code required to support this, etc. The risk of just wasting time because of a bug in the added pile of code required to support this is my biggest concern (after having repeatedly been burned by such things), especially when I consider the risk of this idea of pretending \d isn't \d getting in the way of some other tricky module's reasonable-sounding assumptions. Just because something is possible doesn't mean it is a good idea. :)
- tye
by tobyink (Canon) on Nov 07, 2012 at 18:34 UTC