Re: Parse ISO 8601 date/times

Re^2: Parse ISO 8601 date/times by roboticus (Chancellor) on Nov 07, 2012 at 10:41 UTC
grizzley:

I'm guessing he used `[0-9]` for visual symmetry with `[0-2]`, `[0-3]`, et al. I was going to suggest that `\d` would be easier to read, but when I converted a little bit from this:

    && $part !~ m{
        # Time or partial time (or period):
        ^(?:|P)T [012][0-9]
        (?:| :?[0-5][0-9] (?:| :?[0-5][0-9] ) )
        (?:| [.,][0-9]+ )$
    }x

to this:

    && $part !~ m{
        # Time or partial time (or period):
        ^(?:|P)T [012]\d
        (?:| :?[0-5]\d (?:| :?[0-5]\d ) )
        (?:| [.,]\d+ )$
    }x

I found that the better 'visual balance' of `[0-9]` was counterbalanced by the square brackets, which are a little too similar to vertical bars for my eyes. After looking at them both, I don't really have a preference. Perhaps if I had a better font...

...roboticus

When your only tool is a hammer, all problems look like your thumb.
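For what it's worth, the two spellings above agree on plain ASCII input, so between them the choice really is cosmetic. A rough sketch to check that, assuming the empty alternatives read `(?:|...)` ("nothing, or ..."); the sample strings and variable names are made up for illustration and are not from the original node:

```perl
use strict;
use warnings;

# The time fragment spelled with explicit ranges ...
my $with_range = qr{
    ^(?:|P)T [012][0-9]
    (?:| :?[0-5][0-9] (?:| :?[0-5][0-9] ) )
    (?:| [.,][0-9]+ )$
}x;

# ... and the same fragment spelled with \d.
my $with_d = qr{
    ^(?:|P)T [012]\d
    (?:| :?[0-5]\d (?:| :?[0-5]\d ) )
    (?:| [.,]\d+ )$
}x;

# Hypothetical sample parts, all plain ASCII.
my @samples = ('T10', 'T10:41', 'T1041', 'T10:41:30,5', 'PT12', 'T7', '10:41');

# Parts accepted by the [0-9] version.
my @matched = grep { $_ =~ $with_range } @samples;

# Parts on which the two versions give different answers.
my @disagree = grep {
    ( $_ =~ $with_range ? 1 : 0 ) != ( $_ =~ $with_d ? 1 : 0 )
} @samples;

print "matched: @matched\n";
print "disagreements: ", scalar(@disagree), "\n";
```

On these samples the first five parts match and the two patterns never disagree. Only input containing non-ASCII digits can tell the two spellings apart.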
Re^3: Parse ISO 8601 date/times by grizzley (Chaplain) on Nov 07, 2012 at 14:00 UTC
I agree. `[0-9]` is visually better in this case.
Re^2: Parse ISO 8601 date/times by choroba (Cardinal) on Nov 07, 2012 at 11:59 UTC
Oh really? `$d = chr(2413); print $d =~ $_, "\n" for qr/\d/, qr/[0-9]/;`

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
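To spell out what that one-liner shows: code point 2413 is U+096D, DEVANAGARI DIGIT SEVEN, which `\d` accepts and `[0-9]` does not. A minimal sketch of the same check follows. It also tries the `/a` modifier, available from Perl 5.14, which restricts `\d` to ASCII. The variable names are only for illustration.

```perl
use 5.014;      # the /a modifier needs Perl 5.14 or later
use strict;
use warnings;

# chr(2413) is U+096D, DEVANAGARI DIGIT SEVEN.
my $d = chr(2413);

my $unicode_d = $d =~ /\d/    ? 1 : 0;   # \d under Unicode rules
my $ascii_set = $d =~ /[0-9]/ ? 1 : 0;   # explicit ASCII range
my $ascii_d   = $d =~ /\d/a   ? 1 : 0;   # \d restricted to ASCII by /a

printf "\\d: %d  [0-9]: %d  \\d with /a: %d\n",
    $unicode_d, $ascii_set, $ascii_d;
```

This prints `\d: 1  [0-9]: 0  \d with /a: 0`. So on a new enough perl, `/a` is another way to get the ASCII-only meaning without spelling out the range.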
Re^3: Parse ISO 8601 date/times (never \d) by tye (Sage) on Nov 07, 2012 at 14:46 UTC
Yeah, I just don't ever use \d except for one-liners any more. \d now means something that I just never want: numerals of any kind, from any writing system. This despite Perl only knowing how to treat one of the two dozenish types of numerals as numeric. I think drastically changing the definition of \d when Unicode came along was a mistake (a separate way of saying "any numeral" should have been used). Luckily, the somewhat longer `[0-9]` has some visual advantages. So the worst problem is all of the old scripts that are now broken in ways that will often not matter (but that I can see even causing security problems in rare cases).

- tye
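A small sketch of the kind of breakage described above, using a made-up input string: a `\d`-based check accepts digits that Perl's numeric conversion cannot read, so the "validated" value silently becomes 0. This is only one illustration under that assumption, not a claim about any particular script.

```perl
use strict;
use warnings;

# Hypothetical untrusted input: ARABIC-INDIC DIGITs ONE, TWO, THREE.
my $input = "\x{0661}\x{0662}\x{0663}";

my $passes_d     = $input =~ /\A\d+\z/    ? 1 : 0;   # accepted as "all digits"
my $passes_range = $input =~ /\A[0-9]+\z/ ? 1 : 0;   # rejected

# Numeric conversion only understands ASCII digits, so the string
# numifies to 0 (normally with a warning, silenced here).
my $value = do { no warnings 'numeric'; $input + 0 };

print "passes \\d: $passes_d, passes [0-9]: $passes_range, numeric value: $value\n";
```

This prints `passes \d: 1, passes [0-9]: 0, numeric value: 0`. An old script that checks `/^\d+$/` and then does arithmetic on the result would treat this input as zero.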
Re^4: Parse ISO 8601 date/times (never \d) by tobyink (Canon) on Nov 07, 2012 at 17:46 UTC
The following pragma will "fix" `\d`. However, re::engine::Plugin does not currently support `s///` or `split //`, just matching. (And it doesn't support named captures either.) Still, it may be helpful for some.

    use 5.010;
    use strict;
    use utf8::all;

    BEGIN {
        package re::engine::SaneDigits;
        no thanks;
        use constant TAINT => ${^TAINT};
        use re::engine::Plugin ();
        use Carp;

        sub import {
            re::engine::Plugin->import(
                comp => \&comp,
                exec => \&exec,
            );
        }

        *unimport = \&re::engine::Plugin::unimport;

        sub comp {
            my ($rx) = @_;
            my $real = $rx->pattern;
            $real =~ s{\\d}{[0-9]}g;
            $real =~ s{\\D}{[^0-9]}g;
            my %mods = my %mod = $rx->mod;
            my $mods = join q(), keys %mods;
            $real =~ s{/}{\/}g;
            $real = eval qq{ qr/$real/$mods };
            $rx->stash({ real => $real });
            $rx->num_captures(
                FETCH => sub {
                    my ($rx, $paren) = @_;
                    croak sprintf(
                        "%s variable not supported with %s",
                        { 0 => q($&), -1 => q($'), -2 => q($`) }->{$paren},
                        __PACKAGE__,
                    ) if $paren < 1;
                    my $rv = $rx->stash->{last}[$paren];
                    return $rv unless TAINT;
                    $rv =~ /(.)/;
                    return $1;
                },
            );
        }

        sub exec {
            my ($rx, $str) = @_;
            my @results = ($str =~ $rx->stash->{real});
            unshift @results, scalar pos;
            $rx->stash->{last} = \@results;
            return not defined $results[0];
        }
    };

    my $str = "foo23 bar5 bar42";
    say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH";
    use re::engine::SaneDigits;
    say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH";

Update: Meh... come to think of it, a re::engine is overkill. Constant overloading does the trick much more easily...

    use 5.010;
    use strict;
    use utf8::all;

    BEGIN {
        package re::SaneDigits;
        no thanks;
        use overload ();
        my %_const_handlers  = (qr => \&_qr);
        my %_remove_handlers = map { $_ => undef } %_const_handlers;
        sub import   { overload::constant %_const_handlers }
        sub unimport { overload::remove_constant %_remove_handlers }
        sub _qr {
            for (@_) {
                s/\\d/[0-9]/g;
                s/\\D/[^0-9]/g;
                return $_;
            }
        }
    };

    my $str = "foo23 bar5 bar42";
    say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH";
    use re::SaneDigits;
    say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH";

Another CPAN candidate I think.

Update II: Looks like PerlMonks might be breaking my UTF8 again. The "5" character which appears in `$str` should not be a normal ASCII 5, but a fullwidth 5 (U+FF15), which is a character used to include an Arabic numeral 5 within CJK text.

`perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'`
Re^5: Parse ISO 8601 date/times (still never \d) by tye (Sage) on Nov 07, 2012 at 18:08 UTC
Re^6: Parse ISO 8601 date/times (still never \d) by tobyink (Canon) on Nov 07, 2012 at 18:34 UTC


	PerlMonks