in reply to Undecipherable t/re/uniprops02.t failures on recent builds of perl.
G'day Rob,
Note: I've used a common alias of mine below.
    $ alias perle
    alias perle='perl -Mstrict -Mwarnings -Mautodie=:all -MCarp::Always -E'
There are only a limited number of "Numeric_Value" values. They can be positive, negative and fractional but 3.12e-03 is not one of them. In perluniprops, a search for "Numeric_Value" finds 146 matches. Some examples (in spoiler):
    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 3.12e-03}];'
    Compiling REx "\p{Numeric_Value: 3.12e-03}"
    Can't find Unicode property definition "Numeric_Value: 3.12e-03" in regex; marked by <-- HERE in m/\p{Numeric_Value: 3.12e-03} <-- HERE / at -e line 1.
    Freeing REx: "\p{Numeric_Value: 3.12e-03}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 0.00312}];'
    Compiling REx "\p{Numeric_Value: 0.00312}"
    Can't find Unicode property definition "Numeric_Value: 0.00312" in regex; marked by <-- HERE in m/\p{Numeric_Value: 0.00312} <-- HERE / at -e line 1.
    Freeing REx: "\p{Numeric_Value: 0.00312}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: -1/2}];'
    Compiling REx "\p{Numeric_Value: -1/2}"
    Freeing REx: "\p{Numeric_Value: -1/2}"
    Final program:
       1: EXACT_REQ8 <\x{f33}> (3)
       3: END (0)
    anchored utf8 "%x{f33}" at 0..0 (checking anchored isall) minlen 1
    Freeing REx: "\p{Numeric_Value: -1/2}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 0}];'
    Compiling REx "\p{Numeric_Value: 0}"
    Final program:
       1: ANYOF[0][0660 06F0 07C0 0966 09E6 0A66 0AE6 0B66 0BE6 0C66 0C78 0CE6 0D66 0DE6 0E50 0ED0 0F20 1040 1090 17E0 17F0 1810 1946 19D0 1A80 1A90 1B50 1BB0 1C40 1C50 2070 2080 2189 24EA 24FF 3007 96F6 A620 A6EF A8D0 A900 A9D0 A9F0 AA50 ABF0 F9B2 FF10 1018A 104A0 10D30...] (11)
      11: END (0)
    stclass ANYOF[0][0660 06F0 07C0 0966 09E6 0A66 0AE6 0B66 0BE6 0C66 0C78 0CE6 0D66 0DE6 0E50 0ED0 0F20 1040 1090 17E0 17F0 1810 1946 19D0 1A80 1A90 1B50 1BB0 1C40 1C50 2070 2080 2189 24EA 24FF 3007 96F6 A620 A6EF A8D0 A900 A9D0 A9F0 AA50 ABF0 F9B2 FF10 1018A 104A0 10D30...] minlen 1
    Freeing REx: "\p{Numeric_Value: 0}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 1/2}];'
    Compiling REx "\p{Numeric_Value: 1/2}"
    Final program:
       1: ANYOF[\xBD][0B73 0D74 0F2A 2CFD A831 10141 10175-10176 109BD 109FB 10A48 10E7B 10F26 11FD1-11FD2 12464 1ECAE 1ED3C] (11)
      11: END (0)
    stclass ANYOF[\xBD][0B73 0D74 0F2A 2CFD A831 10141 10175-10176 109BD 109FB 10A48 10E7B 10F26 11FD1-11FD2 12464 1ECAE 1ED3C] minlen 1
    Freeing REx: "\p{Numeric_Value: 1/2}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 10}];'
    Compiling REx "\p{Numeric_Value: 10}"
    Final program:
       1: ANYOFH[0BF0 0D70 1372 2169 2179 2469 247D 2491 24FE 277F 2789 2793 3038 3229 3248 3289 4EC0 5341 62FE F973 F9FD 10110 10149 10150 10157 10160-10164 102EA 10322 103D3 1085B 1087E 108AD 108FD 10917 109C9 10A44 10A9E 10AED 10B5C 10B7C 10BAD 10CFC 10E69 10F22 10F52...] (First UTF-8 byte=E0-FF) (3)
       3: END (0)
    stclass ANYOFH[0BF0 0D70 1372 2169 2179 2469 247D 2491 24FE 277F 2789 2793 3038 3229 3248 3289 4EC0 5341 62FE F973 F9FD 10110 10149 10150 10157 10160-10164 102EA 10322 103D3 1085B 1087E 108AD 108FD 10917 109C9 10A44 10A9E 10AED 10B5C 10B7C 10BAD 10CFC 10E69 10F22 10F52...] (First UTF-8 byte=E0-FF) minlen 1
    Freeing REx: "\p{Numeric_Value: 10}"
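If you'd rather check a batch of values without reading through `use re "debug"` output, something along these lines might do. It's only a rough sketch: `nv_is_known` is a name I've made up for illustration, and which values are accepted will depend on the Unicode version bundled with the perl you run it under, so I haven't shown any expected output.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of perl or any module): returns 1 if
# \p{Numeric_Value: $value} is accepted by the running perl, 0 otherwise.
sub nv_is_known {
    my ($value) = @_;
    my $pattern = "\\p{Numeric_Value: $value}";

    # Compile the pattern and attempt a match inside eval, so that an
    # unknown property is trapped whether perl reports it when the
    # pattern is compiled or only when it is first used.
    my $ok = eval {
        my $re  = qr/$pattern/;
        my $hit = ('0' =~ $re);
        1;
    };
    return $ok ? 1 : 0;
}

for my $value ('0', '1/2', '-1/2', '10', '3.12e-03', '0.00312') {
    printf "%-10s %s\n", $value, nv_is_known($value) ? 'known' : 'unknown';
}
```

The pattern is built as a string and compiled at runtime so that the failure can be caught; with a literal `qr[...]`, as in the one-liners above, the error is raised while the program itself is being compiled.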
See [PDF] "4.6 Numeric Value" from the Unicode specification.
Here's a quick script you can use to check code points:
    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Unicode::UCD 'charprop';

    for my $char (0, 1, ' ', 'a', "\n") {
        print "Char '$char'\n";
        my $code_point = ord $char;
        print 'Numeric_Type: ', charprop($code_point, 'Numeric_Type'), "\n";
        print 'Numeric_Value: ', charprop($code_point, 'Numeric_Value'), "\n";
        print '-' x 40, "\n";
    }
Output (in spoiler):
    Char '0'
    Numeric_Type: Decimal
    Numeric_Value: 0
    ----------------------------------------
    Char '1'
    Numeric_Type: Decimal
    Numeric_Value: 1
    ----------------------------------------
    Char ' '
    Numeric_Type: None
    Numeric_Value: NaN
    ----------------------------------------
    Char 'a'
    Numeric_Type: None
    Numeric_Value: NaN
    ----------------------------------------
    Char '
    '
    Numeric_Type: None
    Numeric_Value: NaN
    ----------------------------------------
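The characters in that script are all ASCII, so none of them has a fractional or negative value. A variation that takes code points directly may be more useful for the cases above. This is only a sketch: the three code points are picked from the regex dumps earlier in this post (U+00BD, U+0F33, U+2169), and I've added `num()`, also from Unicode::UCD, which returns the value as a plain Perl number or `undef` when the string has no numeric value.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Unicode::UCD qw(charprop num);

# Code points taken from the 'use re "debug"' dumps for 1/2, -1/2 and 10.
for my $code_point (0xBD, 0x0F33, 0x2169) {
    my $value = num(chr $code_point);    # undef if there is no numeric value

    printf "U+%04X\n", $code_point;
    print 'Numeric_Type: ',  charprop($code_point, 'Numeric_Type'),  "\n";
    print 'Numeric_Value: ', charprop($code_point, 'Numeric_Value'), "\n";
    print 'num(): ', (defined $value ? $value : 'undef'), "\n";
    print '-' x 40, "\n";
}
```

`charprop()` reports the property value as text, while `num()` gives you something you can do arithmetic with, which may be handier if you're comparing against a floating-point form.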
That might be sufficient information for your needs. I was going to look in "lib/unicore/TestProp.pl" but I can't locate it: I tried https://github.com/Perl/perl5/tree/blead/lib/unicore, https://github.com/Perl/perl5/tree/maint-5.34/lib/unicore, and ran `find /home/ken/perl5/perlbrew/perls/ -iname TestProp.pl` on my computer. If you want, and provide a link, I'll be happy to check it out.
— Ken
Re^2: Undecipherable t/re/uniprops02.t failures on recent builds of perl.
by syphilis (Archbishop) on Apr 09, 2022 at 01:38 UTC