in reply to Undecipherable t/re/uniprops02.t failures on recent builds of perl.

G'day Rob,

Note: I've used a common alias of mine below.

    $ alias perle
    alias perle='perl -Mstrict -Mwarnings -Mautodie=:all -MCarp::Always -E'

There are only a limited number of "Numeric_Value" values. They can be positive, negative, or fractional, but 3.12e-03 is not one of them. In perluniprops, a search for "Numeric_Value" finds 146 matches. Some examples:

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 3.12e-03}];'
    Compiling REx "\p{Numeric_Value: 3.12e-03}"
    Can't find Unicode property definition "Numeric_Value: 3.12e-03" in regex; marked by <-- HERE in m/\p{Numeric_Value: 3.12e-03} <-- HERE / at -e line 1.
    Freeing REx: "\p{Numeric_Value: 3.12e-03}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 0.00312}];'
    Compiling REx "\p{Numeric_Value: 0.00312}"
    Can't find Unicode property definition "Numeric_Value: 0.00312" in regex; marked by <-- HERE in m/\p{Numeric_Value: 0.00312} <-- HERE / at -e line 1.
    Freeing REx: "\p{Numeric_Value: 0.00312}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: -1/2}];'
    Compiling REx "\p{Numeric_Value: -1/2}"
    Freeing REx: "\p{Numeric_Value: -1/2}"
    Final program:
       1: EXACT_REQ8 <\x{f33}> (3)
       3: END (0)
    anchored utf8 "%x{f33}" at 0..0 (checking anchored isall) minlen 1
    Freeing REx: "\p{Numeric_Value: -1/2}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 0}];'
    Compiling REx "\p{Numeric_Value: 0}"
    Final program:
       1: ANYOF[0][0660 06F0 07C0 0966 09E6 0A66 0AE6 0B66 0BE6 0C66 0C78 0CE6 0D66 0DE6 0E50 0ED0 0F20 1040 1090 17E0 17F0 1810 1946 19D0 1A80 1A90 1B50 1BB0 1C40 1C50 2070 2080 2189 24EA 24FF 3007 96F6 A620 A6EF A8D0 A900 A9D0 A9F0 AA50 ABF0 F9B2 FF10 1018A 104A0 10D30...] (11)
      11: END (0)
    stclass ANYOF[0][0660 06F0 07C0 0966 09E6 0A66 0AE6 0B66 0BE6 0C66 0C78 0CE6 0D66 0DE6 0E50 0ED0 0F20 1040 1090 17E0 17F0 1810 1946 19D0 1A80 1A90 1B50 1BB0 1C40 1C50 2070 2080 2189 24EA 24FF 3007 96F6 A620 A6EF A8D0 A900 A9D0 A9F0 AA50 ABF0 F9B2 FF10 1018A 104A0 10D30...] minlen 1
    Freeing REx: "\p{Numeric_Value: 0}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 1/2}];'
    Compiling REx "\p{Numeric_Value: 1/2}"
    Final program:
       1: ANYOF[\xBD][0B73 0D74 0F2A 2CFD A831 10141 10175-10176 109BD 109FB 10A48 10E7B 10F26 11FD1-11FD2 12464 1ECAE 1ED3C] (11)
      11: END (0)
    stclass ANYOF[\xBD][0B73 0D74 0F2A 2CFD A831 10141 10175-10176 109BD 109FB 10A48 10E7B 10F26 11FD1-11FD2 12464 1ECAE 1ED3C] minlen 1
    Freeing REx: "\p{Numeric_Value: 1/2}"

    $ perle 'use re "debug"; my $re = qr[\p{Numeric_Value: 10}];'
    Compiling REx "\p{Numeric_Value: 10}"
    Final program:
       1: ANYOFH[0BF0 0D70 1372 2169 2179 2469 247D 2491 24FE 277F 2789 2793 3038 3229 3248 3289 4EC0 5341 62FE F973 F9FD 10110 10149 10150 10157 10160-10164 102EA 10322 103D3 1085B 1087E 108AD 108FD 10917 109C9 10A44 10A9E 10AED 10B5C 10B7C 10BAD 10CFC 10E69 10F22 10F52...] (First UTF-8 byte=E0-FF) (3)
       3: END (0)
    stclass ANYOFH[0BF0 0D70 1372 2169 2179 2469 247D 2491 24FE 277F 2789 2793 3038 3229 3248 3289 4EC0 5341 62FE F973 F9FD 10110 10149 10150 10157 10160-10164 102EA 10322 103D3 1085B 1087E 108AD 108FD 10917 109C9 10A44 10A9E 10AED 10B5C 10B7C 10BAD 10CFC 10E69 10F22 10F52...] (First UTF-8 byte=E0-FF) minlen 1
    Freeing REx: "\p{Numeric_Value: 10}"
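If you only want a yes/no answer for a batch of candidate values, without wading through the debug output, something like the following might do. It is only a sketch: nv_is_accepted() is a name I made up for illustration, and it relies on the behaviour seen above, namely that perl dies at pattern-compile time when it can't find the property definition. Which values are accepted may depend on the perl and Unicode versions in use.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper (not part of any module): returns 1 if perl can
# compile \p{Numeric_Value: VALUE}, and 0 if compilation dies.
sub nv_is_accepted {
    my ($value) = @_;

    # Build the pattern as a plain string so VALUE can vary at runtime.
    my $pattern = "\\p{Numeric_Value: $value}";

    # An unknown property value makes the pattern compilation die;
    # eval traps that and leaves $re undefined.
    my $re = eval { qr/$pattern/ };

    return defined $re ? 1 : 0;
}

# The same candidate values that were tried by hand above.
for my $value ('3.12e-03', '0.00312', '-1/2', '0', '1/2', '10') {
    printf "%-10s %s\n", $value,
        nv_is_accepted($value) ? 'accepted' : 'rejected';
}
```

On the perl used for the runs above, I'd expect the first two values to be reported as rejected and the rest as accepted; on other builds the split may differ, which is rather the point of the exercise.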

See [PDF] "4.6 Numeric Value" from the Unicode specification.

Here's a quick script you can use to check code points:

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Unicode::UCD 'charprop';

    for my $char (0, 1, ' ', 'a', "\n") {
        print "Char '$char'\n";
        my $code_point = ord $char;
        print 'Numeric_Type: ', charprop($code_point, 'Numeric_Type'), "\n";
        print 'Numeric_Value: ', charprop($code_point, 'Numeric_Value'), "\n";
        print '-' x 40, "\n";
    }
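Going in the other direction (from a value to the code points that carry it) can be done by brute force. The following is a rough sketch rather than a polished tool: code_points_with_nv() is a hypothetical name, and it simply tests every non-surrogate code point against the property, so it takes a moment to run. It dies if the value isn't one perl recognises.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper: returns every code point whose Numeric_Value
# equals VALUE, found by testing each code point in turn.
sub code_points_with_nv {
    my ($value) = @_;

    # Dies here, at pattern-compile time, if VALUE is not recognised.
    my $pattern = "\\p{Numeric_Value: $value}";
    my $re = qr/\A$pattern\z/;

    my @found;
    for my $cp (0 .. 0x10FFFF) {
        # Surrogates are not characters in their own right; skip them.
        next if $cp >= 0xD800 && $cp <= 0xDFFF;
        push @found, $cp if chr($cp) =~ $re;
    }

    return @found;
}

printf "U+%04X\n", $_ for code_points_with_nv('1/2');
```

For 1/2, the list should start with U+00BD and include the code points shown in the ANYOF[...] line of the debug output for that value, although a perl built against a different Unicode version may report a slightly different set.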

Output (in spoiler):

That might be sufficient information for your needs. I was going to look in "lib/unicore/TestProp.pl" but I can't locate it: I tried https://github.com/Perl/perl5/tree/blead/lib/unicore, https://github.com/Perl/perl5/tree/maint-5.34/lib/unicore, and ran `find /home/ken/perl5/perlbrew/perls/ -iname TestProp.pl` on my computer. If you want, and provide a link, I'll be happy to check it out.

— Ken

Re^2: Undecipherable t/re/uniprops02.t failures on recent builds of perl.
by syphilis (Archbishop) on Apr 09, 2022 at 01:38 UTC
    Thanks Ken.
    The info you've provided looks very helpful, and I'm about to start utilising it in investigating further.
    It's probably just some bug in the way doubledoubles are assigned (or being read), but I think it's about time I at least worked out what is going on.

    I was going to look in "lib/unicore/TestProp.pl" but I can't locate it

    Comments at the start of TestProp.pl inform us that it is machine-generated by ..\lib\unicore\mktables from the Unicode database, Version 14.0.0.
    I guess this would be done during the "make" (or perhaps "make test") stage.

    I'll post again later - once I've done some digging.

    Cheers,
    Rob