andal has asked for the wisdom of the Perl Monks concerning the following question:

While reading perldoc perlunicode, I stumbled over the possibility of defining my own character properties for matching. I decided to try this out as an interesting alternative to using ranges.

So here's the text from the pod.

Something to include, prefixed by "+": a built-in character property (prefixed by "utf8::") or a user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.

I've created the following test script:

my $tst = "split-word another:one";
$tst =~ s/\p{IsSplitWord}+/**/g;
print $tst,"\n";

sub IsMySep {
    return <<EOQ;
002D
003A
002F
EOQ
}

sub IsSplitWord {
    return <<EOQ;
0041\t005A
0061\t007A
+IsMySep
EOQ
}

Running it gives the following error:
SWASHNEW didn't return an HV ref at ./tester.pl line 3
What am I doing wrong?

Re: user defined character properties
by Corion (Patriarch) on Oct 27, 2010 at 08:20 UTC

    I think you're running into a Perl bug. The uni/class.t Unicode test in the Perl source uses fully qualified names for its classes when specifying intersection or inclusion:

    sub MyUniClass {
      <<END;
    0030\t004F
    END
    }

    sub Other::Class {
      <<END;
    0040\t005F
    END
    }

    sub A::B::Intersection {
      <<END;
    +main::MyUniClass
    &Other::Class
    END
    }
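To make the inclusion and intersection syntax concrete, here is a minimal, self-contained sketch modeled on the snippet above. The property names carry an `Is` prefix, which is an assumption made for this example (perlunicode asks that user-defined property names begin with `In` or `Is`), and `intersection_chars` is a hypothetical helper, not part of the Perl test file.

```perl
use strict;
use warnings;

# Hypothetical properties for illustration, named with the "Is" prefix
# that perlunicode asks for in user-defined properties.
sub IsMyUniClass {           # U+0030..U+004F: '0' through 'O'
    return "0030\t004F\n";
}

sub Other::IsClass {         # U+0040..U+005F: '@' through '_'
    return "0040\t005F\n";
}

sub A::B::IsIntersection {   # code points present in both ranges
    return "+main::IsMyUniClass\n&Other::IsClass\n";
}

# Hypothetical helper: return the characters of $text that match
# the intersection property, in order.
sub intersection_chars {
    my ($text) = @_;
    return join '', $text =~ /(\p{A::B::IsIntersection})/g;
}

# Single quotes matter here: "@AOPZ_" would be read as an array.
print intersection_chars('09@AOPZ_'), "\n";
```

Only the overlap U+0040..U+004F survives the `&` line, so of the sample characters just `@`, `A` and `O` should match.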

    I understand why the Unicode engine wants fully qualified names: given an arbitrary string, it is hard to determine which package a relative name is relative to. But Perl shouldn't exit with an internal error. It should either warn you that it interpreted all relative class names as absolute below main::, or tell you that it can't resolve them.
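Applied to the script from the question, the suggestion amounts to writing the included property with its package name. The following is a sketch under that assumption, not a confirmed fix for every Perl version; `mask_split_words` is a hypothetical wrapper added here so the substitution can be called and checked, and the subs are defined before the regex that uses them.

```perl
use strict;
use warnings;

# Separator characters: '-' (002D), ':' (003A) and '/' (002F).
sub IsMySep {
    return <<EOQ;
002D
003A
002F
EOQ
}

# ASCII letters plus the separators. The included property is
# fully qualified (main::IsMySep) instead of the bare IsMySep.
sub IsSplitWord {
    return <<EOQ;
0041\t005A
0061\t007A
+main::IsMySep
EOQ
}

# Hypothetical wrapper: replace every run of IsSplitWord
# characters with "**".
sub mask_split_words {
    my ($text) = @_;
    $text =~ s/\p{IsSplitWord}+/**/g;
    return $text;
}

print mask_split_words('split-word another:one'), "\n";
```

With the qualified name, each of the two space-separated runs in the test string should collapse to `**`, while the space, which is in neither property, is left alone.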

    Before you raise a bug for Perl, please check that you're running at least Perl 5.12.2, as that bug might have been fixed in the meantime.

      Thanks. I won't report it as a bug. It can still count as "poorly documented" :) The main thing is, it works.

        Please help the community, and do report a bug. Even if a fully qualified name is mandatory, the error message is utterly useless for the majority of Perl programmers.