(tye)Re: Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0

It turns out that the problem is related to the recently added regex feature

No, the problem is using a regex to match a literal string without telling it that this is what you wanted to do! The recent \p regex addition just made these bugs more noticeable.

If you ever see a regex with $var in it, you should suspect that \Q$var\E is what should have been used. Looking through Config.pm, I only find one (uncommented) case of this mistake, but it is probably fine since the $var in that case should only contain letters.

So are you the one coding regexen incorrectly this way? If so, the solution is quite simple. If you replace m!$rootb/(.*)/(\w+)\.p.+$!i; with m!\Q$rootb\E/(.*)/(\w+)\.p.+$!i;, for example, then that problem goes away.
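
A minimal sketch of the difference follows. The values of `$rootb` and `$path` are made up for illustration (a Win32-style root containing backslashes); only the two patterns are taken from the paragraph above. The unquoted match is wrapped in `eval` on the assumption that `\p` followed by `e` is parsed as a request for a Unicode property named "e", which is what the error in the thread title suggests.

```perl
use strict;
use warnings;

# Hypothetical values, not taken from the original script.
my $rootb = 'C:\perl\html';
my $path  = 'C:\perl\html/lib/Foo.pod';

# Unquoted: the backslashes in $rootb become regex escapes, so "\pe"
# is parsed as a Unicode property lookup. eval traps the resulting die.
my $unquoted_ok = eval { $path =~ m!$rootb/(.*)/(\w+)\.p.+$!i; 1 };
print "unquoted: ", ($unquoted_ok ? "no error" : "died: $@"), "\n";

# Quoted: \Q...\E makes every character of $rootb match literally.
my ($subdir, $name);
if ($path =~ m!\Q$rootb\E/(.*)/(\w+)\.p.+$!i) {
    ($subdir, $name) = ($1, $2);
    print "quoted: subdir=$subdir name=$name\n";
}
```

The quoted form still hard-codes `/` as the separator after `$rootb`, so it only covers the first of the two problems.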

Next you have to deal with your misconception that you can use a single character as the directory separator. I suggest you look into File::Spec instead of hand-rolled regexen for comparing paths.
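
As a rough sketch of what that could look like, the following builds and compares paths with File::Spec only. The directory and file names are hypothetical; `catdir`, `catfile`, `rootdir`, `abs2rel`, `splitpath` and `splitdir` are the standard File::Spec methods.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical directory and file names, built portably.
my $root = File::Spec->catdir(File::Spec->rootdir, 'perl', 'html');
my $file = File::Spec->catfile($root, 'lib', 'Foo.pod');

# Path of $file relative to $root, using the platform's own rules.
my $rel = File::Spec->abs2rel($file, $root);

# Split into components without assuming what the separator is.
my ($volume, $dirs, $basename) = File::Spec->splitpath($rel);
my @dirs = grep { length } File::Spec->splitdir($dirs);

print "relative: $rel\n";
print "dirs: @dirs, file: $basename\n";
```

No regex sees the path at all here, so neither the separator nor any backslash in it can be misread as pattern syntax.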

- tye (but my friends call me "Tye")


Re: (tye)Re: Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0 by Rudif (Hermit) on Mar 19, 2001 at 02:38 UTC
Thank you, tye, for the insight and advice.

I met this problem recently while working on the original version of this script, from the module Pod-Tree-1.06. Path strings containing `'\'` were passed to the script as arguments by the user (me), and the script hit the error while doing the substitution

`$dir =~ s/^$PodDir/$HTMLDir/o;`

I did not understand the problem at the time, but I found that I could fix it by inserting

`# fix up the Win32 paths`
`$PodDir =~ s( \\ )(/)gx;`
`$HTMLDir =~ s( \\ )(/)gx;`

Now you explained the nature of the problem and showed two ways to make similar scripts more robust:

* Apply quotemeta or \Q\E to a user-supplied string before using it in a regex.
* File::Spec provides platform-independent operations on path strings, notably splitting, catenating and converting absolute paths to relative, and should be used in portable scripts.

Rudif
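
For the substitution quoted above, the quotemeta route might look like the sketch below. The three path values are hypothetical stand-ins for the user-supplied arguments, and the `/o` flag is dropped because the quoted prefix is computed once up front.

```perl
use strict;
use warnings;

# Hypothetical Win32-style arguments, as a user might pass them.
my $PodDir  = 'C:\pod';
my $HTMLDir = 'C:\html';
my $dir     = 'C:\pod\perlfunc';

# quotemeta escapes the backslash and the colon so the prefix matches literally.
my $prefix = quotemeta $PodDir;

# The replacement side is interpolated as a plain string, so $HTMLDir
# needs no quoting there.
(my $html_dir = $dir) =~ s/^$prefix/$HTMLDir/;
print "$html_dir\n";
```

Unlike the slash-conversion workaround, this leaves the paths in the form the user supplied.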
Re^2: Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0 by rongoral (Beadle) on Nov 23, 2004 at 13:51 UTC
Thank you for this answer, Tye. I ran into this issue as well, except the error message contained "main-a or a.pl at unicode/Is/a.pl line 546". The \Q\E character combination did the deal for me as well.
Re^2: Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0 by Anonymous Monk on Feb 18, 2009 at 16:38 UTC
tye,

I've run into this same item in unum.pl, which I'm trying to run on an n810.

    $ unum c=`cat /home/user/hebrew.utf8`
    Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
    $ sed -ne '194p' /home/user/bin/unum.pl
    if ($n =~ m/^\d/) {

I'm not familiar enough with PERL syntax to see how to apply the \Q\E. The line giving the error is:

    Line 194: if ($n =~ m/^\d/) {

It doesn't seem right to me to change $n to \Q$n\E because it is not in the m// expression. At the same time, I don't see a $var in the m// expression.

I'm feeling rather dense, as I look at this. What is the proper syntax for this line?

John
Re^3: Can't find unicode character property definition at line $X (elsif) by tye (Sage) on Feb 18, 2009 at 16:54 UTC
The line giving the error is: `Line 194: if ($n =~ m/^\d/) {`

No, that line can't be to blame. Which means the problem is actually on a subsequent `elsif` line (that is an annoying quirk of how Perl reports line numbers for error messages -- an error in the conditional expression of an `elsif` is always reported as happening on the line of the corresponding `if` expression instead).

If you don't see how to fix that line, then reply back.

- tye
Re^4: Can't find unicode character property definition at line $X (elsif) by Anonymous Monk on Feb 18, 2009 at 18:14 UTC
tye,

Thank you for your quick response. I've tried to replace all of the $var's as you suggest, and I'm still getting the same error on the same line. The file unum.pl is from http://www.fourmilab.ch/webtools/unum/

Here is the log showing the changes I tried.

    $ grep 'm\/.*\$.*\/' unum.pl
    if ($k->[2] =~ m/$cpat/) {
    if ($XHTML_ENTITIES{$k} =~ m/$cpat/) {
    if ($UNICODE_NAMES{$k} =~ m/$cpat/) {
    $ grep cpat unum.pl
    my $cpat = qr/$2/i;
    if ($k->[2] =~ m/$cpat/) {
    my $cpat = qr/$1/i;
    if ($XHTML_ENTITIES{$k} =~ m/$cpat/) {
    my $cpat = qr/$1/i;
    if ($UNICODE_NAMES{$k} =~ m/$cpat/) {
    $ nano unum.pl
    $ grep 'm\/.*\$.*\/' unum.pl
    if ($k->[2] =~ m/\Q$cpat\E/) {
    if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
    if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
    $ grep '\/.*\$.*\/' unum.pl
    my $cpat = qr/$2/i;
    if ($k->[2] =~ m/\Q$cpat\E/) {
    my $cpat = qr/$1/i;
    if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
    my $cpat = qr/$1/i;
    if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
    $n =~ s/^c?=(.+)$/$1/i;
    $ unum c=hello
    Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
    $ nano unum.pl
    $ unum c=hello
    Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
    $ nano unum.pl
    $ unum c=hello
    Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
    $ grep '\/.*\$.*\/' unum.pl
    my $cpat = qr/\Q$2\E/i;
    if ($k->[2] =~ m/\Q$cpat\E/) {
    my $cpat = qr/\Q$1\E/i;
    if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
    my $cpat = qr/\Q$1\E/i;
    if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
    $n =~ s/^c?=(.+)$/\Q$1\E/i;

Here are the "elsif" lines, they don't look guilty to me:

    $ cat unum.pl | grep 'if '
    if ($#ARGV < 0) {
    if ($n =~ m/^\d/) {
    $n = oct($n) if ($n =~ m/^0/);
    } elsif ($n =~ m/^(b|l)=(.+)/) {
    if ($k->[2] =~ m/\Q$cpat\E/) {
    if (!$blocktitle) {
    if ($listall) {
    } elsif ($n =~ m/^h=(.+)/) {
    if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
    } elsif ($n =~ m/^n=(.+)/) {
    if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
    } elsif ($n =~ m/^&#/) {
    if (!defined($u)) {
    if (defined($u)) {
    if (!$chartitle) {
    if ($code >= 0x4E00) {
    if ($code <= 0x9FFF || ($code >= 0xF900 && $code <= 0xFAFF)) {
    } elsif ($code >= 0xD800 && $code <= 0xF8FF) {
    if ($code <= 0xDFFF) {
    } elsif ($code >= 0xAC00 && $code <= 0xD7A3) {
    pop(@s) if $t == 0x11A7;
    return $block->[2] if $block->[0] <= $code && $block->[1] >= $code;

john
Re^5: Can't find unicode character property definition at line $X (elsif) by tye (Sage) on Feb 18, 2009 at 18:40 UTC
Re^6: Can't find unicode character property definition at line $X (elsif) by Anonymous Monk on Feb 18, 2009 at 23:18 UTC