in reply to Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0

It turns out that the problem is related to the recently added regex feature

No, the problem is using a regex to match a literal string without telling it that this is what you wanted to do! The recent \p regex addition just made these bugs more noticeable.

If you ever see a regex with $var in it, you should suspect that \Q$var\E is what should have been used. Looking through Config.pm, I only find one (uncommented) case of this mistake, but it is probably fine since the $var in that case should only contain letters.

So are you the one coding regexen incorrectly this way? If so, the solution is quite simple. If you replace m!$rootb/(.*)/(\w+)\.p.+$!i; with m!\Q$rootb\E/(.*)/(\w+)\.p.+$!i;, for example, then that problem goes away.
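To make the difference concrete, here is a minimal sketch. The path in $rootb and the file name are made up for illustration and are not from the original script. The unquoted match is wrapped in eval because, depending on the Perl version, interpolating a backslash-laden path either dies (the "\pe" in "\perl" is read as a Unicode property) or simply fails to match.

```perl
use strict;
use warnings;

# Hypothetical Windows-style root; "\p" followed by "e" is what the regex
# engine tries to interpret as a Unicode property lookup.
my $rootb = 'C:\perl\site';
my $file  = 'C:\perl\site/lib/Foo.pm';

# Unquoted: the backslashes in $rootb are treated as regex escapes.
# eval returns undef if compiling or running the regex dies.
my $unquoted = eval { $file =~ m!$rootb/(.*)/(\w+)\.p.+$!i ? 1 : 0 };
print "unquoted: ", (defined $unquoted ? "no match\n" : "died: $@");

# Quoted: \Q...\E escapes every metacharacter, so $rootb matches literally.
my @parts = $file =~ m!\Q$rootb\E/(.*)/(\w+)\.p.+$!i;
print "quoted:   dir=$parts[0] name=$parts[1]\n";
```

quotemeta($rootb) produces the same escaped string as \Q$rootb\E if you would rather prepare the pattern in a separate step.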

Next you have to deal with your misconception that you can use a single character as the directory separator. I suggest you look into File::Spec instead of hand-rolled regexen for comparing paths.
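As a rough sketch of what that could look like, the following moves a file from under one root directory to another without any regex at all. The helper name reroot and all of the paths are hypothetical. File::Spec::Unix and File::Spec::Win32 are called explicitly only so the example behaves the same on any platform; a real script would normally just call File::Spec and get the rules for the current OS.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

# Hypothetical helper: move $path from under $from_root to under $to_root,
# using the path rules of the given File::Spec class.
sub reroot {
    my ($spec, $path, $from_root, $to_root) = @_;
    my $rel = $spec->abs2rel($path, $from_root);   # path relative to old root
    return $spec->catfile($to_root, $rel);         # re-attach under new root
}

my $unix = reroot('File::Spec::Unix',
                  '/site/pod/Pod/Tree.pod', '/site/pod', '/site/html');

# Note the mixed separators: Win32 accepts both "/" and "\".
my $win  = reroot('File::Spec::Win32',
                  'C:\site\pod\Pod\Tree.pod', 'C:/site/pod', 'C:\site\html');

print "$unix\n$win\n";
```

Because the Win32 rules accept either separator, the mixed "/" and "\" in the second call is handled without any manual fix-up.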

        - tye (but my friends call me "Tye")

Re: (tye)Re: Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0
by Rudif (Hermit) on Mar 19, 2001 at 02:38 UTC
    Thank you, tye, for the insight and advice.

    I met this problem recently while working on the original version of this script, from the module Pod-Tree-1.06. Path strings containing '\' were passed to the script as arguments by the user (me), and the script hit the error while doing the substitution
    $dir =~ s/^$PodDir/$HTMLDir/o;
    I did not understand the problem at the time, but I found that I could fix it by inserting
        # fix up the Win32 paths
        $PodDir  =~ s( \\ )(/)gx;
        $HTMLDir =~ s( \\ )(/)gx;
    Now you explained the nature of the problem and showed two ways to make similar scripts more robust:
    • Apply quotemeta or \Q\E to a user-supplied string before using it in a regex.
    • File::Spec provides platform-independent operations on path strings, notably splitting, catenating and converting absolute paths to relative, and should be used in portable scripts.
    Rudif

Re^2: Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0
by rongoral (Beadle) on Nov 23, 2004 at 13:51 UTC
    Thank you for this answer, Tye. I ran into this issue as well, except the error message contained "main-a or a.pl at unicode/Is/a.pl line 546". The \Q\E character combination did the trick for me as well.
Re^2: Can't find unicode character property definition via main-e or e.pl at unicode/Is/e.pl line 0
by Anonymous Monk on Feb 18, 2009 at 16:38 UTC
    tye, I've run into this same item in unum.pl, which I'm trying to run on an n810.

        $ unum c=`cat /home/user/hebrew.utf8`
        Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
        $ sed -ne '194p' /home/user/bin/unum.pl
        if ($n =~ m/^\d/) {

    I'm not familiar enough with PERL syntax to see how to apply the \Q\E.

    The line giving the error is:
    Line 194: if ($n =~ m/^\d/) {

    It doesn't seem right to me to change $n to \Q$n\E because it is not in the m// expression.
    At the same time, I don't see a $var in the m// expression.

    I'm feeling rather dense, as I look at this. What is the proper syntax for this line?

    John
      The line giving the error is:
      Line 194: if ($n =~ m/^\d/) {

      No, that line can't be to blame. Which means the problem is actually on a subsequent elsif line (that is an annoying quirk of how Perl reports line numbers for error messages -- an error in the conditional expression of an elsif is always reported as happening on the line of the corresponding if expression instead). If you don't see how to fix that line, then reply back.

      - tye        

        tye,

        Thank you for your quick response.
        I've tried to replace all of the $var's as you suggest, and I'm still getting the same error on the same line.

        The file unum.pl is from http://www.fourmilab.ch/webtools/unum/
        Here is the log showing the changes I tried.
        $ grep 'm\/.*\$.*\/' unum.pl
        if ($k->[2] =~ m/$cpat/) {
        if ($XHTML_ENTITIES{$k} =~ m/$cpat/) {
        if ($UNICODE_NAMES{$k} =~ m/$cpat/) {
        $ grep cpat unum.pl
        my $cpat = qr/$2/i;
        if ($k->[2] =~ m/$cpat/) {
        my $cpat = qr/$1/i;
        if ($XHTML_ENTITIES{$k} =~ m/$cpat/) {
        my $cpat = qr/$1/i;
        if ($UNICODE_NAMES{$k} =~ m/$cpat/) {
        $ nano unum.pl
        $ grep 'm\/.*\$.*\/' unum.pl
        if ($k->[2] =~ m/\Q$cpat\E/) {
        if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
        if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
        $ grep '\/.*\$.*\/' unum.pl
        my $cpat = qr/$2/i;
        if ($k->[2] =~ m/\Q$cpat\E/) {
        my $cpat = qr/$1/i;
        if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
        my $cpat = qr/$1/i;
        if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
        $n =~ s/^c?=(.+)$/$1/i;
        $ unum c=hello
        Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
        $ nano unum.pl
        $ unum c=hello
        Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
        $ nano unum.pl
        $ unum c=hello
        Can't find unicode character property definition via main->IsDigit or Is/Digit.pl at /home/user/bin/unum.pl line 194
        $ grep '\/.*\$.*\/' unum.pl
        my $cpat = qr/\Q$2\E/i;
        if ($k->[2] =~ m/\Q$cpat\E/) {
        my $cpat = qr/\Q$1\E/i;
        if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
        my $cpat = qr/\Q$1\E/i;
        if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
        $n =~ s/^c?=(.+)$/\Q$1\E/i;
        Here are the "elsif" lines, they don't look guilty to me:
        $ cat unum.pl | grep 'if '
        if ($#ARGV < 0) {
        if ($n =~ m/^\d/) {
        $n = oct($n) if ($n =~ m/^0/);
        } elsif ($n =~ m/^(b|l)=(.+)/) {
        if ($k->[2] =~ m/\Q$cpat\E/) {
        if (!$blocktitle) {
        if ($listall) {
        } elsif ($n =~ m/^h=(.+)/) {
        if ($XHTML_ENTITIES{$k} =~ m/\Q$cpat\E/) {
        } elsif ($n =~ m/^n=(.+)/) {
        if ($UNICODE_NAMES{$k} =~ m/\Q$cpat\E/) {
        } elsif ($n =~ m/^&#/) {
        if (!defined($u)) {
        if (defined($u)) {
        if (!$chartitle) {
        if ($code >= 0x4E00) {
        if ($code <= 0x9FFF || ($code >= 0xF900 && $code <= 0xFAFF)) {
        } elsif ($code >= 0xD800 && $code <= 0xF8FF) {
        if ($code <= 0xDFFF) {
        } elsif ($code >= 0xAC00 && $code <= 0xD7A3) {
        pop(@s) if $t == 0x11A7;
        return $block->[2] if $block->[0] <= $code && $block->[1] >= $code;
        john