Johnny Five Toes has asked for the wisdom of the Perl Monks concerning the following question:

I have a line of code which appears to cause my unassigned utf8 program to hang:
    $filename = "myfile";
    print "\nThis program uses a real expression found on http://www.regular-expressions.info/unicode.html#category";
    open( my $fh, '< :encoding(UTF-8)', $filename ) or die "Cannot open $filename: $!";
    while ( my $line = <$fh> ) {
        if ( $line =~ /\p{Unassigned}+/ ) {
            print "$line\n";
        }
    }
    print "\nThis program uses a real expression found on http://www.regular-expressions.info/unicode.html#category";
    close $fh;

The line "  if ( $line =~ /\p{Unassigned}+/ ) {" seems to be the culprit. When I replace the category expression with any other /\p expression the program runs fine. Is there extra code needed for the unassigned category?

Remember you are working with a newbie. Any help appreciated.

John

Replies are listed 'Best First'.
Re: unassigned utf8
by ikegami (Patriarch) on Jun 11, 2015 at 18:39 UTC
    Some of the character classes are not built into the perl binary but are stored in external files (created when perl is installed). Maybe there's a problem loading these files? Running the test under strace might help debug the problem.
    strace perl -e'" " =~ /\p{Unassigned}/'
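
If strace is not available, the same isolation idea can be tried in a small standalone script: run the match against a few known characters, with no file I/O involved, and see whether it returns at all. This is only a sketch, not a diagnosis. The helper name `is_unassigned` is hypothetical, and U+0378 is assumed to be an unassigned code point in the Unicode version bundled with the perl under test.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the given code point matches
# \p{Unassigned}, else 0.
sub is_unassigned {
    my ($code_point) = @_;
    return chr($code_point) =~ /\p{Unassigned}/ ? 1 : 0;
}

# Sample code points: a space, the letter A, and U+0378, which is
# assumed to be unassigned in the Unicode version bundled with this perl.
for my $code_point ( 0x0020, 0x0041, 0x0378 ) {
    printf "U+%04X %s\n", $code_point,
        is_unassigned($code_point) ? "unassigned" : "assigned";
}
```

If this script finishes promptly, the property itself loads and matches, and the hang more likely lies in the input file or in how it is read. If it stalls, that points back at loading the property data.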
Re: unassigned utf8
by stevieb (Canon) on Jun 11, 2015 at 17:01 UTC

    Welcome to PerlMonks!

    Please put your code, input data and output within <code>code goes here</code> tags. It makes it much easier to read and test your code, and your question will get much more attention.

    Cheers,

    -stevieb

    EDIT: for the time being, here's the OP's original post:

    Begin OP...

    I have a line of code which appears to cause my unassigned utf8 program to hang. The line if ( $line =~ /\p{Unassigned}+/ ) { seems to be the culprit. When I replace the category expression with any other /\p expression the program runs fine. Is there extra code needed for the unassigned category? Remember you are working with a newbie. Any help appreciated. John

    $filename = "myfile";
    print "\nThis program uses a real expression found on http://www.regular-expressions.info/unicode.html#category";
    open( my $fh, '< :encoding(UTF-8)', $filename ) or die "Cannot open $filename: $!";
    while ( my $line = <$fh> ) {
        if ( $line =~ /\p{Unassigned}+/ ) {
            print "$line\n";
        }
    }
    print "\nThis program uses a real expression found on http://www.regular-expressions.info/unicode.html#category";
    close $fh;