AidanLee has asked for the wisdom of the Perl Monks concerning the following question:

I have written a few modules that use XML::Parser. I also want them to use utf8; so that my code will be Unicode friendly (for at least one of these it is necessary, as it stores and retrieves translation tables). But for some reason one of my handlers for the XML parser seems to be giving the utf8 code issues:

Can't locate object method "IsSpace" via package "main" at C:/Perl/lib/utf8_heavy.pl line 30.

I traced it back to where it's getting invoked from my code, and it's the last line of my character handler:

sub _parseChar {
    my( $parser, $string ) = @_;
    if( $state_stack[-1] =~ /dataType|label|op|sep/ ) {
        $value .= $string;
    }
    else {
        csWarn(
            class    => csErrorClass('CODE_PARAMS'),
            severity => csErrorScale('MODERATE'),
            message  => 'illegal syntax in Resource File',
            debug    => "illegal syntax: bare string '$string' out of context"
        ) and return undef unless $string =~ /\s+/;    # <=== here
    }
}
I am not sure why this is happening... any ideas?

Replies are listed 'Best First'.
(tye)Re: XML::Parser and the utf8 Pragma
by tye (Sage) on May 11, 2001 at 01:59 UTC

    If a $SIG{__DIE__} handler [yuck, ptuey] is reporting something that isn't showing elsewhere, then I'll bet you a quadword that your handler isn't checking $^S and is reporting errors that someone is dealing with using eval and so they shouldn't be considered a problem.

    Just fix your $SIG{__DIE__} handler, or (much better), replace it with an eval {...} around the heart of your script followed by logging $@.
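Both suggestions can be sketched roughly as follows. This is a minimal illustration, not the original poster's code: the `@log` array stands in for a real log file, and `main_work` is a hypothetical name for the heart of the script.

```perl
use strict;
use warnings;

my @log;    # stand-in for a real log file (hypothetical)

# Option 1: keep the handler, but stay silent while an eval is active.
# $^S is true inside an eval and undefined while code is still compiling.
$SIG{__DIE__} = sub {
    return if !defined $^S || $^S;
    push @log, "handler: $_[0]";
};

# Someone else traps this error, so the handler must not report it.
my $trapped = eval { die "recoverable\n"; 1 } ? '' : $@;

# Option 2: drop the handler and wrap the heart of the script in eval.
$SIG{__DIE__} = 'DEFAULT';
sub main_work { die "parser gave up\n" }    # hypothetical main routine
unless ( eval { main_work(); 1 } ) {
    push @log, "eval: $@";
}
```

With option 1 the trapped error never reaches the log; with option 2 only errors that actually escape the main routine are logged.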

            - tye (but my friends call me "Tye")
      I'll look into it, thanks. Why the issues with sig handlers? My intention, by the way, is to avoid using "die" statements as much as possible. The die and warn handlers I've installed for this set of scripts are there to catch anything I hadn't thought to catch, or that is out of my control (like when the XML parser craps out on you... not happy there isn't a hook into its error handling abilities). The idea for the former, of course, is to go in and take care of the exceptions.

        Well, real signal handlers have their own problems that I won't go into here. One problem with both __WARN__ and __DIE__ handlers is that they are single-slot globals so if you want to use them then you had better hope that none of the modules you use had a use for them either or you are going to "bump heads".
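One common way to soften the "single slot" problem is to remember whatever handler was already installed and call it from yours. A small sketch, assuming a hypothetical module has installed its own __WARN__ handler first (the `@seen` array is only there to show the order of calls):

```perl
use strict;
use warnings;

my @seen;

# Pretend some module installed its own handler first (hypothetical).
$SIG{__WARN__} = sub { push @seen, "module: $_[0]" };

# Install ours without clobbering it: remember the old one and chain to it.
my $previous = $SIG{__WARN__};
$SIG{__WARN__} = sub {
    push @seen, "ours: $_[0]";
    $previous->(@_) if ref $previous eq 'CODE';
};

warn "disk almost full\n";
```

This only helps if every party chains politely; a module that simply overwrites the slot later still wins.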

        Other than the above, __WARN__ handlers don't bother me and are an okay way to catch warnings. If you have a new enough Perl, then I think you can instead use the warnings pragma (use warnings).

        If you just want your warnings and errors to end up in a log file, then I'd simply redirect STDERR and avoid all this fancy stuff!
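A rough sketch of the redirect, assuming a reasonably modern Perl (three-argument open with dup modes) and using a temporary file as a hypothetical log location:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical log location; a real script would use a fixed path.
my ( $tmp_fh, $log_path ) = tempfile();
close $tmp_fh;

# Keep a copy of the original STDERR so it can be restored later.
open my $saved_err, '>&', \*STDERR or die "can't dup STDERR: $!";

# From here on, warnings and die messages are appended to the log file.
open STDERR, '>>', $log_path or die "can't redirect STDERR: $!";

warn "something odd happened\n";

# Restore the original STDERR.
close STDERR;
open STDERR, '>&', $saved_err or die "can't restore STDERR: $!";
```

No handlers are involved, so nothing here can collide with a module that sets $SIG{__WARN__} or $SIG{__DIE__} itself.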

        __DIE__ handlers, on the other hand, have several design problems, and the alternative, eval and $@, doesn't have these problems, so I always encourage the use of eval {...} in place of the __DIE__ handler. __DIE__ handlers also look deceptively easy, so people often choose them over eval.

        Some of the problems with __DIE__ handlers: They get called even inside eval, so you run the risk of reporting errors that you shouldn't or of making non-fatal non-errors into fatal errors. Also, if you use eval (instead of a handler), you don't have these problems and you also don't have the problem of "bumping heads" with other modules that want to use eval.
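The first problem is easy to reproduce. In this sketch a naive handler records every die, and a library-style probe (hypothetical, but similar in spirit to a module trying an optional method inside eval) triggers it even though the error is handled:

```perl
use strict;
use warnings;

my @reports;

# Naive handler: records every die, whether or not an eval will catch it.
$SIG{__DIE__} = sub { push @reports, $_[0] };

# Library-style probe: try an optional method, fall back quietly on failure.
my $result = eval { main->NoSuchMethod(); 1 } ? 'found' : 'fallback';
```

The probe falls back as intended and the program carries on, yet the handler has already recorded a "Can't locate object method" message, much like the one that showed up in the original poster's log.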

        Also, eval "..." has a (partially deserved) reputation for being "slow" so people are often shy to use eval {...}.

                - tye (but my friends call me "Tye")
Re: XML::Parser and the utf8 Pragma
by AidanLee (Chaplain) on May 10, 2001 at 21:09 UTC

    a bit more info. when running this test script under -w

    use strict;
    use utf8;
    use XML::Parser;
    my $string = '';
    $string =~ /\s+/;
    1;
    it doesn't complain, but if I feed it through my error handlers (overriding $SIG{__DIE__}), the error appears in my log files.
Re: XML::Parser and the utf8 Pragma
by mirod (Canon) on May 10, 2001 at 23:30 UTC

    From what I got from p5p, unicode support is still not quite stable in Perl. Which version are you using? 5.6.0? It might be worth trying 5.6.1 and maybe 5.7.0.

      that is definitely a possibility. I am aware of unicode's unstable nature from the p5p digest myself. I'm running a new-ish version of ActiveState.

      Does anyone think this would be worth posting to the appropriate authorities as a bug?