Re: XML::Twig handlers weirdness?

by benizi (Hermit)
on Feb 21, 2006 at 18:59 UTC


in reply to XML::Twig handlers weirdness?

I'd call this a perl bug. The following also demonstrates the odd behavior. Interestingly, the result of substr(lc($str),0) in the second iteration is truncated to the length of the (correct) result of substr(lc($str),0) in the first iteration.

E.g., with an argument of 'a:bc', the 'bc' is cut down to 'b' (the length of 'a'). For 'ab:cde' or 'ab:cdefgh', the 'cde' and 'cdefgh' are cut down to 'cd' (the length of 'ab').

#!/usr/bin/perl -l
use strict;
use warnings;
use Encode qw/_utf8_on/;
for my $str (split /:/, shift||'a:bc') {
    _utf8_on($str);
    print "$str\t", substr(lc($str), 0);
    # use Devel::Peek; Dump substr(lc($str),0);
}

For someone familiar w/ perlguts (not me), uncomment the Devel::Peek line.
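Short of reading a Devel::Peek dump, the UTF-8 flag that _utf8_on sets can be checked from plain Perl. The following is only a minimal sketch: flag_state is a hypothetical helper name, and it assumes a perl where the built-in utf8::is_utf8 is available (5.8.1 or later). It shows that a pure-ASCII string can carry the flag and still compare equal to an unflagged copy, which is the state the strings in the demo are in.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw/_utf8_on/;

# Hypothetical helper: report whether a scalar carries the internal UTF-8 flag.
sub flag_state {
    my ($s) = @_;
    return utf8::is_utf8($s) ? 'flagged' : 'unflagged';
}

my $plain   = 'abc';
my $flagged = 'abc';
_utf8_on($flagged);    # same bytes, but now marked as UTF-8 internally

print "plain:   ", flag_state($plain),   "\n";
print "flagged: ", flag_state($flagged), "\n";
print "equal:   ", ($plain eq $flagged ? 'yes' : 'no'), "\n";
```

The flag changes how perl treats the scalar internally, not what the string contains, so any difference in behavior between the two copies points at the interpreter rather than at the data.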

UPDATE: Expected output for input of x:yz is:

x	x
yz	yz

but due to bugginess, it's:

x	x
yz	y

Also, the problem shows up in v5.8.7 on Linux, but not in v5.8.0 on Solaris, if those are helpful data points.
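To compare more perl builds than those two, the demo can be turned into a self-checking script. This is a rough sketch rather than a definitive test: find_truncations is a hypothetical helper name, and it assumes the input pieces are lowercase ASCII, so that lc() should return each piece unchanged and any difference counts as the truncation described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw/_utf8_on/;

# Hypothetical helper: returns [input, got] pairs for every piece where
# substr(lc($str), 0) differs from the input itself.
# Assumption: pieces are lowercase ASCII, so lc() should be a no-op.
sub find_truncations {
    my ($arg) = @_;
    my @bad;
    for my $str (split /:/, $arg) {
        _utf8_on($str);                     # flag the string, as in the demo
        my $got = substr(lc($str), 0);      # the expression that misbehaves
        push @bad, [$str, $got] if $got ne $str;
    }
    return @bad;
}

my @bad = find_truncations(shift || 'a:bc');
printf "'%s' came back as '%s'\n", @$_ for @bad;
print @bad ? "affected\n" : "not affected\n";
print "perl version: $]\n";
```

Run it with the same colon-separated argument as the demo (for example x:yz) on each perl to be compared.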

Re^2: XML::Twig handlers weirdness?
by mirod (Canon) on Feb 21, 2006 at 20:14 UTC

    Nice!

    So indeed it looks like something linked to Unicode. The strings that compose the path in XML::Twig come directly from XML::Parser, so they have been utf-8'ed somewhere in expat or XML::Parser, hence the bug rears its ugly head. It's weird to get problems with basic ASCII characters though.

    Incidentally, 5.8.0 and 5.8.1-8 are fairly different in their Unicode support, so I am not surprised that they behave differently.

    In any case, I think I'm off the hook for this one, so thanks! :--)
