in reply to Re: Problems with string concatenation and encodings
in thread Problems with string concatenation and encodings

Hi Mirod,

Thanks for answering my post. To start, here is what I get when I run your 'ts' command:

This is perl, v5.6.1 built for i686-linux
XML::Twig: 2.02
XML::Parser: 2.30
bash: xmlwf: command not found

So maybe this is a problem which only occurs with older versions of Twig? Is there anything I have to change in my code if I upgrade to the newest version?

I didn't mention Twig in the title of the post because I thought this was a problem with concatenating strings that have different encodings, not something specific to Twig. I thought the source of the problem was probably that the attribute values were in UTF-8 while the text was in the original encoding (Latin-1), and the checks I did seemed to confirm this. I had had similar experiences with XML::LibXML, so I assumed this was probably normal behavior.

Still, my question remains: is there any way to find out the encoding of a (Perl) string? That would help me a lot in avoiding similar problems in the future.

Thanks,

pike

Re: Re: Re: Problems with string concatenation and encodings
by mirod (Canon) on Dec 13, 2002 at 12:17 UTC

    Whaouh! XML::Twig 2.02 is pretty old! You should definitely upgrade, provided you don't (ab)use tricks like including mark-up in the text of elements (see the Changes file). If you use the keep_encoding option, the new version will give you the attributes in the original encoding, not in UTF-8.
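
    A minimal sketch of that setup, assuming a recent XML::Twig is installed. The sample document and the variable names are made up for illustration; keep_encoding is the only Twig option involved. The idea is that the attribute value and the element text should then arrive in the same (original) encoding, so concatenating them is safe.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical Latin-1 document: \xe9 is e-acute as a single Latin-1 byte.
my $xml = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n}
        . qq{<doc name="caf\xe9">th\xe9</doc>};

# keep_encoding asks Twig to hand data back in the document's own encoding
# instead of converting it to UTF-8.
my $twig = XML::Twig->new( keep_encoding => 1 );
$twig->parse($xml);

my $root   = $twig->root;
my $joined = $root->att('name') . ' ' . $root->text;

# Dump the result in hex so the check does not depend on the terminal.
print join( ' ', map { sprintf '%02x', ord } split //, $joined ), "\n";
```

    If the hex dump shows a lone e9 for each accented character, both pieces are still in Latin-1.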

    As for guessing the encoding of a string, you can try Encode::Guess, but it might not work with 5.6.1 (it is part of the 5.8.0 core) and, as stated by the author, "Use this module with care."
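
    A rough sketch of how Encode::Guess can be used, assuming Perl 5.8.0 or later. The helper guess_name and the sample strings are hypothetical, made up for this example. It also shows why care is needed: every valid UTF-8 byte sequence is also a valid Latin-1 one, so once Latin-1 is added as a suspect the guess can come back ambiguous.

```perl
use strict;
use warnings;
use Encode::Guess;    # default suspects are ascii and utf8

# Hypothetical helper: returns the name of the guessed encoding,
# or undef when the guess fails or is ambiguous.
sub guess_name {
    my ( $bytes, @suspects ) = @_;

    # guess_encoding returns an encoding object on success
    # and a plain error string otherwise.
    my $enc = guess_encoding( $bytes, @suspects );
    return ref $enc ? $enc->name : undef;
}

# Plain ASCII, UTF-8 bytes, and Latin-1 bytes, with Latin-1 as extra suspect.
for my $sample ( "plain text", "caf\xc3\xa9", "caf\xe9" ) {
    my $name = guess_name( $sample, 'latin1' );
    my $hex  = join ' ', map { sprintf '%02x', ord } split //, $sample;
    printf "%-12s %s\n", ( defined $name ? $name : 'unknown' ), $hex;
}
```

    Note that this only guesses from the bytes; it cannot tell you what encoding the author of the data actually intended.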