I'm going to seriously simplify your question. Here's the code:

    #!/usr/bin/perl
    use strict;
    use XML::Twig;
    binmode(STDOUT, ":utf8");
    my $t= XML::Twig->new();
    $t->set_keep_encoding;
    $t->parse(do { local $/; <DATA>});
    $t->flush;
    exit 0;
    __END__
    <?xml version="1.0" encoding="UTF-8"?>
    <harvest>
    <subject>Computation &amp; Language</subject>
    <subject>Computer Science - Computation &amp; Language</subject>
    </harvest>

And here's the output:

    <?xml version="1.0" encoding="UTF-8"?>
    <harvest><subject>Computation & Language</subject><subject>Computer Science - Computation & Language</subject></harvest>

And you want to change the &'s to &amp;'s. The solution seems to be to remove the call to set_keep_encoding. When I remove that, the output becomes what you want. Whether that's a bug in the keep-encoding or the flush or whatever, I don't know. Hopefully mirod can help here ;-)
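For reference, here is a minimal sketch of the same test script with the set_keep_encoding call dropped. It assumes the input document carries the ampersands as &amp; (it has to, or the parse would fail), and the comment about re-escaping is my reading of the behaviour rather than something I've checked in the XML::Twig source. The added "use warnings" is my own habit and not part of the fix.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

binmode(STDOUT, ":utf8");

my $t = XML::Twig->new();

# Note: no $t->set_keep_encoding here. Without it, XML::Twig appears to
# re-escape special characters such as & when it writes the text out.
$t->parse(do { local $/; <DATA> });
$t->flush;

__END__
<?xml version="1.0" encoding="UTF-8"?>
<harvest>
<subject>Computation &amp; Language</subject>
<subject>Computer Science - Computation &amp; Language</subject>
</harvest>
```

If you do need keep_encoding for other reasons (non-UTF-8 input, say), this won't help, and you'd want mirod's take on it.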
Update: It appears I was a few minutes behind mirod on this. Oops. :-)
In reply to Re: XML::Twig::flush() and html/xml entities
by Tanktalus
in thread XML::Twig::flush() and html/xml entities
by mandarin