I have written a pretty straightforward XML::Twig-based script, but I am having a problem with the XML it outputs. The XML comes in with the following declarations:
<!DOCTYPE ead PUBLIC "+//ISBN 1-931666-00-8//DTD ead.dtd (Encoded Archival Description (EAD) Version 2002)//EN" "../ead_dtd/ead.dtd" [
<!ENTITY scrc_name SYSTEM "scrc_name.xml">
<!ENTITY su_address SYSTEM "su_address.xml">
<!ENTITY su_name SYSTEM "su_name.xml">
<!ENTITY subjindex SYSTEM "subjindex.xml">
<!ENTITY summitref SYSTEM "summit_ref.xml">
]>
But the result looks like:
<!DOCTYPE ead PUBLIC "+//ISBN 1-931666-00-8//DTD ead.dtd (Encoded Archival Description (EAD) Version 2002)//EN" "../ead_dtd/ead.dtd" [
<!ENTITY scrc_name "Special Collections Research Center">
<!ENTITY su_address '<address> <addressline>123 Elm St.<lb/></addressline> <addressline>Columbus, Oh 43021<lb/></addressline> </address> '>
<!ENTITY su_name ... etc.
I have tried every combination of XML::Twig and XML::Parser options I can come up with, but haven't really gotten anywhere. The closest I've come is to switch to twig roots, with a combination of twig_print_outside_roots and keep_encoding, but that causes the parser to bail out when it encounters an entity reference. The declarations do print out ~almost~ right, though: they are only missing the square brackets.
I am running ActivePerl 5.8.8 on Windows with XML::Twig 1.26 and XML::Parser 2.34-r1.
Here is the code, with my most recently failed attempts at specifying options:
###
my $twig_handlers = {'ead/archdesc/did/unitdate' => \&cont_break};
my $twig = new XML::Twig(TwigHandlers => $twig_handlers,
                         expand_external_ents => 0,
                         NoExpand => 1,
                         ExpandExternalEnts => 0,
                         ParseParamEnt => 0);
$twig->parsefile($xmlfile,
                 NoExpand => 1,
                 ExpandExternalEnts => 0,
                 ParseParamEnt => 0,
                 expandEntityReferences => 0,
                 SkipExternalDTD => 1);
select XMLOUTPUT;
$twig->print; # re-output the XML, with the normalized dates
close XMLOUTPUT;
###
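For anyone who wants to try this against their own XML::Twig and Expat versions, here is a minimal self-contained sketch. The inline document, the entity value, and the body of cont_break are made up for illustration and are not the real finding aid or the real handler; only the handler path comes from the script above. It prints the whole document back out, so the DOCTYPE block can be compared by eye with the input.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical stand-in document: one internal entity declared in the
# DTD subset, and one unitdate element for the handler to touch.
my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE ead [
  <!ENTITY su_name "Example University">
]>
<ead><archdesc><did><unitdate>1900 - 1950</unitdate></did></archdesc></ead>
XML

my $twig = XML::Twig->new(
    twig_handlers => { 'ead/archdesc/did/unitdate' => \&cont_break },
);
$twig->parse($xml);

# Serialize the whole document, prolog and DOCTYPE included.
my $out = $twig->sprint;
print $out;

# Illustrative handler only: collapses the spaces around the hyphen
# in a date range. The real cont_break is not shown in this post.
sub cont_break {
    my ($t, $unitdate) = @_;
    my $text = $unitdate->text;
    $text =~ s/\s*-\s*/-/;
    $unitdate->set_text($text);
}
```

Swapping the internal entity for SYSTEM declarations like the ones at the top of this post should show whether a given setup rewrites them on output.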
I really appreciate any help that anyone could lend.
UPDATE: I just tried this script with the development version of XML::Twig, 3.3.0, and it partially fixes the problem. The entity declarations are no longer being expanded, but the entity references still are. It seems like this is an XML::Parser issue, but I can't seem to come up with the right combination of options...
UPDATE 2: I was able to get everything working perfectly on Linux by upgrading XML::Twig and also moving to the Expat 2.0.1 libraries. Unfortunately, I am having a really hard time figuring out how to get the Windows Expat libraries working with XML::Parser. When I just swap in the new DLLs, I get an error about not finding the boot_XML__Parser__Expat symbol in Expat.dll.
UPDATE 3: mirod fixed this issue in the current development (3.30) version of XML::Twig. Thanks!