in reply to XML Cleanup

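There are many ways to do this, but the one I usually prefer takes advantage of XML external entities. Instead of the file itself, what I parse is <!DOCTYPE doc [<!ENTITY real_doc SYSTEM "$doc_file">]><doc>&real_doc;</doc>. That string declares the real file as an external entity and references it inside a wrapper element, so the parser pulls the file's content in at that point.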

This way you don't need a temporary file, and you don't touch your original data. Plus you get XML cred, amaze your friends, impress your boss... why would you do it any other way?

You can even check that it works by running this code, which tests it with both XML::Parser and XML::LibXML:

#!/usr/bin/perl
use strict;
use warnings;

use XML::Parser;
use XML::LibXML;

my $doc_file= shift @ARGV;

# wrapper document: the real file is pulled in through the external entity
my $xml=qq{<!DOCTYPE doc [<!ENTITY real_doc SYSTEM "$doc_file">]><doc>&real_doc;</doc>};

{ print "XML::Parser:\n";
  # with no handlers defined, the Stream style prints the parsed document
  my $t= XML::Parser->new( Style => 'Stream')->parse( $xml);
}

{ print "XML::LibXML:\n";
  my $parser = XML::LibXML->new();
  my $doc = $parser->parse_string($xml);
  print $doc->toString;
}
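
For anyone who wants to try the trick without a data file at hand, here is a minimal self-contained sketch. It is an illustration under assumptions, not part of the code above: the helper name wrap_fragment and the sample data (two top-level elements with no root, written to a scratch file only so the demo has something to read) are made up. The parser options are set explicitly because, as far as I know, recent XML::LibXML releases no longer load external entities by default; treat that as an assumption and check the documentation for your installed version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Temp qw(tempfile);
use XML::LibXML;

# Hypothetical helper (not in the original post): build the wrapper
# document that pulls $file in through an external entity.
sub wrap_fragment {
    my ($file) = @_;
    # The path lands inside a double-quoted system literal,
    # so a double quote in it would break the DOCTYPE.
    die "file name must not contain a double quote\n" if $file =~ /"/;
    return qq{<!DOCTYPE doc [<!ENTITY real_doc SYSTEM "$file">]>}
         . qq{<doc>&real_doc;</doc>};
}

# Sample input, assumed for illustration: two top-level elements, no root.
my ($fh, $fragment_file) = tempfile( SUFFIX => '.xml', UNLINK => 1 );
print {$fh} "<item>one</item>\n<item>two</item>\n";
close $fh or die "cannot close $fragment_file: $!";

# Ask explicitly for entity expansion; keep network access off.
my $parser = XML::LibXML->new(
    expand_entities => 1,
    load_ext_dtd    => 1,
    no_network      => 1,
);

my $doc   = $parser->parse_string( wrap_fragment($fragment_file) );
my @items = map { $_->textContent } $doc->findnodes('/doc/item');

print scalar(@items), " item(s) found: @items\n";
```

Only point a wrapper like this at files you trust: the parser will read whatever path the entity names.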

Re^2: XML Cleanup
by Your Mother (Archbishop) on May 23, 2008 at 19:55 UTC

    Neat trick!