At first, I thought that might get too low-level for me. But then I saw that XML::Rules->new accepts a handlers => {...} option, which allows defining handlers for XML::Parser::Expat events. Some experimentation with a dummy callback that handles them all shows that Comment and XMLDecl are the events I want during parsing.
#!perl
use 5.012; # strict, //
use warnings;
use Data::Dump;
use XML::Rules;
my $xml_doc = <<EOXML;
<?xml version="1.0" encoding="UTF-8" ?>
<!-- important instructions to manual editors -->
<root>
<group name="blah">
<!-- important instructions for group "blah" -->
<tag/>
</group>
<group name="second">
<!-- important instructions for group "second" -->
<differentTag/>
</group>
</root>
EOXML
my $callback = sub {
my ($name, $parser, @args) = @_;
print STDERR "event:", $name//'<undef>', "(";
print STDERR join ', ', map {defined($_) ? qq("$_") : '<undef>'} @args;
print STDERR ")\n";
};
my %handlers = ();
for my $h ( qw/Comment XMLDecl/ ) { #qw/Start End Char Proc Comment CdataStart CdataEnd Default Unparsed Notation ExternEnt ExternEntFin Entity Element Attlist Doctype DoctypeFin XMLDecl/) {
$handlers{$h} = sub { $callback->($h => @_) }
}
my $parser = XML::Rules->new(
stripspaces => 3|4,
rules => [
_default => 'raw',
],
handlers => \%handlers,
);
my $data = $parser->parse($xml_doc);
#dd $data; # uncomment to dump the parsed structure
print my $out = $parser->ToXML($data, 0, " ", "") . "\n";
__DATA__
event:XMLDecl("1.0", "UTF-8", <undef>)
event:Comment(" important instructions to manual editors ")
event:Comment(" important instructions for group "blah" ")
event:Comment(" important instructions for group "second" ")
<root>
<group name="blah">
<tag/>
</group>
<group name="second">
<differentTag/>
</group>
</root>
Now that I've got that far, I should be able to get the prolog and comments into the data object (by returning values, instead of just printing messages). But the harder part will be getting ->ToXML() to emit them on output. I may have to subclass XML::Rules to get additional outputs for my comment and prolog data items. If anyone has an easier idea than that, feel free to let me know.
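As a rough sketch of the first step only (capturing, not yet re-emitting), the handlers could store what they see instead of printing it. The variables %xml_decl and @comments are hypothetical names made up for this illustration, and the handler argument order is assumed from the event output shown above. This keeps the captured values beside the parsed data rather than inside it, and does nothing about ->ToXML().

```perl
#!perl
use 5.012; # strict, //
use warnings;
use XML::Rules;

my $xml_doc = <<'EOXML';
<?xml version="1.0" encoding="UTF-8" ?>
<!-- top-level note -->
<root>
<group name="blah">
<!-- note for group "blah" -->
<tag/>
</group>
</root>
EOXML

# Hypothetical holders for the captured prolog and comments
my %xml_decl;
my @comments;

my $parser = XML::Rules->new(
    stripspaces => 3|4,
    rules => [
        _default => 'raw',
    ],
    handlers => {
        # Assumed arguments, per the event dump above:
        # ($expat, $version, $encoding, $standalone)
        XMLDecl => sub {
            my ($expat, $version, $encoding, $standalone) = @_;
            %xml_decl = (
                version    => $version,
                encoding   => $encoding,
                standalone => $standalone,
            );
        },
        # Assumed arguments: ($expat, $comment_text)
        Comment => sub {
            my ($expat, $text) = @_;
            push @comments, $text; # kept in document order
        },
    },
);

my $data = $parser->parse($xml_doc);

# Report what was captured alongside the parsed data
printf STDERR "prolog: version=%s encoding=%s\n",
    $xml_decl{version} // '<undef>', $xml_decl{encoding} // '<undef>';
print STDERR "comment: [$_]\n" for @comments;
```

Because the comments land in a flat list, this sketch loses track of which element each comment belonged to; tying them back to their position in the tree is the part that would still need solving before any output step.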