I used to go with XML::Twig, but at some point it started to annoy me; I guess it's a matter of taste. These days I always use XML::LibXML. It's maybe a bit harder to get started with, though. For example, the SYNOPSIS in its docs doesn't really show you how to use it. It helps to know that its classes are (nicely) broken down mostly along the lines of the XML DOM (XML::LibXML::Document, ::Node, ::Element, ::Text and so on), so you sometimes have to hunt through several of those pages to find the method you want.
Here's a quickstart guide of sorts, or at least how I use it (I usually encapsulate parts of this in functions):
use XML::LibXML qw(:all);
my $parser = XML::LibXML->new();
# this could also be parse_string
my $xmldom = eval { $parser->parse_file($file) };
if ($@) {
# problem parsing, die $@, etc.
}
# this is the outermost element
my $doc = $xmldom->documentElement;
# the rest might be familiar if you've used
# the DOM in JavaScript
# findnodes is very useful, but hard to find
# (look in XML::LibXML::Node); this assumes there's
# XML like <asset><story>...</story><story>...</story></asset>
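# $doc is already the <asset> element, so anchor the path at the
# document root (a relative 'asset/story' would look for an <asset>
# child of <asset>)
my $xpath = '/asset/story'; # this can be whatever XPath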
my @story_nodes = $doc->findnodes($xpath);
foreach my $story_node (@story_nodes) {
    my $id = $story_node->getAttribute('ID');
    # this is an example where DOM can be annoying
    my ($uri_node) = $story_node->getChildrenByTagName('URI');
    $uri_node->normalize();
    my $text_node = $uri_node->firstChild; # should first check it's ::Text!
    my $uri = $text_node->getData();
    $uri =~ s{^(http://)[^.]+(\.example\.)}{${1}new$2};
    $text_node->setData($uri); # setData belongs to the text node, not the element
    # ....
}
# another thing not so obvious: this is in ::Document, so call it
# on the parsed document, not on the root element
# there are several variations, toFH, toString, etc.
$xmldom->toFile($newfile);
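
Since I mentioned wrapping this in functions, here is a rough sketch of what that might look like. The function name rewrite_story_uris and the sample XML are made up for illustration. It also sidesteps the "is firstChild really a ::Text?" problem by reading textContent and then replacing the element's children with a single text node, which is just one way to do it, not the only one.

```perl
use strict;
use warnings;
use XML::LibXML qw(:all);

# Hypothetical helper: rewrite the host part of each story's URI and
# return the updated XML as a string.
sub rewrite_story_uris {
    my ($xml) = @_;

    # load_xml dies on malformed input, so trap it with eval
    my $xmldom = eval { XML::LibXML->load_xml(string => $xml) };
    die "could not parse XML: $@" if $@;

    foreach my $story_node ($xmldom->findnodes('/asset/story')) {
        my ($uri_node) = $story_node->getChildrenByTagName('URI');
        next unless $uri_node;    # skip a story with no URI child

        # textContent concatenates all descendant text, so there is no
        # need to check that firstChild is an XML::LibXML::Text node
        my $uri = $uri_node->textContent;
        $uri =~ s{^(http://)[^.]+(\.example\.)}{${1}new$2};

        # replace whatever children were there with a single text node
        $uri_node->removeChildNodes();
        $uri_node->appendText($uri);
    }

    return $xmldom->toString();
}

# made-up sample input, same shape as the XML assumed above
my $sample = <<'XML';
<asset>
  <story ID="1"><URI>http://www.example.com/a</URI></story>
  <story ID="2"><URI>http://old.example.org/b</URI></story>
</asset>
XML

print rewrite_story_uris($sample);
```

Returning a string keeps the function easy to try out; in real use you would more likely pass in a file name and finish with toFile on the document.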