ianxharris has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I've got a problem whereby I need to compare two xml files for differences. I've already been through XML::SemanticDiff but it's not quite what I'm after as once a rogue element is introduced it reports the rest of the parent element's children as changed too (quite rightly).

I'm not really interested in structural changes; my only concern is whether or not I need to send the file for translation. That is, all I need to do is compare the two document trees and, for any content node in file 2 (the changed file) that does not exactly match some node in file 1 (the original), mark it for translation.

My problem is that I can't find a way to make XML::XPath expose the actual XPath of the node it's processing.

e.g.

my $xpath = XML::XPath->new( filename => $new_xml  );
my $nodeset = $xpath->find( "/" );
foreach my $node ( $nodeset->get_nodelist ) {
        print XML::XPath::XMLParser::as_string( $node ) . "\n";
}

This cheerfully prints the file back to the command line. Does anyone know if there's a way to actually print the XPath of $node?

Grateful thanks for any advice.


Re: XML::Xpath - Can I get the XPath of the current node?
by Corion (Patriarch) on Feb 27, 2008 at 12:13 UTC

    My approach (not knowing that much about XPath) would be to walk up the tree/parent axis and build the path to the current node from that (I'm thinking in XML::LibXML syntax, sorry):

    sub get_path {
        my ($node) = @_;
        my $path = '';
        my $p = $node;
        # Prepend each ancestor's name; the loop ends once parentNode returns undef.
        while (defined($p = $p->parentNode)) {
            $path = $p->nodeName . "/" . $path;
        }
        return $path;
    }

    ... but that method breaks down as soon as you have two nodes with the same tag. You will then have to add the index of the current node, which I don't know how to get.

    This whole method assumes that XML::Parser also has some of the methods the DOM/XML::LibXML has...

      If there are two or more nodes of the same name, you should be able to walk the preceding-sibling axis (or previous siblings) from the context node and count the ones with a matching name.
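Combining the two suggestions above (walk up the parent axis, and count preceding siblings of the same name) might look roughly like the sketch below. It assumes XML::LibXML rather than XML::XPath, since that is the syntax used in the reply; the function name `indexed_path` and the sample document are made up for illustration, and namespaces are not handled beyond whatever prefix `nodeName` returns.

```perl
use strict;
use warnings;
use XML::LibXML;

# Build an indexed path such as /doc[1]/para[2] for an element node.
# (indexed_path is a hypothetical helper, not part of XML::LibXML.)
sub indexed_path {
    my ($node) = @_;
    my @steps;

    # Climb from the node to the root element; the document node above
    # the root is not an element, so the loop stops there.
    while (defined $node && $node->nodeType == XML_ELEMENT_NODE) {
        my $name  = $node->nodeName;
        my $index = 1;

        # Count preceding element siblings that share this node's name.
        for (my $sib = $node->previousSibling; defined $sib; $sib = $sib->previousSibling) {
            $index++ if $sib->nodeType == XML_ELEMENT_NODE && $sib->nodeName eq $name;
        }

        unshift @steps, sprintf('%s[%d]', $name, $index);
        $node = $node->parentNode;
    }
    return '/' . join('/', @steps);
}

# Illustrative input only.
my $doc = XML::LibXML->load_xml(
    string => '<doc><para>one</para><note/><para>two</para></doc>',
);
my ($second) = $doc->findnodes('/doc/para[2]');
print indexed_path($second), "\n";    # prints /doc[1]/para[2]
```

If switching to XML::LibXML is an option anyway, its nodes also have a built-in `nodePath` method that returns a path for the node, which may make a hand-rolled helper like this unnecessary.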