Re^2: XML::LibXML getElementsById problem

Thanks for the tip. This snippet works! I'm using XML::LibXML to parse HTML docs. Unfortunately, it does not treat HTML id attributes the way it treats xml:id. I'm pretty new to XML. Thanks again.

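use strict;
use warnings;
use XML::LibXML;

# Sample document: the element carries xml:id, which is treated as an ID.
my $xml_string = <<EOF;
<?xml version="1.0"?>
<root>
  <aaa xml:id='test'>
    <bbb/>
  </aaa>
</root>
EOF
my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($xml_string) or die "could not parse XML";
# Look the element up by its ID value.
my $elem = $doc->getElementsById('test');
print STDERR $elem."\n";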

Re^3: XML::LibXML getElementsById problem
by mirod (Canon) on Dec 15, 2005 at 16:39 UTC

Why don't you use a regular XPath expression instead of `getElementsById`? `my $elem = ($doc->findnodes('//*[@id="test"]'))[0];` works fine. It is probably slower than `getElementsById`, but that might not matter.

Or you could select all elements that have an `id` attribute and replace it with `xml:id`, and hope that `getElementsById` then works (I would think it does).

Or you could pre-process your HTML, using `tidy` for example, to get XHTML, and then use XML::LibXML on the XHTML (you might need to set the option to process the DTD in order for `id` to be recognized as an ID).

There might also be an XML::LibXML-specific trick for this, but I don't know the module that well.
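
For what it's worth, here is a minimal sketch of the XPath route, assuming a document whose elements carry a plain `id` attribute rather than `xml:id`. The sample markup and the helper name `find_by_id` are made up for illustration. The id value is interpolated straight into the XPath expression, so this assumes the id contains no double quotes.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample: a plain "id" attribute, as in HTML, not xml:id.
my $xml_string = <<'EOF';
<?xml version="1.0"?>
<root>
  <aaa id="test">
    <bbb/>
  </aaa>
</root>
EOF

my $doc = XML::LibXML->new->parse_string($xml_string);

# Illustrative helper (not part of XML::LibXML): return the first element
# whose "id" attribute equals $id, or undef if there is none.
# Assumes $id contains no double quotes.
sub find_by_id {
    my ($doc, $id) = @_;
    my ($elem) = $doc->findnodes(qq{//*[\@id="$id"]});
    return $elem;
}

my $elem = find_by_id($doc, 'test');
print STDERR $elem->nodeName, "\n" if defined $elem;    # prints: aaa
```

This scans the whole tree on every call, which is the "probably slower" part; for a handful of lookups on an ordinary page that is unlikely to matter.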
