Re: XML::LibXML getElementsById problem

There is nothing magical about an attribute named 'id'. You have to tell the system that it is of type... 'ID', either by using a DTD (you could probably also use a RelaxNG schema), or by using 'xml:id', which IS magical, instead of just 'id'.
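
For the DTD route, here is a minimal sketch of what I mean. It assumes an internal DTD subset is enough for the parser to pick up the ID declaration; the document and the element names are made up for illustration, and I have not checked how this behaves across libxml2 versions or with an external DTD (which may need DTD loading switched on).

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical document: the internal subset declares aaa/@id to be of type ID.
my $xml_string = <<'EOF';
<?xml version="1.0"?>
<!DOCTYPE root [
  <!ATTLIST aaa id ID #IMPLIED>
]>
<root>
  <aaa id="test">
    <bbb/>
  </aaa>
</root>
EOF

my $parser = XML::LibXML->new();
my $doc    = $parser->parse_string($xml_string);

# With the attribute declared as an ID, the plain 'id' attribute
# should be reachable through the ID lookup.
my $elem = $doc->getElementsById('test');
print STDERR(defined $elem ? $elem->nodeName : 'not found', "\n");
```

Without the `<!ATTLIST ...>` line the same lookup should come back empty, which is the behaviour described above.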

Re^2: XML::LibXML getElementsById problem by pmc (Initiate) on Dec 15, 2005 at 16:25 UTC
Thanks for the tip. This snippet works! I'm using XML::LibXML to parse HTML docs. Unfortunately, it does not treat HTML `id` attributes like `xml:id`. I'm pretty new to XML. Thanks again.

```perl
use strict;
use XML::LibXML;

my $xml_string = <<EOF;
<?xml version="1.0"?>
<root>
  <aaa xml:id='test'>
    <bbb/>
  </aaa>
</root>
EOF

my $parser = XML::LibXML->new();
my $doc    = $parser->parse_string($xml_string) || die;

# xml:id is always of type ID, so the lookup works without a DTD
my $elem = $doc->getElementsById('test');
print STDERR $elem."\n";
```
Re^3: XML::LibXML getElementsById problem by mirod (Canon) on Dec 15, 2005 at 16:39 UTC
Why don't you use a regular XPath expression instead of `getElementsById`? `my $elem = ($doc->findnodes('//*[@id="test"]'))[0];` works fine. It is probably slower than `getElementsById`, but that might not matter. Or you could select all elements that have an `id` attribute and replace it with `xml:id`, and hope (I would think it works) that `getElementsById` then finds them. Or you could pre-process your HTML, using `tidy` for example, to get XHTML, and then use XML::LibXML on the XHTML (you might need to set the option to process the DTD in order for `id` to be recognized as an ID). There might also be an XML::LibXML-specific trick for this, but I don't know the module that well.
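
A rough sketch of the first two options, in case it helps. The sample document is invented, and the rename step assumes that setting `xml:id` through `setAttributeNS` with the standard XML namespace URI is enough for the ID lookup to see the element afterwards; I believe it is, but treat that part as untested across versions.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical input: a plain 'id' attribute and no DTD.
my $xml_string = <<'EOF';
<?xml version="1.0"?>
<root>
  <aaa id="test">
    <bbb/>
  </aaa>
</root>
EOF

my $doc = XML::LibXML->new()->parse_string($xml_string);

# Option 1: plain XPath, no ID typing needed.
my ($by_xpath) = $doc->findnodes('//*[@id="test"]');

# Option 2: rename every un-namespaced 'id' attribute to 'xml:id',
# then use the ID lookup.
my $XML_NS = 'http://www.w3.org/XML/1998/namespace';
for my $el ($doc->findnodes('//*[@id]')) {
    my $value = $el->getAttribute('id');
    $el->removeAttribute('id');
    $el->setAttributeNS($XML_NS, 'xml:id', $value);
}
my $by_id = $doc->getElementsById('test');

print STDERR 'xpath: ', (defined $by_xpath ? $by_xpath->nodeName : 'not found'), "\n";
print STDERR 'id:    ', (defined $by_id    ? $by_id->nodeName    : 'not found'), "\n";
```

The XPath version leaves the document untouched, so it is the safer choice if you serialize the document again later; the rename changes the attribute names in the output.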