PerlMonks  

Using XML::DOM to get the contents of an element

by ezekiel (Pilgrim)
on Oct 02, 2002 at 03:57 UTC ([id://202185]=perlquestion)

ezekiel has asked for the wisdom of the Perl Monks concerning the following question:

I have this loop:

foreach my $node ($element->getChildNodes()) {}

Each $node represents a piece of XML of the form: <tag_name>Content</tag_name>

I want to extract the "Content", i.e. the character data between the start and end tags of each node in the loop above. Each of these nodes contains only character data.

My understanding of the docs is that the character data is stored in the only child node of the node representing the element. Hence I tried to extract it with:

my $child_nodes = $node->getChildNodes();   # get the child nodes
my $child_node  = $child_nodes->[0];        # get the first (only) child node
my $content     = $child_node->getData();   # get the content

but this dies with the error: Can't call method "getData" on an undefined value

I'm sure this is simple, but I'm having one of those days and I just can't spot the error. Any suggestions? Thanks

Replies are listed 'Best First'.
Re: Using XML::DOM to get the contents of an element
by mirod (Canon) on Oct 02, 2002 at 06:25 UTC

    I believe getChildNodes() returns a list only if called in list context; otherwise it returns an object and you have to use $child_nodes->item(0).

    If I were you I would also test whether that first and only child exists: the element could be empty, and you might want to report that as an error instead of just dying. There might also be entities in the text, in which case you could get several child nodes even if there are no sub-elements.
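The two points above (calling context and testing for a missing child) can be sketched as follows. This is only an illustration, assuming XML::DOM is installed; the sample document and the @results array are made up for the example and are not from the original question.

```perl
use strict;
use warnings;
use XML::DOM;

# Hypothetical sample document, for illustration only.
my $xml = '<root><tag_name>Content</tag_name><empty/></root>';

my $doc  = XML::DOM::Parser->new->parse($xml);
my $root = $doc->getDocumentElement();

my @results;
foreach my $node ($root->getChildNodes()) {      # list context: a plain list of nodes
    my $child_nodes = $node->getChildNodes();    # scalar context: a NodeList object
    my $first       = $child_nodes->item(0);     # undef when there are no children

    if (defined $first) {
        push @results, $node->getNodeName() . ': ' . $first->getData();
    }
    else {
        # Report the empty element instead of dying on an undefined value.
        push @results, $node->getNodeName() . ': (no content)';
    }
}

print "$_\n" for @results;
$doc->dispose();   # XML::DOM documents contain circular references
```

The sample document deliberately has no whitespace between elements, so every child of the root is an element; real documents usually also contain whitespace-only text nodes that need to be skipped.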

    And finally, I would advise against using XML::DOM at all. Code written using the DOM is often very brittle: extra comments, or even line returns, tend to break it. In your case XML::Simple might be enough, or XML::LibXML, or (of course!) XML::Twig. See the recently updated Ways to Rome for examples of all of those modules (and more!)
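As a rough sketch of one of the alternatives mentioned, here is how the same extraction might look with XML::LibXML. It assumes XML::LibXML is installed; the sample document and the @contents array are invented for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, for illustration only.
my $xml = <<'XML';
<root>
  <!-- a comment between elements -->
  <tag_name>Content</tag_name>
  <other>More content</other>
</root>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# The XPath expression '*' selects only the child elements, so the
# comment and the whitespace between elements are ignored.
my @contents = map { $_->textContent } $doc->documentElement->findnodes('*');

print "$_\n" for @contents;
```

Because the selection is done with XPath and textContent gathers all the text inside an element, extra comments or line breaks between elements should not affect the result.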

(jeffa) Re: Using XML::DOM to get the contents of an element
by jeffa (Bishop) on Oct 02, 2002 at 06:30 UTC
    I am out of my league here, but I couldn't help trying to solve this problem. I came up with the following:

    foreach my $node ($doc->getChildNodes()) {
        # skip anything that is not an element (e.g. the DocumentType node)
        next unless $node->getNodeType == ELEMENT_NODE;
        my $child_node = $node->getFirstChild();   # the first (only) child
        my $content    = $child_node->getData();
        print $content, $/;
    }

    I think that the error you were getting came from the first iteration of the loop: in my case $node was an XML::DOM::DocumentType object and not an XML::DOM::Element object. Also, since you only need the first child, I believe you can use getFirstChild() instead of having to explicitly query an array index. Ughhh, I think I will take mirod's advice and avoid the DOM. ;)
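A slightly more defensive variant of this idea, combining the node-type check with mirod's warning about multiple children, might look like the sketch below. It assumes XML::DOM is installed; element_text() is a hypothetical helper (not part of XML::DOM) and the sample document is invented for the example.

```perl
use strict;
use warnings;
use XML::DOM;   # also exports ELEMENT_NODE, TEXT_NODE, CDATA_SECTION_NODE

# Hypothetical helper: concatenate the text and CDATA children of an
# element, so an empty element yields '' instead of a fatal error.
sub element_text {
    my ($element) = @_;
    my $text = '';
    foreach my $child ($element->getChildNodes()) {
        my $type = $child->getNodeType();
        $text .= $child->getData()
            if $type == TEXT_NODE || $type == CDATA_SECTION_NODE;
    }
    return $text;
}

# Hypothetical sample document, for illustration only.
my $xml = <<'XML';
<root>
  <!-- a comment between elements -->
  <tag_name>Content</tag_name>
  <other>More content</other>
</root>
XML

my $doc = XML::DOM::Parser->new->parse($xml);

my @contents;
foreach my $node ($doc->getDocumentElement()->getChildNodes()) {
    # Whitespace text nodes and comments are children too; skip them.
    next unless $node->getNodeType() == ELEMENT_NODE;
    push @contents, element_text($node);
}

print "$_\n" for @contents;
$doc->dispose();   # XML::DOM documents contain circular references
```

The helper only collects text and CDATA children; other node types, such as entity references, are ignored here and would need their own handling.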

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
