in reply to XML::LibXML- Escape Empty Tags

This might help get you going. I took out the file handling, so you'll have to adjust that part. If this is something you actually need for work, you might consider posting it as a one-off job to jobs.perl.org or something.

use strict;   # Don't leave out!
use warnings; # Don't leave out!
use XML::LibXML;

my $parser = XML::LibXML->new();
my $doc = $parser->parse_fh(\*DATA);

my @product = $doc->getElementsByTagName('product');

for my $kid ( @product ){
    print join("\t",
               $kid->getElementsByTagName('name')->[0]->textContent,
               $kid->getElementsByTagName('imageURL_med')->[0]->textContent,
               $kid->getAttribute('category_id'),
               $kid->getAttribute('id'),
               $kid->getElementsByTagName('desc_short')->[0]->textContent,
               ),
        "\n";
}

# print $doc->serialize();

__END__
<root>
<product category_id="13296" id="675936193" catalog="false" row="1">
  <name>Children's Hand Rake</name>
  <imageURL_med></imageURL_med>
  <desc_short>Mini gardeners can dig, rake and scoop out their own plot with this children's hand rake, complete with contoured handles and durable metal heads.</desc_short>
</product>
<product category_id="13296" id="675936193" catalog="false" row="1">
  <name>Bag of Broken Glass</name>
  <imageURL_med>http://moocow.co.uk.jp/something/something/bg.jpg</imageURL_med>
  <desc_short>Fun for all ages!</desc_short>
</product>
</root>

Re^2: XML::LibXML- Escape Empty Tags
by khalistoo (Initiate) on Jul 31, 2009 at 08:46 UTC
    Thanks a lot, this seems to work. However, can you explain two things to me? First, the line
    # print $doc->serialize();
    as I've got no idea what it is actually supposed to do. Second, the use of my $doc = $parser->parse_fh(\*DATA);. I guess this is for working with a filehandle, but I was under the impression that parse_file was much better for big file manipulation (I am in fact using this to parse some 600 MB of XML). Then again, thanks a lot, it's going very well. Cheers everyone for the help.

      The serialize call is there to uncomment if you want to dump the doc and check it. And you're right, parsing the file directly with parse_file (no filehandle) is probably faster. The *DATA handle is just easy for testing and demos because it lets you put the data into the test script itself. Good luck. It is worth the effort to continue to pick up some Perl. It's not that hard, you'll get great help here and on many lists, and it can boost productivity in a menagerie of tasks.
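
For what it's worth, here is a rough sketch of the same loop using parse_file on a path instead of parse_fh on *DATA, with the serialize line uncommented so you can see what it prints. The temp file, the cut-down sample data and the product_row helper are all made up for the demo (they are not from the original script); in real use $filename would just be the path to your own file. Note that either way the whole document is built in memory, so parse_file only saves you the filehandle, not the RAM.

```perl
use strict;
use warnings;
use XML::LibXML;
use File::Temp qw(tempfile);

# Hypothetical sample file, written here only so the sketch runs on its own.
my ($fh, $filename) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} <<'XML';
<root>
  <product category_id="13296" id="675936193">
    <name>Bag of Broken Glass</name>
    <imageURL_med></imageURL_med>
    <desc_short>Fun for all ages!</desc_short>
  </product>
</root>
XML
close $fh or die "Cannot close $filename: $!";

# parse_file takes a path rather than an open filehandle.
my $parser = XML::LibXML->new();
my $doc    = $parser->parse_file($filename);

# Hypothetical helper: one tab-separated row per <product>.
# An empty element such as <imageURL_med></imageURL_med> gives an empty string.
sub product_row {
    my ($product) = @_;
    return join "\t",
        $product->getElementsByTagName('name')->[0]->textContent,
        $product->getElementsByTagName('imageURL_med')->[0]->textContent,
        $product->getAttribute('category_id'),
        $product->getAttribute('id'),
        $product->getElementsByTagName('desc_short')->[0]->textContent;
}

for my $product ( $doc->getElementsByTagName('product') ) {
    print product_row($product), "\n";
}

# Uncommented: dumps the parsed document back out as XML text.
print $doc->serialize();
```

If the serialize output looks like your input, the parse went fine and any odd rows are coming from the extraction step rather than the parser.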

      It is a comment, it does nothing :D