Re: XML::LibXML- Escape Empty Tags

This might help get you going. I took out the file stuff, you'll have to adjust. If this is something you actually need for work, you might consider posting it as a one-off job to jobs.perl.org or something.

use strict; # Don't leave out!
use warnings; # Don't leave out!
use XML::LibXML;

my $parser = XML::LibXML->new();
my $doc = $parser->parse_fh(\*DATA);
my @product = $doc->getElementsByTagName('product');

for my $kid ( @product ){
    print
        join("\t",
             $kid->getElementsByTagName('name')->[0]->textContent,
             $kid->getElementsByTagName('imageURL_med')->[0]->textContent,
             $kid->getAttribute('category_id'),
             $kid->getAttribute('id'),
             $kid->getElementsByTagName('desc_short')->[0]->textContent,
             ),
        "\n";
}

# print $doc->serialize();

__END__
<root>
  <product category_id="13296" id="675936193" catalog="false" row="1">
    <name>Children's Hand Rake</name>
    <imageURL_med></imageURL_med>
    <desc_short>Mini gardeners can dig, rake and scoop out their own plot with this children's hand rake, complete with contoured handles and durable metal heads.</desc_short>
  </product>
  <product category_id="13296" id="675936193" catalog="false" row="1">
    <name>Bag of Broken Glass</name>
    <imageURL_med>http://moocow.co.uk.jp/something/something/bg.jpg</imageURL_med>
    <desc_short>Fun for all ages!</desc_short>
  </product>
</root>

Re^2: XML::LibXML- Escape Empty Tags by khalistoo (Initiate) on Jul 31, 2009 at 08:46 UTC
Thanks a lot, this seems to work. However, can you explain two things to me? First, the line `# print $doc->serialize();`: I have no idea what it is actually supposed to do. Second, the use of `my $doc = $parser->parse_fh(\*DATA);`. I guess this is for working with a filehandle, but I was under the impression that parse_file was much better for manipulating big files (I am in fact using this to parse a 600 MB XML file). But then again, thanks a lot, it's going very well. Cheers everyone for the help.
Re^3: XML::LibXML- Escape Empty Tags by Your Mother (Archbishop) on Jul 31, 2009 at 17:31 UTC
The `serialize` line is there to uncomment if you want to dump the doc and check it. And you're right, parsing the file directly with parse_file (no filehandle) is probably faster. The *DATA handle is just easy to test/demo with because it lets you put the data into the test script. Good luck. It is worth the effort to continue to pick up some Perl. It's not that hard, you'll get great help here and on many lists, and it can boost productivity in a menagerie of tasks.
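For reference, here is a rough sketch of the same loop using parse_file instead of the *DATA handle. The helper name `product_rows` and the temporary sample file are made up for the demo; in real use you would pass the path to your own XML. Note that parse_file still builds the whole tree in memory, so for a 600 MB file the streaming XML::LibXML::Reader from the same distribution may be worth a look.

```perl
use strict;
use warnings;
use XML::LibXML;
use File::Temp qw(tempfile);

# Hypothetical helper: parse the XML file at $path and return one
# tab-separated row (name, image URL, id) per <product> element.
sub product_rows {
    my ($path) = @_;
    my $doc = XML::LibXML->new->parse_file($path);    # parse by path, no filehandle
    my @rows;
    for my $product ( $doc->getElementsByTagName('product') ) {
        # List context returns the matching nodes; take the first of each.
        my ($name)  = $product->getElementsByTagName('name');
        my ($image) = $product->getElementsByTagName('imageURL_med');
        push @rows, join "\t",
            $name->textContent,
            $image->textContent,    # empty string for an empty element
            $product->getAttribute('id');
    }
    return @rows;
}

# Write a tiny sample document to a temp file so the sketch runs as-is.
my ( $fh, $file ) = tempfile( SUFFIX => '.xml', UNLINK => 1 );
print {$fh} <<'XML';
<root>
  <product category_id="13296" id="675936193">
    <name>Bag of Broken Glass</name>
    <imageURL_med></imageURL_med>
  </product>
</root>
XML
close $fh or die "Cannot close $file: $!";

print "$_\n" for product_rows($file);
```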
Re^3: XML::LibXML- Escape Empty Tags by Anonymous Monk on Jul 31, 2009 at 09:18 UTC
It is a comment, it does nothing :D