Re: Trees in XML
by GrandFather (Saint) on Jun 03, 2008 at 10:44 UTC
|
use strict;
use warnings;
use XML::TreeBuilder;
my $xml = <<XML;
<?xml version='1.0' encoding='UTF-8'?>
<list name="name list">
<person>
<firstname>Paul</firstname>
<lastname>Rutter</lastname>
<age>24</age>
</person>
<person>
<firstname>Ruth</firstname>
<lastname>Brewster</lastname>
<age>22</age>
</person>
<person>
<firstname>Cas</firstname>
<lastname>Creer</lastname>
<age>23</age>
</person>
</list>
XML
my $root = XML::TreeBuilder->new ();
$root->parse ($xml);
my @firstNames = map {$_->as_text ()} $root->look_down (_tag => 'firstname');
print "TreeBuilder: @firstNames\n";
use XML::Twig;
my $twig = XML::Twig->new (twig_roots => { 'person/firstname' => \&pushName});
@firstNames = ();
$twig->parse ($xml);
print "Twig: @firstNames\n";
sub pushName {
my ($t, $elt) = @_;
push @firstNames, $elt->text ();
}
Prints:
TreeBuilder: Paul Ruth Cas
Twig: Paul Ruth Cas
Perl is environmentally friendly - it saves trees
|
TreeBuilder and Twig are pure-Perl modules that use XML::Parser to do their work.
Re: Trees in XML
by rovf (Priest) on Jun 03, 2008 at 11:09 UTC
|
I would do it with XML::Simple; something like (code not tested!!):
use XML::Simple qw(:strict);
...
my $h=XMLin('your-xml-code-or-file-name-goes-here',
forcearray => [ qw(person) ], keyattr => []);
After this, $h->{person}[2]->{lastname} would return 'Creer'.
--
Ronald Fischer <ynnor@mm.st>
|
Thanks
I have tried XML::Simple and found that it mangles the order of the XML when I put it into a hash. I am also experimenting with XML::LibXML but, as I said, I would really like to use the XML::Parser Tree style.
You see, what I would like to do is put the returned values (the firstnames) into an array which I can then use later, in another part of the Perl script.
So any clues on how to use XML::Parser to obtain the firstnames?
Many thanks
|
It is not XML::Simple that is mangling the order; it is an inherent property of a hash that its keys have no order. That's why I used forcearray in my example: it makes the person elements go into an array instead of a hash, so they stay in the same order as in the XML file.
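To illustrate, here is a small sketch of pulling the firstnames out in document order. It does not call XML::Simple at all: $h is a hand-written stand-in for the structure I would expect XMLin to return with the options above, so treat its exact shape as an assumption and check it with Data::Dumper on your own data.

```perl
use strict;
use warnings;

# ASSUMPTION: hand-written stand-in for what XMLin(...) is expected to
# return with forcearray => ['person'] and keyattr => [].
my $h = {
    name   => 'name list',
    person => [
        { firstname => 'Paul', lastname => 'Rutter',   age => '24' },
        { firstname => 'Ruth', lastname => 'Brewster', age => '22' },
        { firstname => 'Cas',  lastname => 'Creer',    age => '23' },
    ],
};

# The person elements live in an array, so their order is preserved.
my @firstNames = map { $_->{firstname} } @{ $h->{person} };

print "@firstNames\n";
print $h->{person}[2]{lastname}, "\n";
```

The keys inside each person hash still have no order, but the list of persons does.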
BTW, does the order matter in your example?
--
Ronald Fischer <ynnor@mm.st>
Re: Trees in XML
by toolic (Bishop) on Jun 03, 2008 at 16:38 UTC
|
The $tree - is it an array? a scalar? a variable?
ref can answer that for you:
print ref($tree), "\n";
outputs:
ARRAY
Or, if you read the free manual (XML::Parser), it says:
For elements, the content is an array reference.
Or, looking at the Dumper output:
$VAR1 = [
'list',
[
{
'name' => 'name list'
},
0,
'
',
The square bracket signifies an array ref.
How do I get access to it?
perlreftut is a good place to begin figuring out how to access Perl data structures such as this.
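As a rough sketch of what that access could look like: in the Tree style each element is a (tag, content) pair, the content array starts with the attribute hash, and text appears under the pseudo-tag 0. The $tree below is typed in by hand to mimic that layout for a cut-down version of the sample XML, so the script runs without a file, and collect_text is a name made up for this example, not part of XML::Parser.

```perl
use strict;
use warnings;

# ASSUMPTION: hand-written stand-in for the result of
# XML::Parser->new(Style => 'Tree')->parse(...) on a shortened sample.
my $tree = [
    'list',
    [   { name => 'name list' },
        0, "\n",
        'person',
        [ {}, 0, "\n", 'firstname', [ {}, 0, 'Paul' ], 0, "\n",
              'lastname', [ {}, 0, 'Rutter' ], 0, "\n" ],
        0, "\n",
        'person',
        [ {}, 0, "\n", 'firstname', [ {}, 0, 'Ruth' ], 0, "\n" ],
        0, "\n",
        'person',
        [ {}, 0, "\n", 'firstname', [ {}, 0, 'Cas' ], 0, "\n" ],
        0, "\n",
    ],
];

# Hypothetical helper: collect the text of every element named $want
# found at or below the element ($tag, $content).
sub collect_text {
    my ($tag, $content, $want, $found) = @_;
    my @kids = @$content;
    shift @kids;                      # skip the attribute hash
    my $text = '';
    while (@kids) {
        my ($child_tag, $child) = splice @kids, 0, 2;
        if ($child_tag eq '0') {      # pseudo-tag 0 marks a text node
            $text .= $child;
        }
        else {                        # a real element: recurse into it
            collect_text($child_tag, $child, $want, $found);
        }
    }
    push @$found, $text if $tag eq $want;
    return $found;
}

my @firstNames = @{ collect_text(@$tree, 'firstname', []) };
print "@firstNames\n";
```

Because the walk matches on the tag name rather than on positions, it does not depend on how many whitespace text nodes sit between the elements.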
Re: Trees in XML
by Jenda (Abbot) on Jun 03, 2008 at 13:51 UTC
|
use strict;
use XML::Rules;
my $rules = XML::Rules->new(
stripspaces => 7,
rules => {
_default => 'content',
person => sub {
# push the string we build to the array referenced by the {person}
# key in the parent tag's hash
return '@person' => "$_[1]->{firstname} $_[1]->{lastname} ($_[1]->{age})"
},
list => sub {
# only interested in the person "attribute"
# due to the previous rule it's an array ref
return $_[1]->{person};
# and this is what the $rules->parse() will return
}
}
);
my $people = $rules->parse(\*DATA);
use Data::Dumper;
print Dumper($people);
__DATA__
<?xml version='1.0' encoding='UTF-8'?>
<list name="name list">
<person>
<firstname>Paul</firstname>
<lastname>Rutter</lastname>
<age>24</age>
</person>
<person>
<firstname>Ruth</firstname>
<lastname>Brewster</lastname>
<age>22</age>
</person>
<person>
<firstname>Cas</firstname>
<lastname>Creer</lastname>
<age>23</age>
</person>
</list>
|
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
use Data::Dumper;
my $p = XML::Parser->new( Style => 'Tree' );
my $inputfile = "testxml.xml";
my $tree = $p->parsefile($inputfile);
print $tree->[1]->[4]->[4]->[2], "\n";
which gives me 'Niall'
As I said, need to stick with XML::Parser!
|
As I said, need to stick with XML::Parser!
Not exactly. You seem to have convinced yourself that (1) you need to use only XML::Parser, and (2) the Tree style is the simplest way to get what you want. It seems to me that 1 is false laziness, and 2 is just misguided.
Learning how to use pure-perl modules, even on a machine where you don't have admin rights, would make it easier for you to write not only this piece of code, but also the next ones.
Your problem seems well suited to stream processing, whether with XML::Parser or with another module. Your code would be much more resistant to future changes in the XML structure: in your example, [1]->[4]->[4]->[2] is effectively the hardcoded (and some would say obfuscated) path to your target element. If you don't want to hardcode it, you will end up rewriting code that already exists in the likes of XML::Twig, XML::XPath, XML::Rules... With stream processing, by contrast, you would just process the firstname element and leave the rest alone, so your code would keep working even if the input XML changes, as long as it still includes a firstname element.
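For what it's worth, here is a minimal sketch of that stream approach using nothing but XML::Parser's Start, Char and End handlers. The variable names and the inline, shortened copy of the sample XML are just for the example; with a real file you would presumably call parsefile instead of parse.

```perl
use strict;
use warnings;
use XML::Parser;

my $xml = <<'XML';
<?xml version='1.0' encoding='UTF-8'?>
<list name="name list">
<person><firstname>Paul</firstname><lastname>Rutter</lastname></person>
<person><firstname>Ruth</firstname><lastname>Brewster</lastname></person>
<person><firstname>Cas</firstname><lastname>Creer</lastname></person>
</list>
XML

my @firstNames;
my $current;    # defined only while we are inside a <firstname> element

my $parser = XML::Parser->new(
    Handlers => {
        # Start buffering text when a firstname element opens
        Start => sub {
            my ($expat, $tag) = @_;
            $current = '' if $tag eq 'firstname';
        },
        # Char may be called several times for one text node, so append
        Char => sub {
            my ($expat, $text) = @_;
            $current .= $text if defined $current;
        },
        # Store the buffered text when the firstname element closes
        End => sub {
            my ($expat, $tag) = @_;
            return unless $tag eq 'firstname';
            push @firstNames, $current;
            undef $current;
        },
    },
);

$parser->parse($xml);
print "@firstNames\n";
```

Nothing here depends on where firstname sits in the document, only on its name.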
That said, it's your code, you do what you want, just realize that you will get more help if you follow the general advice of using a better tool for the task.
|
If you can upload your script, you can upload XML::Rules as well. It's pure Perl, a single file, and its only dependencies are strict, warnings, Carp and XML::Parser::Expat. The first three are core, and I believe that if you have XML::Parser you also have the last one.
You can upload the Rules.pm into /some/path/you/have/access/to/lib/XML, add
use lib '/some/path/you/have/access/to/lib';
at the top of your script and you are effectively done with the installation.
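If you would rather not hardcode an absolute path, one possible variation (my assumption about your layout, not something you have to do) is to keep a lib directory next to the script and locate it with the core FindBin module:

```perl
use strict;
use warnings;

# ASSUMPTION: Rules.pm was uploaded to lib/XML/ next to this script.
use FindBin qw($Bin);    # $Bin is the directory containing the script
use lib "$Bin/lib";      # prepend <script dir>/lib to @INC

print "Searching $Bin/lib first\n";
```

After those lines, a plain use XML::Rules; should look in that directory before the system-wide ones.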