hacker has asked for the wisdom of the Perl Monks concerning the following question:

I'm parsing some generic XML from an application's "templates", and serving it out in a different format and also storing it in a database, and I'd like to triple-check my logic here. I'm not that good with the XML::Foo modules yet (like XML::DOM, XML::Parser, XML::RSS::Tools, etc.) but I'm getting there.

My input data looks like this:

<document>
  <stayonhost>1</stayonhost>
  <staybelow>1</staybelow>
  <maxdepth>2</maxdepth>
  <name>Swedish Foo</name>
  <home_url>http://www.foo.se/bar/</home_url>
  <category>International</category>
  [... up to 40 other element pairs..]
</document>

Here's what I have thus far (featuring the snazzy new 2003 floor model PM indenting!):

use strict;    # er, um, I forgot
use DBI;       # for storing in SQL
use XML::DOM;  # roll through the nodes

# XML pre-requisites
my $file  = "funkyapp.xml";
my $xp    = new XML::DOM::Parser();
my $doc   = $xp->parsefile($file);
my $root  = $doc->getDocumentElement();
my @nodes = $root->getChildNodes();

foreach my $node (@nodes) {
    # get child nodes (yes, "childs")
    if ($node->getNodeType() == 1) {
        my @childs = $node->getChildNodes();

        # iterate through child nodes
        foreach my $item (@childs) {
            # check element name
            #################################################
            if (lc($item->getNodeName) eq "name") {
                my $name = $item->getFirstChild()->getData;
            #################################################
            } elsif (lc($item->getNodeName) eq "home_url") {
                my $url = $item->getFirstChild()->getData;
            #################################################
            } elsif (lc($item->getNodeName) eq "stayonhost") {
                my $stayhost = $item->getFirstChild()->getData;
            #################################################
            } elsif (lc($item->getNodeName) eq "staybelow") {
                my $staybelow = $item->getFirstChild()->getData;
            #################################################
            } elsif (lc($item->getNodeName) eq "maxdepth") {
                my $maxdepth = $item->getFirstChild()->getData;
            }
            # elsif {
            #     [... 40 other nodes.. ]
        }
        # Insert values into SQL here, this works
    }
}

This code works, and prints/stores the values I expect in the database, but adding new nodes gets very repetitious, since it means duplicating the same code over and over.

The issue I'd like to solve here is that this input file could have about 40 different nodeName values. I'd like to eliminate the repetition of the code seen above to do this. Is there a simpler way to do that, and still give me the ability to print/store the values by name?

Update: added a description of my input data (thanks jeffa for the reminder)

Re: Rolling in the sheets with XML
by gjb (Vicar) on Apr 30, 2003 at 12:59 UTC

    Storing your data in a hash ($data{url}, $data{name}, ...) would solve the problem, since you'd just have to write something along the lines of:

    $data{lc($item->getNodeName)} = $item->getFirstChild()->getData;

    Hope this helps, -gjb-

      This is neat, but how does he store the data from his file into a hash? I don't see that in his code.
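
To spell out how the one-liner above fits into the original DOM loop, here is a minimal sketch. It assumes the fields are direct children of the element being walked and that each field holds only text; collect_fields() and the inline sample XML are made up for illustration and are not part of the original post.

```perl
use strict;
use warnings;
use XML::DOM;    # exports ELEMENT_NODE and the other node-type constants

# Collect every child element of $element into a hash keyed by the
# lowercased element name. Empty elements (<foo/>) are stored as ''.
sub collect_fields {
    my ($element) = @_;
    my %data;
    for my $item ($element->getChildNodes) {
        next unless $item->getNodeType == ELEMENT_NODE;    # skip whitespace text nodes
        my $text = $item->getFirstChild;                   # undef for an empty element
        $data{ lc($item->getNodeName) } = defined $text ? $text->getData : '';
    }
    return \%data;
}

# Hypothetical inline sample; a real script would call parsefile($file) instead.
my $xml = <<'XML';
<document>
  <stayonhost>1</stayonhost>
  <maxdepth>2</maxdepth>
  <name>Swedish Foo</name>
  <home_url>http://www.foo.se/bar/</home_url>
</document>
XML

my $doc  = XML::DOM::Parser->new->parse($xml);
my $data = collect_fields($doc->getDocumentElement);
print "$_ = $data->{$_}\n" for sort keys %$data;
$doc->dispose;    # XML::DOM trees hold circular references
```

Every value is then available by name ($data->{name}, $data->{home_url}, ...), and a new element in the input needs no new branch in the code.
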
(jeffa) Re: Rolling in the sheets with XML
by jeffa (Bishop) on Apr 30, 2003 at 17:59 UTC
    Ahhh, i see that working with the dreaded DOM has already got you thinking like a Java coder who has never been introduced to java.util.Hashtable. ;) What i mean is that you are not being generic enough. Check out this code which uses XML::Simple:
    use strict;
    use warnings;
    use DBI;
    use XML::Simple;

    my $dbh = DBI->connect( ... );
    my $xml = XMLin(\*DATA, keyattr => 'document');

    insert($_) for @{$xml->{document}};

    sub insert {
        my $hash = shift;
        my $sth = $dbh->prepare(
            'insert into bar ('
            . join(',', keys %$hash)
            . ') values ('
            . join(',', map '?', keys %$hash)
            . ')'
        );
        $sth->execute(values %$hash);
    }

    __DATA__
    <documents>
      <document>
        <stayonhost>1</stayonhost>
        <staybelow>1</staybelow>
        <maxdepth>2</maxdepth>
        <name>Swedish Foo</name>
        <home_url>http://www.foo.se/bar/</home_url>
        <category>International</category>
      </document>
      <document>
        <stayonhost>0</stayonhost>
        <staybelow>0</staybelow>
        <maxdepth>3</maxdepth>
        <name>Swedish Bar</name>
        <home_url>http://www.bar.se/baz/</home_url>
        <category>National</category>
      </document>
    </documents>
    Piece of cake. I can't help but feel that i am reinventing a wheel with my insert() sub however ...
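
Because the insert() sub above turns XML element names directly into column names, it may be worth checking them against a known list before they reach the SQL string. The following is only a sketch of that idea: build_insert() and the %is_column whitelist are hypothetical names, and the column list is assumed from the sample data rather than taken from a real schema.

```perl
use strict;
use warnings;

# Hypothetical whitelist: the columns assumed to exist in table "bar".
my %is_column = map { $_ => 1 }
    qw(stayonhost staybelow maxdepth name home_url category);

# Build an INSERT statement plus its bind values from one record hash.
# Unknown keys are rejected rather than interpolated into the SQL.
sub build_insert {
    my ($table, $record) = @_;
    my @cols = sort keys %$record;    # sorted, so the same fields always give the same SQL
    my @bad  = grep { !$is_column{$_} } @cols;
    die "unexpected field(s): @bad\n" if @bad;

    my $sql = sprintf 'insert into %s (%s) values (%s)',
        $table, join(',', @cols), join(',', ('?') x @cols);
    return ($sql, @{$record}{@cols});    # hash slice keeps values aligned with @cols
}

my ($sql, @bind) = build_insert('bar', { name => 'Swedish Foo', maxdepth => 2 });
print "$sql\n";
# With a DBI handle in $dbh: $dbh->prepare_cached($sql)->execute(@bind);
```

Sorting the keys is not needed for correctness, but it gives records with the same fields an identical statement, which lets DBI's prepare_cached() reuse a single handle instead of preparing one per row.
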

    UPDATE: May 2
    Ahhh, how about a version with DBI::Wrap?

    use DBI::Wrap;
    use XML::Simple;

    my $dbh = DBI::Wrap->new( ... );
    $dbh->table('hacker');

    my $xml = XMLin(\*DATA, keyattr => 'document');
    $dbh->insert(Values => $_) for @{$xml->{document}};

    __DATA__
    (insert data from above snippet)

    jeffa
