Thanks for the kick in the butt.
I have spent most of my evening on this, and have gotten good results with XML::DOM.
It parses my document and pulls together the information I need with a recursive subroutine:
sub printNode
{
    my ($theNode) = @_;
    my $thisType = $theNode->getNodeType;
    my $nodeList = $theNode->getChildNodes;
    my $name = $theNode->getNodeName;
    my $attLength;
    my $length = $nodeList->getLength;
    my $attList = $theNode->getAttributes;
    if( $attList ){ $attLength = $attList->getLength };
    #print $sep x ($depth), "-NodeName: '$name' (type $thisType, $length children)\n";
    $depth++;
    for(my $i=0; $i<$length; $i++)
    {
        my $node = $nodeList->item($i);
        my $theType = $node->getNodeType;
        my $j=$i+1;
        if ($theType == ELEMENT_NODE )
        {
            #print $sep x ($depth), "[$j]Element Node: '", $node->getTagName, "'\n";
            #Recursive call to itself
            printNode( $node );
        }
        elsif ($theType == TEXT_NODE )
        {
            my $temp = $node->getData;
            #Sub out the tabs and newlines with text equivalents
            $temp =~ s/\n/\\n/g;
            $temp =~ s/\t/\\t/g;
            unless ($name eq "provinces") {
                print $sep x ($depth), $name.qq~="~.$temp.qq~"\n~;
            }
        }
        else
        {
            print $sep x ($depth), "[$j]UNKNOWN Node: ", $node->getNodeName, "\n";
        }
    }
    $depth--;
}
I get the following output:
path="smallbusiness"
en_title="Small Business"
fr_title="Petites Entreprises"
hideFromMenu="false"
hideBreadcrumbs="false"
path="products"
en_title="Products and Services"
fr_title="Produits et Services"
hideFromMenu="false"
hideBreadcrumbs="false"
path="wireless"
en_title="Wireless"
fr_title="Sans fil"
hideFromMenu="false"
hideBreadcrumbs="false"
path="devices"
en_title="Phones and Devices"
fr_title="Téléphones et Appareils"
portalPageLabel="smb_products_services_wireless_devices"
hideFromMenu="false"
hideBreadcrumbs="false"
path="details"
en_title="Details"
fr_title="Détails"
portalPageLabel="smb_products_services_wireless_devices_details"
hideFromMenu="true"
hideBreadcrumbs="true"
path="plans"
en_title="Plans"
fr_title="Forfaits"
portalPageLabel="smb_products_services_wireless_plans"
hideFromMenu="false"
hideBreadcrumbs="false"
The last piece is getting the output into XML.
The start of each node (i.e. "path") should be preceded by "<menuitem", and the end of each node (i.e. "hideBreadcrumbs") should be followed by ">" *or* "/>".
Which one depends on whether the node has children: if it does, the tag should be left open; if it does not, the tag should be self-closed.
I can't seem to insert these tags within the recursive logic without ending up with many repeated tags.
Thanks a lot for any assistance you might be able to provide.
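
One possible way to avoid the repeated tags, sketched here under assumptions: instead of printing as each text node is reached, look at all of an element's children first, and only then decide between ">" and "/>". The sketch assumes that an element with no element children is a "leaf" whose text becomes an attribute, and that an element with element children becomes a nested <menuitem>. The names menuitem_xml, has_element_children and escape_attr are made up for illustration, and the <item> element in the sample input is hypothetical, since the real input document is not shown. The skip for "provinces" mirrors the original subroutine.

```perl
use strict;
use warnings;
use XML::DOM;    # also exports ELEMENT_NODE and TEXT_NODE

# Escape characters that are unsafe inside a double-quoted attribute value.
sub escape_attr {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Number of element children (true if the node is a container).
sub has_element_children {
    my ($node) = @_;
    return scalar grep { $_->getNodeType == ELEMENT_NODE } $node->getChildNodes;
}

# Return the <menuitem> markup for one container element and its descendants.
sub menuitem_xml {
    my ($node, $depth) = @_;
    my $indent = '  ' x $depth;
    my (@attrs, @nested);

    for my $child ($node->getChildNodes) {
        next unless $child->getNodeType == ELEMENT_NODE;
        next if $child->getNodeName eq 'provinces';    # skipped, as in printNode
        if (has_element_children($child)) {
            push @nested, $child;    # container: emitted later as a nested <menuitem>
        }
        else {
            # Leaf: concatenate its text nodes and turn them into an attribute.
            my $text = join '', map { $_->getData }
                       grep { $_->getNodeType == TEXT_NODE } $child->getChildNodes;
            push @attrs, sprintf '%s="%s"', $child->getNodeName, escape_attr($text);
        }
    }

    # All children have been seen, so the open/closed decision is now safe.
    my $open = join ' ', "${indent}<menuitem", @attrs;
    return "$open/>\n" unless @nested;
    return join '', "$open>\n",
        (map { menuitem_xml($_, $depth + 1) } @nested),
        "${indent}</menuitem>\n";
}

# Hypothetical sample input; the real document's element names may differ.
my $doc = XML::DOM::Parser->new->parse(
    '<item><path>wireless</path><en_title>Wireless</en_title>'
  . '<item><path>plans</path><en_title>Plans</en_title></item></item>'
);
print menuitem_xml($doc->getDocumentElement, 0);
$doc->dispose;
```

For the sample input this should print an open <menuitem path="wireless" en_title="Wireless"> containing a self-closed <menuitem path="plans" en_title="Plans"/>, followed by </menuitem>. Building the string and returning it, rather than printing inside the loop, is what keeps each tag from being emitted more than once.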