The handlers and roots in XML::Twig just keep confusing me. Every time I re-read the CPAN page I think I have it understood, and then I write code that either doesn't work or pukes thousands of lines of the XML, all concatenated, at me. Here's a subsection of the XML that includes all relevant tags:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<authenticationReports>
  <generatedTime>Tue Sep 29 07:07:34 PDT 2009</generatedTime>
  <appDeploymentFile name="app-deployment.properties.hklcp.trading">
    <application name="hk">
      <urlInfo>
        <url>e/t/hk/accts_subscription</url>
        <otherPrereq>HKPwdPreReq</otherPrereq>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/accts_forms</url>
        <otherPrereq>HKPwdPreReq</otherPrereq>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/custtradingpage</url>
        <otherPrereq>BasicPrereq</otherPrereq>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/accts_userinfo</url>
        <otherPrereq>HKPwdPreReq</otherPrereq>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/headermain</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/custservicepage</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/accts_transfermoney</url>
        <otherPrereq>HKPwdPreReq</otherPrereq>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/userprereq</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/indices_us</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/homeloggedmessage</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/lead</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/orderviewmin</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/hk/accts_changelogin</url>
        <otherPrereq>SessionPreReq</otherPrereq>
      </urlInfo>
    </application>
    <application name="intl">
      <urlInfo>
        <url>e/t/intl/quotesandresearch</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/intltablesubnavviewcomponent</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/intltablemetaviewcomponent</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/disclaimer</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/headermain</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/indices_us</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/lead</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/selectlanguage</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/get-screen</url>
        <otherPrereq>BasicPrereq</otherPrereq>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/page_f</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/basicprereq</url>
      </urlInfo>
      <urlInfo>
        <url>e/t/intl/page</url>
        <otherPrereq>BasicPrereq</otherPrereq>
      </urlInfo>
    </application>
  </appDeploymentFile>
</authenticationReports>
```

At first I was told to just get the <appDeploymentFile> and, with some munging, use it with <url> to produce a hash value. Someone got me started in an earlier Seekers post with the following code:
```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
use Data::Dumper;

my $AFXML='xmlexample.xml';

#the hashes of the appropriate data files
our %AFURLS;
our %SMG = ();
our %REALMS;
our %SMCUST;

# Read the XML from the maven plugin for AF that delineates the URL's
sub AFXMLtoEM {
    print "Slurping $AFXML....";
    my $TWIG = new XML::Twig ( twig_handlers => {'appDeploymentFile' => \&parseURL} );
    #my $TWIG = new XML::Twig ( twig_handlers => {'appDeploymentFile/application' => \&parseURL} );
    $TWIG -> parsefile ($AFXML) or die "Can't open $AFXML\n" ;
    #$TWIG->flush;

    # Now we want to change every value from the XML name to an EM instance identifier
    #print Dumper(\%AFURLS); exit 1;
    while ((my $K, my $ITEM) = each %AFURLS) {
        my ($G1,$G2,$APP,$INST) = split /\./,$ITEM,4;
        unless ($APP eq "") { $ITEM = "prd:" . $APP . ":web:" . $INST; } #Cheesy kludge - fix when Durai confirms
        $AFURLS{$K} = $ITEM;
    }
    print scalar keys %AFURLS, " records slurped in.\n";
}

sub parseURL {
    my ($T, $ADEP) = @_;
    #print Dumper($T) ."," . Dumper($ADEP) ."\n";
    my $NAME= $ADEP->att('name');
    print $ADEP->first_child('application')->text() . "\n";
    #print "$NAME\n";
    for my $URLI ($ADEP->first_child('application')->children('urlInfo')) {
    # for my $URLI ($ADEP->children('application')) {
        # $NAME2 = $ADEP->att('name');
        #print "$NAME2\n";
        # leading slash added for matching SM filters
        $AFURLS{ "/" . $URLI->first_child('url')->text() } = $NAME;
        # $AFURLS{ "/" . $URLI->first_child('urlInfo')->children('url')->text() } = $NAME . "-" . $NAME2;
    }
}

#
# Main program START
#
AFXMLtoEM();
print Dumper(\%AFURLS);
```

But that just gets me a hash with <url> as the key and my munged data as the value. It also doesn't work for the second (or any later) set of <application> tags. I need ALL of the data in a structure that's usable for post-processing, even if I have to output it all to a CSV file and then reparse every entry line by line (45 hours and counting on this at this point).

I've used XML::Simple before, but for one thing this is more complex XML than I have ever parsed before, and it turns the <application> tag NAME into the attribute name, so there's no way to get it programmatically. :(

I am OK with adding the application value to the appDeploymentFile value, and the url to the otherPrereq value, and then handling them appropriately in the hash later with a split or some such when I am ready to do things with it. Thanks in advance!
In reply to XML::Twig n00b by Binford
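A minimal sketch of one way to collect everything the post asks for, under some stated assumptions: the handler is registered on `application` (so it fires once per `<application>` element, not just the first), the sample XML is trimmed and inlined so the script runs on its own, and the names `@records` and `collect_application` are made up for illustration. It is not the code from the earlier thread, just one possible shape for the data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
use Data::Dumper;

# Trimmed copy of the sample XML, inlined so the sketch is self-contained.
my $xml = <<'XML';
<authenticationReports>
  <appDeploymentFile name="app-deployment.properties.hklcp.trading">
    <application name="hk">
      <urlInfo><url>e/t/hk/accts_forms</url><otherPrereq>HKPwdPreReq</otherPrereq></urlInfo>
      <urlInfo><url>e/t/hk/headermain</url></urlInfo>
    </application>
    <application name="intl">
      <urlInfo><url>e/t/intl/page</url><otherPrereq>BasicPrereq</otherPrereq></urlInfo>
    </application>
  </appDeploymentFile>
</authenticationReports>
XML

# Hypothetical container: one hash ref per <urlInfo>.
my @records;

# Called once for each complete <application> element.
sub collect_application {
    my ($twig, $app) = @_;

    # Walk up to the enclosing <appDeploymentFile> to read its name attribute.
    my $file = $app->parent('appDeploymentFile')->att('name');

    for my $info ($app->children('urlInfo')) {
        push @records, {
            file        => $file,
            application => $app->att('name'),
            # leading slash added, as in the original code
            url         => '/' . $info->first_child_text('url'),
            # first_child_text returns '' when there is no <otherPrereq>
            prereq      => $info->first_child_text('otherPrereq'),
        };
    }
}

my $twig = XML::Twig->new(
    twig_handlers => { application => \&collect_application },
);
$twig->parse($xml);

print Dumper(\@records);
```

Keeping one flat record per `<urlInfo>` means nothing has to be joined into a string and split apart later, and writing the records out as CSV is a simple loop over `@records`. To read the real file, `$twig->parse($xml)` could be swapped for `$twig->parsefile('xmlexample.xml')`.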